RAG (Retrieval Augmented Generation) continues to be all the rage in LLM deployments.
In our niche at Robotic Online Intelligence (ROI), we continue to explore practical use-case-specific applications, as well as simplified approaches to address the major RAG challenges - here's our note from April this year.
From RAGs to Regex: this short video (with my digital R2C2 replica) gives an example of our 'Regexy RAG' in action. Imagine you are looking at a 224-page public document from a pension fund, with a question on how this investor is involved with one of the major private equity funds, AND you want to do the same systematically across other similar documents - e.g. within our Kubro(TM) platform, with the full workflow for collecting, storing, classifying, extracting, and publishing information.
With the usual LLM disclaimer that the answers may not be accurate, how do we try to make them as accurate as possible here?
We first convert the user's question into a set of keywords or regular expressions and search for these within the text, then capture the text around each match (say, 400 characters before and after) to generate relevant 'chunks'. We also attach various metadata to each chunk, so that more context is preserved when the chunks are passed to the LLM prompt.
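To make this concrete, here is a minimal Python sketch of the keyword/regex-driven chunking described above, assuming the document is already available as plain text; the file name, the example patterns, and the metadata field are illustrative placeholders rather than our production setup.

```python
import re

def regex_chunks(text: str, patterns: list[str], window: int = 400) -> list[dict]:
    """Search the text for each keyword/regex and capture the surrounding
    text (window characters before and after each match) as a 'chunk'."""
    chunks = []
    for pattern in patterns:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            chunks.append({
                "pattern": pattern,        # which query term produced this chunk
                "match": match.group(0),   # the matched text itself
                "offset": start,           # position within the source document
                "text": text[start:end],   # the chunk to be passed to the LLM
            })
    return chunks

# Illustrative usage: the user's question about a pension fund's involvement
# with a private equity manager is first turned into keywords/regexes
# (written by hand here; this conversion step can itself be delegated to an LLM).
patterns = [r"private\s+equity", r"commitments?", r"fund\s+manager"]

# Hypothetical file name for the 224-page report, already converted to plain text.
with open("pension_fund_report.txt", encoding="utf-8") as f:
    document = f.read()

chunks = regex_chunks(document, patterns)

# Attach simple metadata so more context survives into the prompt
# (the source field is just one example of such metadata).
for chunk in chunks:
    chunk["source"] = "pension_fund_report.txt"
```

The resulting chunks, together with their metadata and the user's original question, are then assembled into the LLM prompt.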
Such a method is meant for smaller-scale cases and does not rank chunks by semantic relevance (something a vector database does better), but it is simple, cheap, and good enough to explore the use cases, with less uncertainty and good explainability.
Of course, you could load the whole document(s) directly into ChatGPT (or another app), but with limited control; and for self-deployed open-source LLMs, the supported context length may be much smaller.
We will soon share much more about the 'traditional' RAG deployment as well - applied to China property developers' reports for a start.