Here is our attempt to outline, in simplified form, what the Retrieval Augmented Generation (RAG) technique could mean in practice (see the picture below).
The goal is to use natural language to interrogate specific data sets. For example, when the quarterly US SEC 10-Q filings come out, an analyst might want to ask, "Tell me the capex trends and related themes for these 20 energy companies".
A fund manager sitting on a pool of 1,000 research reports emailed by brokers could ask, "List specific reasons to be bullish or bearish on the data center sector in South East Asia".
The answers must be based on specific information in the defined pool of sources.
Our analyst Roya Ai (powered by the GPT-4-Turbo model) explains RAG this way "for the non-tech business audience":
"Retrieval Augmented Generation (RAG) is a technology that enhances automated text creation by first searching a large database of information to find relevant content and then using that content to help generate accurate and informative responses, much like a skilled researcher might do when writing a report."
In the illustration below, the document preparation side (Steps 1-3) would normally involve converting the text document sets or databases into a vector database. This may require splitting the original documents into smaller 'chunks' and then embedding the text, i.e. converting it into numbers (vectors).
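To make the document preparation side more concrete, here is a minimal sketch in Python of chunking and embedding. It is an illustration under assumptions, not how any particular product does it: the function names `chunk_text` and `toy_embed`, the chunk size and the overlap are all hypothetical, and the "embedding" is a toy hashed word count standing in for a real embedding model.

```python
import hashlib
import math


def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text, dim=64):
    """Toy stand-in for an embedding model (assumption, for illustration only).

    Each word is hashed into one of `dim` buckets and counted; the resulting
    vector is scaled to unit length. A real model would capture meaning,
    which this does not.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


# A made-up 120-word "document" so the example runs on its own.
document = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(document, chunk_size=50, overlap=10)

# The "vector database" here is just a list of (chunk text, vector) pairs.
vector_store = [(chunk, toy_embed(chunk)) for chunk in chunks]
print(len(vector_store), "chunks stored")
```

The overlap between chunks is a common choice so that a sentence cut at a chunk boundary still appears whole in at least one chunk. The right chunk size depends on the documents and the embedding model used.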
Then, in the handling of user queries (Steps 4-6), the natural language question is also converted into a vector, and the database is searched with one of many similarity algorithms to identify the 'best matches' (Step 7).
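As a rough illustration of the similarity search, the sketch below uses cosine similarity, which is one of many possible similarity measures and not necessarily the one behind the illustration. The three-dimensional vectors, the sample sentences and the names `cosine_similarity` and `top_k` are invented for the example. Real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical stored chunks with hand-made vectors (not real embeddings).
store = [
    ("Capex rose on new drilling projects.", [0.9, 0.1, 0.0]),
    ("The company refinanced its debt.", [0.1, 0.9, 0.1]),
    ("Capital spending guidance was cut.", [0.8, 0.2, 0.1]),
]

# Pretend this vector is the embedded user question about capex.
query_vec = [1.0, 0.0, 0.0]

matches = top_k(query_vec, store, k=2)
for score, text in matches:
    print(f"{score:.3f}  {text}")
```

A plain scan over every stored vector, as here, is fine for a small pool of documents. Vector databases generally use approximate indexes to keep the search fast over millions of chunks.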
Finally (Steps 7-10), those best matches are passed as context to an LLM prompt together with the original question, and the response is presented to the user.
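The last stage can be sketched as simple prompt assembly. The prompt wording and the `build_prompt` helper below are assumptions for illustration only, and the example chunks are invented. The call to the LLM itself is left out, since it depends on the provider.

```python
def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    numbered = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical 'best matches' returned by the retrieval step.
best_matches = [
    "Capex rose on new drilling projects.",
    "Capital spending guidance was cut.",
]
question = "What are the capex trends for these energy companies?"

prompt = build_prompt(question, best_matches)
print(prompt)  # this string would then be sent to the LLM of choice
```

Instructing the model to rely only on the supplied context is one way to keep answers grounded in the defined pool of sources, although it does not guarantee it.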
There is hardly a limit to variations on the above scenario. These could include, for example, combining a vector database with a graph database to better capture relationships, or other advances in techniques for optimizing the retrieval of the relevant chunks of text (Steps 1-3, 6-7).
There is a lot of experimentation and excitement about RAG in the market, as well as arguments about major constraints and practical challenges, or why RAG one day might become obsolete.
More of our take on it at Robotic Online Intelligence (ROI) in Part 2.