30th November 2022 will be remembered as a watershed moment in artificial intelligence. OpenAI released ChatGPT and the world was mesmerised. Interest in previously obscure terms like Generative AI and Large Language Models (LLMs) was unstoppable over the following 12 months.
The Curse Of The LLMs
As usage exploded, so did the expectations. Many users started using ChatGPT as a source of information, like an alternative to Google. As a result, they also started encountering prominent weaknesses of the system. Setting aside concerns around copyright, privacy, security and the ability to do mathematical calculations, people realised that there are two major limitations of Large Language Models.
Users look to LLMs for knowledge and wisdom, yet LLMs are sophisticated predictors of what word comes next.
The Hunger For More
While the weaknesses of LLMs were being discussed, a parallel discourse around providing context to the models started. In essence, it meant creating a ChatGPT on proprietary data.
Make LLMs respond with up-to-date information
Make LLMs not respond with factually inaccurate information
Make LLMs aware of proprietary information
While model re-training, fine-tuning and reinforcement learning are options that address the aforementioned challenges, these approaches are time-consuming and costly. In the majority of use cases, these costs are prohibitive.
In May 2020, researchers in their paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” explored models which combine pre-trained parametric and non-parametric memory for language generation.
So, What is RAG?
In 2023, RAG became one of the most used techniques in the domain of Large Language Models.
User writes a prompt or a query that is passed to an orchestrator
Orchestrator sends a search query to the retriever
Retriever fetches the relevant information from the knowledge sources and sends it back to the orchestrator
Orchestrator augments the prompt with the retrieved context and sends it to the LLM
LLM responds with the generated text which is displayed to the user via the orchestrator
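The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular framework implements RAG: the names `Document`, `KeywordRetriever`, `build_augmented_prompt`, `placeholder_llm` and `Orchestrator` are made up for this example, the retriever ranks documents by simple word overlap (a real system would more likely use embeddings and a vector store), and the "LLM" is a stand-in function that you would swap for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    """One entry in the knowledge source (hypothetical structure)."""
    source: str
    text: str


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


class KeywordRetriever:
    """Toy retriever: ranks documents by word overlap with the query."""

    def __init__(self, documents: List[Document]):
        self.documents = documents

    def retrieve(self, query: str, top_k: int = 2) -> List[Document]:
        query_words = tokenize(query)
        scored = [(len(query_words & tokenize(d.text)), d) for d in self.documents]
        # Keep only documents that share at least one word with the query.
        relevant = [pair for pair in scored if pair[0] > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in relevant[:top_k]]


def build_augmented_prompt(query: str, context: List[Document]) -> str:
    """Combine the retrieved context with the user's query."""
    context_block = "\n".join(f"[{d.source}] {d.text}" for d in context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}"
    )


def placeholder_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; it only reports the prompt size."""
    return f"(generated answer based on a {len(prompt)}-character prompt)"


class Orchestrator:
    """Coordinates the retriever and the LLM for each user query."""

    def __init__(self, retriever: KeywordRetriever, llm: Callable[[str], str]):
        self.retriever = retriever
        self.llm = llm

    def answer(self, user_query: str) -> str:
        context = self.retriever.retrieve(user_query)         # steps 2 and 3
        prompt = build_augmented_prompt(user_query, context)  # step 4
        return self.llm(prompt)                               # step 5


if __name__ == "__main__":
    knowledge_base = [
        Document("hr_policy.txt", "Employees receive 25 days of annual leave."),
        Document("it_guide.txt", "Reset your password from the account settings page."),
    ]
    orchestrator = Orchestrator(KeywordRetriever(knowledge_base), placeholder_llm)
    # Step 1: the user's query enters through the orchestrator.
    print(orchestrator.answer("How many days of annual leave do employees receive?"))
```

Keeping the retriever and the LLM behind the orchestrator means either one can be replaced (a different search backend, a different model) without touching the rest of the flow.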
How does RAG help?
The retriever of a RAG system can have access to external sources of information. Therefore, the LLM is not limited to its internal knowledge. The external sources can be proprietary documents and data, or even the internet.
Confidence in Responses
With the context (the extra information that is retrieved) made available to the LLM, responses can be grounded in that information, which increases confidence in them.
As the RAG technique evolves and becomes accessible through frameworks like LangChain and LlamaIndex, it is finding more and more use in LLM-powered applications such as Q&A over documents, conversational agents, recommendation systems and content generation.
If you’re someone interested in generative AI and Large Language Models, let’s connect on LinkedIn — https://www.linkedin.com/in/abhinav-kimothi/
Also, please read a free copy of my notes on Large Language Models — https://abhinavkimothi.gumroad.com/l/GenAILLM