Introduction
Knowledge is domain-specific information that the Agent can search at runtime to make better decisions (dynamic few-shot learning) and provide accurate responses. It is stored in a vector database, and this search-on-demand pattern is called Agentic RAG.
Agno Agents use Agentic RAG by default, meaning that if you add knowledge to an Agent, it will search this knowledge base at runtime for the specific information it needs to achieve its task.
The pseudo steps for adding knowledge to an Agent are:
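A minimal, library-independent sketch of those steps follows: create a knowledge base, add information to it, then hand it to an agent that can search it. The classes SimpleKnowledge and SimpleAgent and their methods are hypothetical stand-ins written for illustration, not Agno's API; only the parameter names knowledge and search_knowledge come from this page.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleKnowledge:
    """Hypothetical stand-in for a knowledge base (not Agno's class)."""
    texts: list[str] = field(default_factory=list)

    def load_text(self, text: str) -> None:
        # Step 2: add information to the knowledge base.
        self.texts.append(text)

    def search(self, query: str) -> list[str]:
        # Naive keyword match; a real knowledge base would query a vector db.
        words = set(query.lower().split())
        return [t for t in self.texts if words & set(t.lower().split())]


@dataclass
class SimpleAgent:
    """Hypothetical stand-in for an Agent that can search its knowledge."""
    knowledge: SimpleKnowledge
    search_knowledge: bool = True

    def lookup(self, query: str) -> list[str]:
        # Step 3: the agent searches the knowledge base only when enabled.
        return self.knowledge.search(query) if self.search_knowledge else []


# Step 1: create a knowledge base, then load it and attach it to the agent.
knowledge_base = SimpleKnowledge()
knowledge_base.load_text("The sky is blue")
agent = SimpleAgent(knowledge=knowledge_base, search_knowledge=True)
print(agent.lookup("what color is the sky"))
```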
We can give our agent access to the knowledge base in the following ways:
- We can set search_knowledge=True to add a search_knowledge_base() tool to the Agent. search_knowledge is True by default if you add knowledge to an Agent.
- We can set add_references=True to automatically add references from the knowledge base to the Agent’s prompt. This is the traditional 2023 RAG approach.
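The difference between the two options can be sketched without any library. In the toy code below, build_tools and build_prompt are hypothetical helpers and the lookup is a plain word match; the sketch only shows where the knowledge enters the conversation in each mode, not how Agno implements either one.

```python
from typing import Callable

DOCS = ["Refunds are processed within 5 days.", "Support is open on weekdays."]


def _words(text: str) -> set[str]:
    # Lowercase the text and strip basic punctuation from each word.
    return {w.strip(".,?!").lower() for w in text.split()}


def search_knowledge_base(query: str) -> list[str]:
    """Toy lookup: return documents that share a word with the query."""
    return [d for d in DOCS if _words(query) & _words(d)]


def build_tools(search_knowledge: bool) -> dict[str, Callable[[str], list[str]]]:
    # Agentic mode: expose the search as a tool the model may call when needed.
    return {"search_knowledge_base": search_knowledge_base} if search_knowledge else {}


def build_prompt(message: str, add_references: bool) -> str:
    # Traditional mode: always retrieve up front and paste results into the prompt.
    if not add_references:
        return message
    references = "\n".join(search_knowledge_base(message))
    return f"References:\n{references}\n\nUser: {message}"
```

In the first mode the model decides whether and when to search; in the second, retrieval happens on every message regardless of whether the references are needed.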
If you need complete control over the knowledge base search, you can pass your own retriever function with the following signature:
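A sketch of such a function is shown here. The parameter names (agent, query, num_documents), the extra keyword arguments, and the list-of-dicts return type are assumptions about the expected signature, and the in-memory DOCUMENTS list is a hypothetical stand-in for a real store; check the retriever reference before relying on them.

```python
from typing import Any, Optional

# Hypothetical in-memory store standing in for a real vector database.
DOCUMENTS = [
    {"name": "returns", "content": "Items can be returned within 30 days."},
    {"name": "shipping", "content": "Orders ship within 2 business days."},
]


def _words(text: str) -> set[str]:
    # Lowercase the text and strip basic punctuation from each word.
    return {w.strip(".,?!").lower() for w in text.split()}


def retriever(
    agent: Any, query: str, num_documents: Optional[int] = None, **kwargs: Any
) -> Optional[list[dict]]:
    """Return up to num_documents references that share a word with the query."""
    matches = [doc for doc in DOCUMENTS if _words(query) & _words(doc["content"])]
    if not matches:
        return None
    return matches[:num_documents] if num_documents else matches
```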
This function is called during search_knowledge_base() and is used by the Agent to retrieve references from the knowledge base.
Vector Databases
While any type of storage can act as a knowledge base, vector databases offer the best solution for retrieving relevant results from dense information quickly. Here’s how vector databases are used with Agents:
Chunk the information
Break down the knowledge into smaller chunks to ensure our search query returns only relevant results.
Load the knowledge base
Convert the chunks into embedding vectors and store them in a vector database.
Search the knowledge base
When the user sends a message, we convert the input message into an embedding and “search” for nearest neighbors in the vector database.
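The three steps above can be sketched in plain Python. This is a toy: embed() builds a sparse word-count vector rather than calling a learned embedding model, and the "vector database" is a list held in memory, so the function names and the sample text are illustrative assumptions only.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, index: list[tuple[str, Counter]], k: int = 1) -> list[str]:
    """Return the k chunks whose vectors are nearest to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


document = (
    "Pad thai is a stir-fried rice noodle dish. "
    "Tom yum is a hot and sour soup."
)
# Load: embed every chunk and keep (chunk, vector) pairs as the toy index.
index = [(c, embed(c)) for c in chunk(document, size=8)]
print(search("sour soup", index))
```

A real setup would swap embed() for an embedding model and the list for a vector database with an approximate nearest-neighbor index, but the flow of chunk, load, and search stays the same.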
Loading the Knowledge Base
Before you can use a knowledge base, it needs to be loaded with embeddings that will be used for retrieval.
Asynchronous Loading
Many vector databases support asynchronous operations, which can significantly improve performance when loading large knowledge bases. You can leverage this capability using the aload() method:
Using aload() ensures you take full advantage of the non-blocking operations, concurrent processing, and reduced latency that async vector database operations offer. This is especially valuable in production environments with high throughput requirements.
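To illustrate why non-blocking loading helps, the sketch below uses only asyncio. ToyKnowledgeBase is a hypothetical stand-in, and the short sleep simulates a network call to a vector database; only the method name aload() is taken from this page, and its real arguments are not shown here.

```python
import asyncio


class ToyKnowledgeBase:
    """Hypothetical stand-in; only the method name aload() comes from the text."""

    def __init__(self, name: str, documents: list[str]) -> None:
        self.name = name
        self.documents = documents
        self.loaded: list[str] = []

    async def _upsert(self, document: str) -> None:
        # Simulate a non-blocking network call to a vector database.
        await asyncio.sleep(0.01)
        self.loaded.append(document)

    async def aload(self) -> int:
        # Insert all documents concurrently instead of one at a time.
        await asyncio.gather(*(self._upsert(d) for d in self.documents))
        return len(self.loaded)


async def main() -> list[int]:
    bases = [
        ToyKnowledgeBase("recipes", ["pad thai", "tom yum"]),
        ToyKnowledgeBase("faq", ["refunds", "shipping", "support"]),
    ]
    # Both knowledge bases load at the same time on one event loop.
    return await asyncio.gather(*(kb.aload() for kb in bases))


counts = asyncio.run(main())
print(counts)
```

Because every simulated insert waits concurrently, the total time is close to a single call's latency rather than the sum of all calls.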
For more details on vector database async capabilities, see the Vector Database Introduction.
Use one of the following knowledge bases to simplify the chunking, loading, searching and optimization process:
- ArXiv knowledge base: Load ArXiv papers to a knowledge base
- Combined knowledge base: Combine multiple knowledge bases into one
- CSV knowledge base: Load local CSV files to a knowledge base
- CSV URL knowledge base: Load CSV files from a URL to a knowledge base
- Document knowledge base: Load local docx files to a knowledge base
- JSON knowledge base: Load JSON files to a knowledge base
- LangChain knowledge base: Use a LangChain retriever as a knowledge base
- PDF knowledge base: Load local PDF files to a knowledge base
- PDF URL knowledge base: Load PDF files from a URL to a knowledge base
- S3 PDF knowledge base: Load PDF files from S3 to a knowledge base
- S3 Text knowledge base: Load text files from S3 to a knowledge base
- Text knowledge base: Load text/docx files to a knowledge base
- Website knowledge base: Load website data to a knowledge base
- Wikipedia knowledge base: Load Wikipedia articles to a knowledge base