Manual Knowledge Filters
Manual filtering gives you full control over which documents are searched by specifying filters directly in your code.
Step 1: Attach Metadata
There are two ways to attach metadata to your documents:
1. Attach Metadata When Initializing the Knowledge Base
knowledge_base = PDFKnowledgeBase(
    path=[
        {
            "path": "path/to/cv1.pdf",
            "metadata": {
                "user_id": "jordan_mitchell",
                "document_type": "cv",
                "year": 2025,
            },
        },
        # ... more documents ...
    ],
    vector_db=vector_db,
)
knowledge_base.load(recreate=True)
2. Attach Metadata When Loading Documents One by One
# Initialize the PDFKnowledgeBase
knowledge_base = PDFKnowledgeBase(
    vector_db=vector_db,
    num_documents=5,
)
# Load the first document with Jordan Mitchell's metadata
knowledge_base.load_document(
    path="path/to/cv1.pdf",
    metadata={"user_id": "jordan_mitchell", "document_type": "cv", "year": 2025},
    recreate=True,  # Set to True only for the first run, then set to False
)
# Load the second document with Taylor Brooks's metadata
knowledge_base.load_document(
    path="path/to/cv2.pdf",
    metadata={"user_id": "taylor_brooks", "document_type": "cv", "year": 2025},
)
💡 Tips:
• Use Option 1 if you have all your documents and metadata ready at once.
• Use Option 2 if you want to add documents incrementally or as they become available.
Step 2: Query with Filters
You can pass filters in two ways:
1. On the Agent (applies to all queries)
agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
    knowledge_filters={"user_id": "jordan_mitchell"},
)
agent.print_response(
    "Tell me about Jordan Mitchell's experience and skills",
    markdown=True,
)
2. On Each Query (overrides Agent filters for that run)
agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)
agent.print_response(
    "Tell me about Jordan Mitchell's experience and skills",
    knowledge_filters={"user_id": "jordan_mitchell"},
    markdown=True,
)
If you pass filters both on the Agent and on the query, the query-level filters take precedence.
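To make the precedence rule concrete, here is a minimal, library-independent sketch. The helper `resolve_filters` is hypothetical and not part of the library's API. It assumes that agent-level and query-level filters are merged key by key, with the query-level value winning on conflicts. The library may instead replace the agent-level filters entirely, so treat this as one plausible reading, not a description of the actual implementation.

```python
from typing import Any, Optional

Filters = dict[str, Any]


def resolve_filters(
    agent_filters: Optional[Filters],
    run_filters: Optional[Filters],
) -> Filters:
    """Hypothetical helper: combine filters so query-level values win.

    Assumption: keys set only on the Agent are kept, and keys set on
    both levels take the query-level value.
    """
    effective = dict(agent_filters or {})  # copy, so the input is not mutated
    effective.update(run_filters or {})    # query-level keys overwrite agent-level keys
    return effective


agent_level = {"user_id": "jordan_mitchell", "document_type": "cv"}
query_level = {"user_id": "taylor_brooks"}

effective = resolve_filters(agent_level, query_level)
print(effective)  # {'user_id': 'taylor_brooks', 'document_type': 'cv'}
```

Under this assumption, the query above would search Taylor Brooks's CVs even though the Agent was configured for Jordan Mitchell.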
Combining Multiple Filters
You can filter by multiple fields:
agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
    knowledge_filters={
        "user_id": "jordan_mitchell",
        "document_type": "cv",
        "year": 2025,
    },
)
agent.print_response(
    "Tell me about Jordan Mitchell's experience and skills",
    markdown=True,
)
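As a rough mental model of what a multi-field filter does, the sketch below applies filters to plain metadata dictionaries. It assumes that filters are combined with AND semantics and compared by exact equality. The `matches` function and the in-memory `documents` list are illustrative stand-ins, not the library's API; the actual matching is performed by the vector database and may differ.

```python
from typing import Any


def matches(metadata: dict[str, Any], filters: dict[str, Any]) -> bool:
    """Return True only if every filter key is present with an equal value (assumed AND semantics)."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


# Illustrative stand-in for documents stored with metadata.
documents = [
    {"name": "cv1.pdf", "metadata": {"user_id": "jordan_mitchell", "document_type": "cv", "year": 2025}},
    {"name": "cv2.pdf", "metadata": {"user_id": "taylor_brooks", "document_type": "cv", "year": 2025}},
]

filters = {"user_id": "jordan_mitchell", "document_type": "cv", "year": 2025}

selected = [doc["name"] for doc in documents if matches(doc["metadata"], filters)]
print(selected)  # ['cv1.pdf']
```

In this model, each added field can only narrow the result set: dropping `"user_id"` from `filters` would select both documents.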
Try It Yourself!
- Load documents with different metadata.
- Query with different filter combinations.
- Observe how the results change!
Developer Resources