The PDFBytesKnowledgeBase reads PDF content from bytes or IO streams, converts it into vector embeddings, and loads them into a vector database. This is useful when working with dynamically generated PDFs, API responses, or file uploads, since the files never need to be saved to disk.
Usage
This example uses a local LanceDB database. Make sure it is available before running the code.
knowledge_base.py
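A minimal sketch of what `knowledge_base.py` might look like. It assumes the Agno v1 import paths `agno.knowledge.pdf_bytes` and `agno.vectordb.lancedb`, a `LanceDb(table_name=..., uri=...)` constructor, and a `load(recreate=...)` method inherited from `AgentKnowledge`; check these against your installed version. The helper names, table name, URI, and file path are hypothetical placeholders.

```python
from pathlib import Path
from typing import List


def read_pdf_bytes(paths: List[str]) -> List[bytes]:
    """Read each file fully into memory.

    Stands in for bytes that would normally come from an upload or API response.
    """
    return [Path(p).read_bytes() for p in paths]


def build_knowledge_base(pdfs: List[bytes]):
    """Wire in-memory PDF bytes to a local LanceDB table (hypothetical helper)."""
    # Assumed Agno v1 import paths; imported lazily so the helper above
    # can be used even when agno is not installed.
    from agno.knowledge.pdf_bytes import PDFBytesKnowledgeBase
    from agno.vectordb.lancedb import LanceDb

    vector_db = LanceDb(
        table_name="pdf_bytes_demo",  # hypothetical table name
        uri="tmp/lancedb",            # hypothetical local directory for LanceDB data
    )
    return PDFBytesKnowledgeBase(pdfs=pdfs, vector_db=vector_db)


# Typical use (not executed here; requires agno, an embedder, and a real PDF):
#   knowledge_base = build_knowledge_base(read_pdf_bytes(["data/sample.pdf"]))
#   knowledge_base.load(recreate=False)
```

Splitting the byte loading from the knowledge-base construction keeps the second step independent of where the bytes come from, which is the main point of this class.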
Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| pdfs | Union[List[bytes], List[IO]] | - | List of PDF content as bytes or IO streams. |
| exclude_files | List[str] | [] | List of file patterns to exclude (inherited from base class). |
| reader | Union[PDFReader, PDFImageReader] | PDFReader() | A PDFReader or PDFImageReader that converts the PDFs into Documents for the vector database. |
PDFBytesKnowledgeBase is a subclass of the AgentKnowledge class and accepts the same parameters.
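The `pdfs` type, `Union[List[bytes], List[IO]]`, describes a list that holds either all bytes or all streams. The sketch below uses only the standard library to show both forms, plus a hypothetical `as_bytes_list` helper that normalizes a mixed collection into plain bytes before it is handed to the knowledge base. The payload is a placeholder, not a valid PDF.

```python
import io
from typing import IO, List, Union

# Hypothetical alias for a single PDF input in either accepted form.
PdfInput = Union[bytes, IO[bytes]]


def as_bytes_list(items: List[PdfInput]) -> List[bytes]:
    """Normalize mixed bytes/stream inputs into a homogeneous List[bytes]."""
    result: List[bytes] = []
    for item in items:
        if isinstance(item, (bytes, bytearray)):
            result.append(bytes(item))
        else:
            # File-like object: read from its current position to the end.
            result.append(item.read())
    return result


raw = b"%PDF-1.4 placeholder"    # placeholder payload, e.g. an API response body

pdfs_as_bytes = [raw]                # matches List[bytes]
pdfs_as_streams = [io.BytesIO(raw)]  # matches List[IO]

# A mixed list does not match either branch of the Union, so normalize it first.
normalized = as_bytes_list([raw, io.BytesIO(raw)])
```

Either `pdfs_as_bytes` or `pdfs_as_streams` could then be passed as the `pdfs` argument. Note that a stream is consumed once it has been read, so reusing the same stream object for a second load would need a `seek(0)` first.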
Developer Resources
- View Sync loading Cookbook
- View Async loading Cookbook