The PDFBytesKnowledgeBase reads PDF content from bytes or IO streams, converts it into vector embeddings, and loads those embeddings into a vector database. This is useful for dynamically generated PDFs, API responses, or file uploads, because the files never need to be saved to disk.

Usage

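This example uses a local LanceDB database. LanceDB runs embedded and stores its data under the given uri, so there is no separate server to start. Install the dependencies first:
pip install pypdf lancedb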
knowledge_base.py
from agno.agent import Agent
from agno.knowledge.pdf import PDFBytesKnowledgeBase
from agno.vectordb.lancedb import LanceDb

vector_db = LanceDb(
    table_name="recipes_async",
    uri="tmp/lancedb",
)

with open("data/pdfs/ThaiRecipes.pdf", "rb") as f:
    pdf_bytes = f.read()

knowledge_base = PDFBytesKnowledgeBase(
    pdfs=[pdf_bytes],
    vector_db=vector_db,
)
knowledge_base.load(recreate=False)  # Comment out after first run

agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)

agent.print_response("How to make Tom Kha Gai?", markdown=True)
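
The example above reads the bytes from a file on disk, but the pdfs parameter also accepts IO streams, which is the more common case for uploads and API responses. The following is a minimal sketch, using only the standard library, of how mixed in-memory payloads might be normalized into a homogeneous list of streams before being passed as pdfs. The helper name as_pdf_streams and the header check are illustrative assumptions, not part of the Agno API.

```python
import io
from typing import BinaryIO, List, Union

# PDF files begin with this header (for example "%PDF-1.4").
PDF_MAGIC = b"%PDF-"


def as_pdf_streams(payloads: List[Union[bytes, BinaryIO]]) -> List[io.BytesIO]:
    """Hypothetical helper: turn raw bytes and binary streams into BytesIO objects.

    Raises ValueError for any payload that does not start with the PDF header.
    """
    streams = []
    for index, payload in enumerate(payloads):
        # Raw bytes are used directly; anything else is treated as a readable stream.
        data = payload if isinstance(payload, (bytes, bytearray)) else payload.read()
        if not data.startswith(PDF_MAGIC):
            raise ValueError(f"payload {index} does not look like a PDF")
        streams.append(io.BytesIO(data))
    return streams


# Stand-in payload for demonstration only; it is not a complete, parseable PDF.
fake_pdf = b"%PDF-1.4\n% placeholder content\n"
streams = as_pdf_streams([fake_pdf, io.BytesIO(fake_pdf)])
print(len(streams))

# Assumed usage with the knowledge base from the example above:
# knowledge_base = PDFBytesKnowledgeBase(pdfs=as_pdf_streams([...]), vector_db=vector_db)
```

Normalizing everything to one stream type keeps the list homogeneous, which matches the documented type of pdfs (either a list of bytes or a list of IO streams). The header check is only a cheap sanity test; a payload that passes it can still fail to parse in the reader.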

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| pdfs | Union[List[bytes], List[IO]] | - | List of PDF content as bytes or IO streams. |
| exclude_files | List[str] | [] | List of file patterns to exclude (inherited from base class). |
| reader | Union[PDFReader, PDFImageReader] | PDFReader() | A PDFReader or PDFImageReader that converts the PDFs into Documents for the vector database. |

PDFBytesKnowledgeBase is a subclass of the AgentKnowledge class and has access to the same params.

Developer Resources