The JinaEmbedder class embeds text into vectors using the Jina AI API. Jina provides high-quality embeddings with support for different embedding types, plus late chunking for improved processing of long documents. An API key from Jina AI is required.

Setup

Set your JINA_API_KEY environment variable.
export JINA_API_KEY="xxx"
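
Before running the examples, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; `resolve_api_key` is a hypothetical helper written for illustration and is not part of agno. It only mirrors the idea that an explicitly passed key takes priority and the `JINA_API_KEY` environment variable is the fallback.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the explicit key if given, otherwise read JINA_API_KEY.

    Hypothetical helper for illustration only; agno performs its own lookup.
    """
    source = os.environ if env is None else env
    key = explicit_key or source.get("JINA_API_KEY")
    if not key:
        raise RuntimeError("JINA_API_KEY is not set and no api_key was passed")
    return key


# Demonstrate with an explicit mapping so the example does not depend on the shell.
print(resolve_api_key(env={"JINA_API_KEY": "xxx"}))
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the API.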

Run PgVector

docker run -d \
    -e POSTGRES_DB=ai \
    -e POSTGRES_USER=ai \
    -e POSTGRES_PASSWORD=ai \
    -e PGDATA=/var/lib/postgresql/data/pgdata \
    -v pgvolume:/var/lib/postgresql/data \
    -p 5532:5432 \
    --name pgvector \
    agnohq/pgvector:16
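
The container settings above determine the connection string that PgVector is given in the usage example. As a rough sketch, the mapping can be written out explicitly; `build_db_url` is a hypothetical helper, not an agno function, and its defaults simply repeat the values from the Docker command (user `ai`, password `ai`, database `ai`, host port `5532`).

```python
from urllib.parse import quote


def build_db_url(
    user: str = "ai",
    password: str = "ai",
    host: str = "localhost",
    port: int = 5532,  # host side of the "-p 5532:5432" mapping
    database: str = "ai",
) -> str:
    """Assemble a SQLAlchemy-style URL for the psycopg driver.

    Hypothetical helper for illustration; credentials are percent-encoded
    so that special characters do not break the URL.
    """
    return (
        f"postgresql+psycopg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


print(build_db_url())
```

If you change any of the `POSTGRES_*` values or the published port, the `db_url` passed to PgVector has to change to match.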

Usage

cookbook/embedders/jina_embedder.py
from agno.agent import AgentKnowledge
from agno.vectordb.pgvector import PgVector
from agno.embedder.jina import JinaEmbedder

# Basic usage - automatically loads from JINA_API_KEY environment variable
embeddings = JinaEmbedder().get_embedding(
    "The quick brown fox jumps over the lazy dog."
)

# Print the embeddings and their dimensions
print(f"Embeddings: {embeddings[:5]}")
print(f"Dimensions: {len(embeddings)}")

# Custom configuration with late chunking for long documents
custom_embedder = JinaEmbedder(
    dimensions=1024,
    late_chunking=True,  # Improved processing for long documents
    timeout=30.0,  # Request timeout in seconds
)

# Get embedding with usage information
embedding, usage = custom_embedder.get_embedding_and_usage(
    "Advanced text processing with Jina embeddings and late chunking."
)
print(f"Embedding dimensions: {len(embedding)}")
if usage:
    print(f"Usage info: {usage}")

# Use an embedder in a knowledge base
knowledge_base = AgentKnowledge(
    vector_db=PgVector(
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        table_name="jina_embeddings",
        embedder=JinaEmbedder(
            late_chunking=True,  # Better handling of long documents
            timeout=30.0,  # Configure request timeout
        ),
    ),
    num_documents=2,
)
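
To give some intuition for what `num_documents=2` means, the sketch below ranks a few toy vectors by cosine similarity to a query vector and keeps the top two. This is an illustration only: the vectors are made up, `cosine_similarity` and `top_k` are hypothetical helpers, and the real search runs inside the vector database, whose distance metric and indexing may differ.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: List[float], docs: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real Jina vectors have e.g. 1024 dimensions.
docs = {
    "fox": [1.0, 0.0, 0.0],
    "dog": [0.8, 0.6, 0.0],
    "cat": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.1, 0.0], docs, k=2)
print([name for name, _ in results])
```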

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `str` | `"jina-embeddings-v3"` | The model ID used for generating embeddings. |
| `dimensions` | `int` | `1024` | The dimensionality of the embeddings generated by the model. |
| `embedding_type` | `Literal['float', 'base64', 'int8']` | `"float"` | The format in which the embeddings are encoded. Options are "float", "base64", or "int8". |
| `late_chunking` | `bool` | `False` | Whether to use late chunking for improved processing of long documents. |
| `user` | `Optional[str]` | - | The user associated with the API request. |
| `api_key` | `Optional[str]` | - | The API key used for authenticating requests. |
| `base_url` | `str` | `"https://api.jina.ai/v1/embeddings"` | The base URL for the API endpoint. |
| `headers` | `Optional[Dict[str, str]]` | - | Additional headers to include in the API request. |
| `request_params` | `Optional[Dict[str, Any]]` | - | Additional parameters to include in the API request. |
| `timeout` | `Optional[float]` | - | Request timeout in seconds. |
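
To make the parameters more concrete, here is a rough sketch of how they might combine into a single HTTP request. It is not the JinaEmbedder implementation: `build_request` is a hypothetical function, and the JSON field names (`model`, `input`, `dimensions`, `embedding_type`, `late_chunking`, `user`) and the Bearer authorization header are assumptions about the request format. The function only builds a description of the request and sends nothing.

```python
from typing import Any, Dict, List, Optional


def build_request(
    texts: List[str],
    id: str = "jina-embeddings-v3",
    dimensions: int = 1024,
    embedding_type: str = "float",
    late_chunking: bool = False,
    user: Optional[str] = None,
    api_key: Optional[str] = None,
    base_url: str = "https://api.jina.ai/v1/embeddings",
    headers: Optional[Dict[str, str]] = None,
    request_params: Optional[Dict[str, Any]] = None,
    timeout: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble URL, headers, JSON body, and timeout without sending anything."""
    if embedding_type not in ("float", "base64", "int8"):
        raise ValueError(f"unsupported embedding_type: {embedding_type!r}")

    # Assumed field names; the real client may map parameters differently.
    payload: Dict[str, Any] = {
        "model": id,
        "input": texts,
        "dimensions": dimensions,
        "embedding_type": embedding_type,
        "late_chunking": late_chunking,
    }
    if user is not None:
        payload["user"] = user
    if request_params:
        payload.update(request_params)  # extra parameters override the defaults

    merged_headers = {"Content-Type": "application/json"}
    if api_key:
        merged_headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    if headers:
        merged_headers.update(headers)

    return {
        "url": base_url,
        "headers": merged_headers,
        "json": payload,
        "timeout": timeout,
    }


request = build_request(["hello world"], api_key="xxx", late_chunking=True, timeout=30.0)
print(request["json"])
```

Under these assumptions, `headers` and `request_params` act as escape hatches: they are merged last, so they can add or override anything the named parameters set.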

Developer Resources