Jina Embedder

The JinaEmbedder class embeds text into vectors using the Jina AI API. Jina supports several embedding types, as well as late chunking for improved processing of long documents. You need a Jina API key to use it.

Setup

Set your JINA_API_KEY environment variable.

export JINA_API_KEY="xxx"
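
Before running the examples, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch; the helper name `require_env` is hypothetical and not part of Agno.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return value
```

Calling `require_env("JINA_API_KEY")` at the top of a script fails early with a readable message instead of a later authentication error. Avoid printing the returned value, since it is a secret.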

Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agnohq/pgvector:16
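
Once the container is started, a quick way to see whether anything is listening on the mapped host port (5532 in the command above) is a plain TCP connection attempt. This is a rough sketch using only the standard library; `is_port_open` is a hypothetical helper, and an open port does not by itself guarantee that PostgreSQL has finished initializing.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For the setup above, `is_port_open("localhost", 5532)` should return `True` shortly after the container is up.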

Usage

cookbook/embedders/jina_embedder.py

from agno.agent import AgentKnowledge
from agno.vectordb.pgvector import PgVector
from agno.embedder.jina import JinaEmbedder

# Basic usage - automatically loads from JINA_API_KEY environment variable
embeddings = JinaEmbedder().get_embedding(
    "The quick brown fox jumps over the lazy dog."
)

# Print the embeddings and their dimensions
print(f"Embeddings: {embeddings[:5]}")
print(f"Dimensions: {len(embeddings)}")

# Custom configuration with late chunking for long documents
custom_embedder = JinaEmbedder(
    dimensions=1024,
    late_chunking=True,  # Improved processing for long documents
    timeout=30.0,  # Request timeout in seconds
)

# Get embedding with usage information
embedding, usage = custom_embedder.get_embedding_and_usage(
    "Advanced text processing with Jina embeddings and late chunking."
)
print(f"Embedding dimensions: {len(embedding)}")
if usage:
    print(f"Usage info: {usage}")

# Use an embedder in a knowledge base
knowledge_base = AgentKnowledge(
    vector_db=PgVector(
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        table_name="jina_embeddings",
        embedder=JinaEmbedder(
            late_chunking=True,  # Better handling of long documents
            timeout=30.0,  # Configure request timeout
        ),
    ),
    num_documents=2,
)
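
A common next step with raw embeddings is to compare two texts by the cosine similarity of their vectors. The sketch below assumes that `get_embedding` returns a flat sequence of floats, as the slicing and `len()` calls above suggest; `cosine_similarity` is a hypothetical helper written in plain Python, not an Agno function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Given two vectors `e1` and `e2` obtained from `JinaEmbedder().get_embedding(...)`, `cosine_similarity(e1, e2)` yields a value between -1 and 1, with higher values indicating closer vectors. Both vectors must come from embedders configured with the same `dimensions`.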

Params

Parameter	Type	Default	Description
`id`	`str`	`"jina-embeddings-v3"`	The model ID used for generating embeddings.
`dimensions`	`int`	`1024`	The dimensionality of the embeddings generated by the model.
`embedding_type`	`Literal['float', 'base64', 'int8']`	`"float"`	The format in which the embeddings are encoded. Options are “float”, “base64”, or “int8”.
`late_chunking`	`bool`	`False`	Whether to use late chunking for improved processing of long documents.
`user`	`Optional[str]`	-	The user associated with the API request.
`api_key`	`Optional[str]`	-	The API key used for authenticating requests.
`base_url`	`str`	`"https://api.jina.ai/v1/embeddings"`	The base URL for the API endpoint.
`headers`	`Optional[Dict[str, str]]`	-	Additional headers to include in the API request.
`request_params`	`Optional[Dict[str, Any]]`	-	Additional parameters to include in the API request.
`timeout`	`Optional[float]`	-	Request timeout in seconds.
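
To make the table more concrete, here is a rough sketch of how a subset of these parameters might map onto a JSON request body. It is not Agno's implementation: `build_request_body` is a hypothetical function, and the field names (`model`, `input`, `dimensions`, `embedding_type`, `late_chunking`) are an assumption based on Jina's public embeddings API. The defaults mirror the table above.

```python
from typing import Any, Dict, List, Optional

ALLOWED_EMBEDDING_TYPES = ("float", "base64", "int8")


def build_request_body(
    texts: List[str],
    model_id: str = "jina-embeddings-v3",
    dimensions: int = 1024,
    embedding_type: str = "float",
    late_chunking: bool = False,
    request_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble an illustrative request body from embedder-style parameters."""
    if embedding_type not in ALLOWED_EMBEDDING_TYPES:
        raise ValueError(f"embedding_type must be one of {ALLOWED_EMBEDDING_TYPES}")
    body: Dict[str, Any] = {
        "model": model_id,  # assumed field name for the `id` parameter
        "input": texts,
        "dimensions": dimensions,
        "embedding_type": embedding_type,
        "late_chunking": late_chunking,
    }
    if request_params:
        # Extra parameters are merged last, so they can override the defaults.
        body.update(request_params)
    return body
```

In this sketch, `request_params` is merged last so that callers can override any field. Whether the real class applies the same precedence is not stated in this page.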

Developer Resources

View Cookbook
