The AwsBedrockEmbedder class embeds text into vectors using the AWS Bedrock API. By default, it uses the Cohere Embed Multilingual V3 model to generate embeddings.

Setup

Set your AWS credentials

export AWS_ACCESS_KEY_ID=xxx
export AWS_SECRET_ACCESS_KEY=xxx
export AWS_REGION=xxx
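
Before calling the embedder, it can help to confirm that the three variables above are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_aws_vars` is hypothetical and not part of Agno or boto3. It only covers the environment-variable route, so it does not apply if you authenticate through an explicit boto3 session instead.

```python
import os
from typing import List, Mapping

# Variable names taken from the export lines above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_aws_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env.

    Hypothetical helper for illustration; not an Agno or boto3 function.
    """
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print(f"Missing AWS settings: {', '.join(missing)}")
else:
    print("All AWS settings are present.")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in tests.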

By default, this embedder uses the cohere.embed-multilingual-v3 model. You must enable access to this model from the AWS Bedrock model catalog before using this embedder.

Run PgVector

docker run -d \
    -e POSTGRES_DB=ai \
    -e POSTGRES_USER=ai \
    -e POSTGRES_PASSWORD=ai \
    -e PGDATA=/var/lib/postgresql/data/pgdata \
    -v pgvolume:/var/lib/postgresql/data \
    -p 5532:5432 \
    --name pgvector \
    agnohq/pgvector:16
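
The container settings above determine the connection URL that the Usage example passes as `db_url`. As a rough sketch of how the pieces map, the snippet below assembles that URL from the same values. The `PgVectorSettings` class is hypothetical (it is not part of Agno), and `localhost` assumes the container runs on the same machine as your script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PgVectorSettings:
    """Hypothetical holder for the values passed to docker run above."""

    user: str = "ai"         # POSTGRES_USER
    password: str = "ai"     # POSTGRES_PASSWORD
    database: str = "ai"     # POSTGRES_DB
    host: str = "localhost"  # assumption: the container runs locally
    port: int = 5532         # host side of the 5532:5432 port mapping

    def db_url(self) -> str:
        """Build a SQLAlchemy-style URL using the psycopg driver."""
        return (
            f"postgresql+psycopg://{self.user}:{self.password}"
            f"@{self.host}:{self.port}/{self.database}"
        )


print(PgVectorSettings().db_url())
```

Note that the port in the URL is the host port (5532), not the port Postgres listens on inside the container (5432). A password containing special characters would also need URL-escaping, which this sketch does not handle.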

Usage

cookbook/embedders/aws_bedrock_embedder.py

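# Import paths as used in Agno 1.x; adjust them if your installed version differs
from agno.document.reader.pdf_reader import PDFUrlReader
from agno.embedder.aws_bedrock import AwsBedrockEmbedder
from agno.knowledge.pdf_url import PDFUrlKnowledgeBase
from agno.vectordb.pgvector import PgVector

# Embed a single sentence
embeddings = AwsBedrockEmbedder().get_embedding(
    "The quick brown fox jumps over the lazy dog."
)
# Print the first five values of the embedding and its number of dimensions
print(f"Embeddings: {embeddings[:5]}")
print(f"Dimensions: {len(embeddings)}")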

# Example usage with a PDF knowledge base
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    reader=PDFUrlReader(
        chunk_size=2048
    ),  # Required: the Cohere model limits each input to 2048 characters
    vector_db=PgVector(
        table_name="recipes",
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        embedder=AwsBedrockEmbedder(),
    ),
)
knowledge_base.load(recreate=False)
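
Once text is embedded, vectors are typically compared by cosine similarity, which is one of the measures a vector database can use when searching. The sketch below shows the calculation in plain Python. The toy 3-dimensional vectors stand in for real embeddings, which would come from `get_embedding` and have 1024 dimensions by default; `cosine_similarity` is an illustrative helper, not an Agno function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # parallel to the query, so close to 1
orthogonal = [0.0, 1.0, 0.0]      # perpendicular to the query, so 0

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Scores near 1 indicate vectors pointing in the same direction, and scores near 0 indicate unrelated directions.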

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| id | str | "cohere.embed-multilingual-v3" | The model ID to use. You need to enable this model in your AWS Bedrock model catalog. |
| dimensions | int | 1024 | The dimensionality of the embeddings generated by the model (1024 for Cohere models). |
| input_type | str | "search_query" | Prepends special tokens to differentiate types. Options: "search_document", "search_query", "classification", "clustering". |
| truncate | Optional[str] | None | How to handle inputs longer than the maximum token length. Options: "NONE", "START", "END". |
| embedding_types | Optional[List[str]] | None | Types of embeddings to return. Options: "float", "int8", "uint8", "binary", "ubinary". |
| aws_region | Optional[str] | None | The AWS region to use. If not provided, falls back to the AWS_REGION environment variable. |
| aws_access_key_id | Optional[str] | None | The AWS access key ID. If not provided, falls back to the AWS_ACCESS_KEY_ID environment variable. |
| aws_secret_access_key | Optional[str] | None | The AWS secret access key. If not provided, falls back to the AWS_SECRET_ACCESS_KEY environment variable. |
| session | Optional[Session] | None | A boto3 Session object to use for authentication. |
| request_params | Optional[Dict[str, Any]] | None | Additional parameters to pass to the API requests. |
| client_params | Optional[Dict[str, Any]] | None | Additional parameters to pass to the boto3 client. |
| client | Optional[AwsClient] | None | An instance of the AWS Bedrock client to use for making API requests. |
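
Several of the parameters above accept only a fixed set of string values. The following sketch checks those values before they reach the embedder, so that a typo fails early with a readable message. The function `build_embedder_kwargs` and the three constant sets are hypothetical helpers built from the options listed above; they are not part of Agno.

```python
from typing import Any, Dict, List, Optional

# Allowed values copied from the Params list above.
INPUT_TYPES = {"search_document", "search_query", "classification", "clustering"}
TRUNCATE_MODES = {"NONE", "START", "END"}
EMBEDDING_TYPES = {"float", "int8", "uint8", "binary", "ubinary"}


def build_embedder_kwargs(
    input_type: str = "search_query",
    truncate: Optional[str] = None,
    embedding_types: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Validate option values and return them as keyword arguments.

    Hypothetical helper for illustration; not an Agno function.
    """
    if input_type not in INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(INPUT_TYPES)}")
    if truncate is not None and truncate not in TRUNCATE_MODES:
        raise ValueError(f"truncate must be one of {sorted(TRUNCATE_MODES)}")
    unknown = sorted(set(embedding_types or []) - EMBEDDING_TYPES)
    if unknown:
        raise ValueError(f"unknown embedding_types: {unknown}")

    # Only include optional settings that were explicitly provided.
    kwargs: Dict[str, Any] = {"input_type": input_type}
    if truncate is not None:
        kwargs["truncate"] = truncate
    if embedding_types is not None:
        kwargs["embedding_types"] = list(embedding_types)
    return kwargs


kwargs = build_embedder_kwargs(input_type="search_document", truncate="END")
print(kwargs)
# The result could then be passed on as AwsBedrockEmbedder(**kwargs).
```

Leaving unset options out of the dictionary lets the embedder apply its own defaults rather than receiving explicit None values.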

Developer Resources