bedrock-knowledge-bases

Amazon Bedrock Knowledge Bases for RAG (Retrieval-Augmented Generation). Create knowledge bases with vector stores, ingest data from S3/web/Confluence/SharePoint, configure chunking strategies, query with retrieve and generate APIs, manage sessions. Use when building RAG applications, implementing semantic search, creating document Q&A systems, integrating knowledge bases with agents, optimizing chunking for accuracy, or querying enterprise knowledge.


Amazon Bedrock Knowledge Bases

Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.

Overview

What It Does

Amazon Bedrock Knowledge Bases provides:

  • Data Ingestion: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
  • Embedding Generation: Convert text to vectors using foundation models
  • Vector Storage: Store embeddings in multiple vector database options
  • Retrieval: Semantic and hybrid search with metadata filtering
  • Generation: RAG workflows with source attribution
  • Session Management: Multi-turn conversations with context
  • Chunking Strategies: Fixed, semantic, hierarchical, and custom chunking

When to Use This Skill

Use this skill when you need to:

  • Build RAG applications for document Q&A
  • Implement semantic search over enterprise knowledge
  • Create chatbots with knowledge bases
  • Integrate retrieval with Bedrock Agents
  • Configure optimal chunking strategies
  • Query documents with source attribution
  • Manage multi-turn conversations with context
  • Optimize RAG performance and cost

Key Capabilities

  1. Multiple Vector Store Options: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
  2. Flexible Data Sources: S3, web crawlers, Confluence, SharePoint, Salesforce
  3. Advanced Chunking: Fixed-size, semantic, hierarchical, custom Lambda
  4. Hybrid Search: Combine semantic (vector) and keyword search
  5. Session Management: Built-in conversation context tracking
  6. GraphRAG: Relationship-aware retrieval with Neptune Analytics
  7. Cost Optimization: S3 Vectors for up to 90% storage savings

Quick Start

Basic RAG Workflow

import boto3
import json

# Initialize clients
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# 1. Create Knowledge Base
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")

# 2. Add S3 Data Source
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)

data_source_id = ds_response['dataSource']['dataSourceId']

# 3. Start Ingestion
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)

print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")

# 4. Query with Retrieve and Generate
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)

print(f"Answer: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")

Vector Store Options

1. Amazon OpenSearch Serverless

Best for: Production RAG applications with auto-scaling requirements

Benefits:

  • Fully managed, serverless operation
  • Auto-scaling compute and storage
  • High availability with multi-AZ deployment
  • Fast query performance

Configuration:

storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}

2. Amazon S3 Vectors (Preview)

Best for: Cost-optimized, large-scale RAG applications

Benefits:

  • Up to 90% cost reduction for vector storage
  • Built-in vector support in S3
  • Subsecond query performance
  • Massive scale and durability

Ideal Use Cases:

  • Large document collections (millions of chunks)
  • Cost-sensitive applications
  • Archival knowledge bases
  • Low-to-medium QPS workloads

Configuration:

storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
        'bucketArn': 'arn:aws:s3:::my-vector-bucket',
        'prefix': 'vectors/'
    }
}

Limitations:

  • Still in preview (no CloudFormation/CDK support yet)
  • Not suitable for high QPS, millisecond-latency requirements
  • Best for cost optimization over ultra-low latency

3. Amazon Neptune Analytics (GraphRAG)

Best for: Interconnected knowledge domains requiring relationship-aware retrieval

Benefits:

  • Automatic graph creation linking related content
  • Improved retrieval accuracy through relationships
  • Comprehensive responses leveraging knowledge graph
  • Explainable results with relationship context

Use Cases:

  • Legal document analysis with case precedents
  • Scientific research with paper citations
  • Product catalogs with dependencies
  • Organizational knowledge with team relationships

Configuration:

storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        'vectorSearchConfiguration': {
            'vectorField': 'embedding'
        }
    }
}

4. Amazon OpenSearch Service Managed Cluster

Best for: Existing OpenSearch infrastructure, advanced customization

Configuration:

storageConfiguration={
    'type': 'OPENSEARCH_SERVICE',
    'opensearchServiceConfiguration': {
        'clusterArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

5. Third-Party Vector Databases

Pinecone:

storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

MongoDB Atlas:

storageConfiguration={
    'type': 'MONGODB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

Redis Enterprise Cloud:

storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}

Data Source Configuration

1. Amazon S3

Supported File Types: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX

bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)

2. Web Crawler

Automatic website scraping and indexing:

bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Pages per minute
                }
            }
        }
    }
)

3. Confluence

bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)

4. SharePoint

bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)

5. Salesforce

bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)

Chunking Strategies

1. Fixed-Size Chunking

Best for: Simple documents with uniform structure

How it works: Splits text into chunks of fixed token size with overlap

Parameters:

  • maxTokens: 200-8192 tokens (typically 512-1024)
  • overlapPercentage: 10-50% (typically 20%)

Configuration:

vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}

Use Cases:

  • Blog posts and articles
  • Technical documentation with consistent formatting
  • FAQs and Q&A content
  • Simple text files

Pros:

  • Fast and predictable
  • No additional costs
  • Easy to tune

Cons:

  • May split semantic units awkwardly
  • Doesn't respect document structure
  • Can break context mid-sentence

2. Semantic Chunking

Best for: Documents without clear boundaries (legal, technical, academic)

How it works: Uses sentence similarity to group related content

Parameters:

  • maxTokens: 20-8192 tokens (typically 300-500)
  • bufferSize: Number of neighboring sentences (default: 1)
  • breakpointPercentileThreshold: Similarity threshold (recommended: 95%)

Configuration:

vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}

Use Cases:

  • Legal documents and contracts
  • Academic papers
  • Technical specifications
  • Medical records
  • Research reports

Pros:

  • Preserves semantic meaning
  • Better context preservation
  • Improved retrieval accuracy

Cons:

  • Additional cost (foundation model usage)
  • Slower ingestion
  • Less predictable chunk sizes

Cost Consideration: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.

3. Hierarchical Chunking

Best for: Complex documents with nested structure

How it works: Creates parent and child chunks; retrieves child, returns parent for context

Parameters:

  • levelConfigurations: Array of chunk sizes (parent → child)
  • overlapTokens: Overlap between chunks

Configuration:

vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {
                    'maxTokens': 1500  # Parent chunk (comprehensive context)
                },
                {
                    'maxTokens': 300   # Child chunk (focused retrieval)
                }
            ],
            'overlapTokens': 60
        }
    }
}

Use Cases:

  • Technical manuals with sections and subsections
  • Academic papers with abstract, sections, and subsections
  • Legal documents with articles and clauses
  • Product documentation with categories and details

How Retrieval Works:

  1. Query matches against child chunks (fast, focused)
  2. Returns parent chunks (comprehensive context)
  3. Best of both: precision retrieval + complete context

Pros:

  • Optimal balance of precision and context
  • Excellent for nested documents
  • Better accuracy for complex queries

Cons:

  • More complex configuration
  • Larger storage footprint
  • Requires understanding of document structure

4. Custom Chunking (Lambda)

Best for: Specialized domain logic, custom parsing requirements

How it works: Invoke Lambda function for custom chunking logic

Configuration:

vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Custom via Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}

Example Lambda Handler:

# Lambda function for custom chunking
import json

def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents

    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """

    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }

Use Cases:

  • Medical records with structured sections (SOAP notes)
  • Financial documents with tables and calculations
  • Code documentation with code blocks and explanations
  • Domain-specific formats (HL7, FHIR, etc.)

Pros:

  • Complete control over chunking logic
  • Can handle any document format
  • Integrate domain expertise

Cons:

  • Requires Lambda development and maintenance
  • Additional operational complexity
  • Harder to debug and iterate

Chunking Strategy Selection Guide

| Document Type | Recommended Strategy | Rationale |
|---|---|---|
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |
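
Where these recommendations need to be applied programmatically, the table above can be encoded as a small helper that returns a ready-made vectorIngestionConfiguration. This is a sketch only: the document-type keys and default token sizes are illustrative assumptions layered on the configuration shapes shown earlier, not part of the Bedrock API.

# Sketch: map a document type to a chunking configuration.
# The preset keys and token sizes below are illustrative assumptions, not API requirements.
CHUNKING_PRESETS = {
    'article': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {'maxTokens': 512, 'overlapPercentage': 20}
    },
    'legal_document': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300, 'bufferSize': 1, 'breakpointPercentileThreshold': 95
        }
    },
    'technical_manual': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [{'maxTokens': 1500}, {'maxTokens': 300}],
            'overlapTokens': 60
        }
    }
}

def chunking_config_for(document_type: str) -> dict:
    """Return a vectorIngestionConfiguration for a known document type (defaults to fixed-size)."""
    preset = CHUNKING_PRESETS.get(document_type, CHUNKING_PRESETS['article'])
    return {'chunkingConfiguration': preset}

# Example: vectorIngestionConfiguration=chunking_config_for('technical_manual')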

Retrieval Operations

1. Retrieve API (Retrieval Only)

Returns raw retrieved chunks without generation.

Use Cases:

  • Custom generation logic
  • Debugging retrieval quality
  • Building custom RAG pipelines
  • Integrating with non-Bedrock models

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # Options: SEMANTIC or HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)

# Process retrieved chunks
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")

2. Retrieve and Generate API (RAG)

Returns generated response with source attribution.

Use Cases:

  • Complete RAG workflows
  • Question answering
  • Document summarization
  • Chatbots with knowledge bases

response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        print(f"    Relevance Score: {reference.get('score', 'N/A')}")

3. Multi-Turn Conversations with Session Management

Bedrock automatically manages conversation context across turns.

# First turn - creates session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is Amazon Bedrock Knowledge Bases?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")

# Follow-up turn - reuse session for context
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What chunking strategies does it support?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)

print(f"Follow-up Response: {response2['output']['text']}")

# Third turn
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Which strategy would you recommend for legal documents?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)

print(f"Third Response: {response3['output']['text']}")

4. Advanced Metadata Filtering

Filter retrieval by metadata attributes for precision.

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)

Supported Filter Operators:

  • equals: Exact match
  • notEquals: Not equal
  • greaterThan, greaterThanOrEquals: Numeric comparison
  • lessThan, lessThanOrEquals: Numeric comparison
  • in: Match any value in array
  • notIn: Not match any value in array
  • startsWith: String prefix match
  • andAll: Combine filters with AND
  • orAll: Combine filters with OR
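
The retrieval example above only combines conditions with andAll; the same nesting works for orAll and the other operators. The following sketch matches documents whose department starts with a prefix or whose category appears in a list; the metadata keys 'department' and 'category' are assumed to exist in your documents.

import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Sketch: OR two conditions together ('department' and 'category' are assumed metadata keys)
or_filter = {
    'orAll': [
        {'startsWith': {'key': 'department', 'value': 'eng'}},
        {'in': {'key': 'category', 'value': ['runbook', 'architecture']}}
    ]
}

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'deployment checklist'},
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'filter': or_filter
        }
    }
)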

Ingestion Management

1. Start Ingestion Job

ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    clientToken='unique-idempotency-token-123'
)

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")

2. Monitor Ingestion Job

# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)

print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats['numberOfDocumentsScanned']}")
    print(f"Documents Indexed: {stats['numberOfDocumentsIndexed']}")
    print(f"Documents Failed: {stats['numberOfDocumentsFailed']}")

# Wait for completion
import time

while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )

    current_status = status['ingestionJob']['status']

    if current_status in ['COMPLETE', 'FAILED']:
        print(f"Ingestion job {current_status}")
        break

    print(f"Status: {current_status}, waiting...")
    time.sleep(30)

3. List Ingestion Jobs

list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")

Integration with Bedrock Agents

1. Agent with Knowledge Base Action

bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

# Create agent with knowledge base
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)

agent_id = agent_response['agent']['agentId']

# Associate knowledge base with agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)

# Prepare and create alias
bedrock_agent.prepare_agent(agentId=agent_id)

alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)

agent_alias_id = alias_response['agentAlias']['agentAliasId']

# Invoke agent (automatically queries knowledge base)
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)

for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())

2. Agent with Multiple Knowledge Bases

# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)

# Agent automatically searches all knowledge bases and combines results

Best Practices

1. Chunking Strategy Selection

Decision Framework:

  1. Simple, uniform documents → Fixed-size chunking

    • Blog posts, articles, simple FAQs
    • Fast, predictable, cost-effective
  2. Documents without clear boundaries → Semantic chunking

    • Legal documents, contracts, academic papers
    • Preserves semantic meaning, better accuracy
    • Consider additional cost
  3. Nested, hierarchical documents → Hierarchical chunking

    • Technical manuals, product docs, research papers
    • Best balance of precision and context
    • Optimal for complex structures
  4. Specialized formats → Custom Lambda chunking

    • Medical records (HL7, FHIR), code docs, custom formats
    • Complete control, domain expertise
    • Higher operational complexity

Tuning Guidelines:

  • Fixed-size: Start with 512 tokens, 20% overlap
  • Semantic: Start with 300 tokens, bufferSize=1, threshold=95%
  • Hierarchical: Parent 1500 tokens, child 300 tokens, overlap 60 tokens
  • Custom: Test extensively with domain experts

2. Retrieval Optimization

Number of Results:

  • Start with 5-10 results
  • Increase if answers lack detail
  • Decrease if too much noise

Search Type:

  • SEMANTIC: Pure vector similarity (faster, good for conceptual queries)
  • HYBRID: Vector + keyword (better recall, recommended for production)

Use Hybrid Search when:

  • Queries contain specific terms or names
  • Need to match exact keywords
  • Domain has specialized vocabulary

Use Semantic Search when:

  • Purely conceptual queries
  • Prioritizing speed over perfect recall
  • Well-embedded domain knowledge
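
When in doubt, compare the two search types empirically. A minimal sketch, assuming a test knowledge base ID, runs the same query with each overrideSearchType and prints the top relevance scores:

import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Sketch: run the same query with both search types and compare top relevance scores
def compare_search_types(kb_id: str, query: str, top_k: int = 5) -> None:
    for search_type in ('SEMANTIC', 'HYBRID'):
        response = bedrock_agent_runtime.retrieve(
            knowledgeBaseId=kb_id,
            retrievalQuery={'text': query},
            retrievalConfiguration={
                'vectorSearchConfiguration': {
                    'numberOfResults': top_k,
                    'overrideSearchType': search_type
                }
            }
        )
        scores = [round(result['score'], 3) for result in response['retrievalResults']]
        print(f"{search_type}: top scores {scores}")

compare_search_types('KB123456', 'rotation schedule for API keys')

If both configurations return the same top chunks, SEMANTIC is usually sufficient for that workload; if keyword-heavy queries score poorly under SEMANTIC, prefer HYBRID.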

Metadata Filters:

  • Always use when applicable
  • Dramatically improves precision
  • Reduces retrieval latency
  • Examples: document_type, publish_date, category, author

3. Cost Optimization

S3 Vectors:

  • Use for large-scale knowledge bases (millions of chunks)
  • Up to 90% cost savings vs. OpenSearch
  • Ideal for cost-sensitive applications
  • Trade-off: Slightly higher latency

Semantic Chunking:

  • Incurs foundation model costs during ingestion
  • Consider cost vs. accuracy benefit
  • May not be worth it for simple documents
  • Best for complex, high-value content

Ingestion Frequency:

  • Schedule ingestion during off-peak hours
  • Use incremental updates when possible
  • Don't re-ingest unchanged documents

Model Selection:

  • Use smaller embedding models when accuracy permits
  • Titan Embed Text v2 is cost-effective
  • Consider Cohere Embed for multilingual

Token Usage:

  • Monitor generation token usage
  • Set appropriate maxTokens limits
  • Use prompt templates to control verbosity

4. Session Management

Always Reuse Sessions:

  • Pass sessionId for follow-up turns
  • Bedrock handles context automatically
  • No manual conversation history needed

Session Lifecycle:

  • Sessions expire after inactivity (default: 60 minutes)
  • Create new session for unrelated conversations
  • Use unique sessionId per user/conversation

Context Limits:

  • Monitor conversation length
  • Long sessions may hit context limits
  • Consider summarization for very long conversations
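
Because sessions expire after inactivity, production code should tolerate a stale sessionId being rejected. A minimal sketch, assuming an expired session surfaces as a client-side error (confirm the exact error code for your SDK version), retries once without the session:

import boto3
from botocore.exceptions import ClientError

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

def query_with_session(kb_id: str, model_arn: str, text: str, session_id: str = None):
    """Query a knowledge base, starting a fresh session if the supplied one is rejected."""
    request = {
        'input': {'text': text},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {'knowledgeBaseId': kb_id, 'modelArn': model_arn}
        }
    }
    if session_id:
        request['sessionId'] = session_id
    try:
        response = bedrock_agent_runtime.retrieve_and_generate(**request)
    except ClientError:
        # Assumption: an expired or invalid session is reported as a client error;
        # confirm the specific error code for your SDK version. Retry without the session.
        if not session_id:
            raise
        request.pop('sessionId')
        response = bedrock_agent_runtime.retrieve_and_generate(**request)
    return response['output']['text'], response['sessionId']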

5. GraphRAG with Neptune

When to Use:

  • Interconnected knowledge domains
  • Relationship-aware queries
  • Need for explainability
  • Complex knowledge graphs

Benefits:

  • Automatic graph creation
  • Improved accuracy through relationships
  • Comprehensive answers
  • Explainable results

Considerations:

  • Higher setup complexity
  • Neptune Analytics costs
  • Best for domains with rich relationships

6. Data Source Management

S3 Best Practices:

  • Organize with clear prefixes
  • Use inclusion/exclusion filters
  • Maintain consistent metadata
  • Version documents when updating

Web Crawler:

  • Set appropriate rate limits
  • Use robots.txt for guidance
  • Monitor for broken links
  • Schedule regular re-crawls

Confluence/SharePoint:

  • Filter by spaces/sites
  • Exclude archived content
  • Use fine-grained permissions
  • Schedule incremental syncs

Metadata Enrichment:

  • Add custom metadata to documents
  • Include: document_type, publish_date, category, author, version
  • Enables powerful filtering
  • Improves retrieval precision
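
For S3 data sources, custom metadata is typically supplied as a sidecar JSON file uploaded alongside each document. The sketch below assumes the '<document>.metadata.json' naming convention and 'metadataAttributes' payload shape; verify both against the current Bedrock Knowledge Bases documentation before relying on them.

import json
import boto3

s3 = boto3.client('s3')

# Sketch: attach custom metadata to an S3 document via a sidecar file.
# The sidecar name and 'metadataAttributes' shape are assumptions to verify.
metadata = {
    'metadataAttributes': {
        'document_type': 'security_guide',
        'publish_year': 2025,
        'category': 'production',
        'author': 'platform-team'
    }
}

s3.put_object(
    Bucket='my-docs-bucket',
    Key='docs/technical/security-guide.pdf.metadata.json',
    Body=json.dumps(metadata).encode('utf-8'),
    ContentType='application/json'
)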

7. Monitoring and Debugging

Enable CloudWatch Logs:

# Monitor retrieval quality
# Track: query latency, retrieval scores, generation quality
# Set alarms for: high latency, low scores, high error rates
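
Query latency can also be measured client-side and published as a custom CloudWatch metric to alarm on. This is a sketch; the 'Custom/KnowledgeBase' namespace and 'RetrieveLatencyMs' metric name are arbitrary choices, not Bedrock-defined metrics.

import time
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# Time a retrieval call and publish the latency as a custom metric
start = time.monotonic()
bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)
elapsed_ms = (time.monotonic() - start) * 1000

cloudwatch.put_metric_data(
    Namespace='Custom/KnowledgeBase',   # assumed custom namespace
    MetricData=[{
        'MetricName': 'RetrieveLatencyMs',  # assumed metric name
        'Value': elapsed_ms,
        'Unit': 'Milliseconds',
        'Dimensions': [{'Name': 'KnowledgeBaseId', 'Value': 'KB123456'}]
    }]
)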

Test Retrieval Quality:

# Use retrieve API to debug
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")

Common Issues:

  1. Low Retrieval Scores:

    • Check chunking strategy
    • Verify embedding model
    • Ensure documents are properly ingested
    • Consider semantic or hierarchical chunking
  2. Irrelevant Results:

    • Add metadata filters
    • Use hybrid search
    • Refine chunking strategy
    • Increase numberOfResults
  3. Missing Information:

    • Verify data source configuration
    • Check ingestion job status
    • Ensure documents are not excluded by filters
    • Increase numberOfResults
  4. Slow Retrieval:

    • Use metadata filters to narrow scope
    • Optimize vector database configuration
    • Remember that S3 Vectors trades latency for cost; prefer OpenSearch Serverless when latency is critical
    • Reduce numberOfResults

8. Security Best Practices

IAM Permissions:

  • Use least privilege for Knowledge Base role
  • Separate roles for data sources, ingestion, retrieval
  • Enable VPC endpoints for private connectivity
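
As a rough illustration of least privilege, the Knowledge Base service role generally needs to invoke the embedding model, read the data source bucket, and write to the vector store. The policy below is a sketch only; the exact actions and resource ARNs must be verified against the Bedrock documentation for your vector store and data sources.

import json
import boto3

# Sketch of a narrowly scoped policy for the Knowledge Base service role.
# Actions and ARNs are illustrative assumptions; verify against the official documentation.
policy_document = {
    'Version': '2012-10-17',
    'Statement': [
        {   # Invoke the embedding model used for ingestion and queries
            'Effect': 'Allow',
            'Action': ['bedrock:InvokeModel'],
            'Resource': ['arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0']
        },
        {   # Read documents from the data source bucket
            'Effect': 'Allow',
            'Action': ['s3:GetObject', 's3:ListBucket'],
            'Resource': ['arn:aws:s3:::my-docs-bucket', 'arn:aws:s3:::my-docs-bucket/*']
        },
        {   # Access the OpenSearch Serverless collection backing the vector index
            'Effect': 'Allow',
            'Action': ['aoss:APIAccessAll'],
            'Resource': ['arn:aws:aoss:us-east-1:123456789012:collection/kb-collection']
        }
    ]
}

iam = boto3.client('iam')
iam.create_policy(
    PolicyName='BedrockKBLeastPrivilege',
    PolicyDocument=json.dumps(policy_document)
)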

Data Encryption:

  • All data encrypted at rest (AWS KMS)
  • Data encrypted in transit (TLS)
  • Use customer-managed KMS keys for compliance

Access Control:

  • Use IAM policies to control who can query
  • Implement fine-grained access control
  • Monitor access with CloudTrail

PII Handling:

  • Use Bedrock Guardrails for PII redaction
  • Implement data masking for sensitive fields
  • Consider custom Lambda for advanced PII handling
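
Guardrails can be attached directly to retrieve_and_generate so generated answers are filtered for PII. A minimal sketch, assuming an existing guardrail ID and version (verify the guardrailConfiguration placement against the current boto3 API):

import boto3

bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# 'gr-abc123' and '1' are placeholder guardrail ID and version
response = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'Summarize the customer escalation process'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'generationConfiguration': {
                'guardrailConfiguration': {
                    'guardrailId': 'gr-abc123',
                    'guardrailVersion': '1'
                }
            }
        }
    }
)

print(response['output']['text'])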

Complete Production Example

End-to-End RAG Application

import boto3
import json
from typing import List, Dict, Optional

class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.region_name = region_name
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""

        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
                    'embeddingModelArn': f'arn:aws:bedrock:{self.region_name}::foundation-model/{embedding_model}'
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""

        if chunking_config is None:
            chunking_config = {
                'maxTokens': 512,
                'overlapPercentage': 20
            }

        vector_ingestion_config = {
            'chunkingConfiguration': {
                'chunkingStrategy': chunking_strategy
            }
        }

        if chunking_strategy == 'FIXED_SIZE':
            vector_ingestion_config['chunkingConfiguration']['fixedSizeChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'SEMANTIC':
            vector_ingestion_config['chunkingConfiguration']['semanticChunkingConfiguration'] = chunking_config
        elif chunking_strategy == 'HIERARCHICAL':
            vector_ingestion_config['chunkingConfiguration']['hierarchicalChunkingConfiguration'] = chunking_config

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""

        import time

        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # Wait for completion
        while True:
            status_response = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )

            status = status_response['ingestionJob']['status']

            if status == 'COMPLETE':
                print(f"Ingestion completed successfully")
                if 'statistics' in status_response['ingestionJob']:
                    stats = status_response['ingestionJob']['statistics']
                    print(f"Documents indexed: {stats.get('numberOfDocumentsIndexed', 0)}")
                break
            elif status == 'FAILED':
                print(f"Ingestion failed")
                break

            print(f"Ingestion status: {status}")
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation


# Example Usage
if __name__ == '__main__':
    rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')

    # Create knowledge base
    kb_id = rag.create_knowledge_base(
        name='production-docs-kb',
        description='Production documentation knowledge base',
        role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
        vector_store_config={
            'type': 'OPENSEARCH_SERVERLESS',
            'opensearchServerlessConfiguration': {
                'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
                'vectorIndexName': 'bedrock-kb-index',
                'fieldMapping': {
                    'vectorField': 'bedrock-knowledge-base-default-vector',
                    'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                    'metadataField': 'AMAZON_BEDROCK_METADATA'
                }
            }
        }
    )

    # Add data source
    ds_id = rag.add_s3_data_source(
        knowledge_base_id=kb_id,
        name='technical-docs',
        bucket_arn='arn:aws:s3:::my-docs-bucket',
        inclusion_prefixes=['docs/'],
        chunking_strategy='HIERARCHICAL',
        chunking_config={
            'levelConfigurations': [
                {'maxTokens': 1500},
                {'maxTokens': 300}
            ],
            'overlapTokens': 60
        }
    )

    # Ingest data
    rag.ingest_data(kb_id, ds_id)

    # Single query
    result = rag.query(
        knowledge_base_id=kb_id,
        query='What are the best practices for RAG applications?',
        metadata_filter={
            'equals': {
                'key': 'document_type',
                'value': 'best_practices'
            }
        }
    )

    print(f"Answer: {result['answer']}")
    print(f"\nSources:")
    for citation in result['citations']:
        for ref in citation['retrievedReferences']:
            print(f"  - {ref['location']}")

    # Multi-turn conversation
    conversation = rag.multi_turn_conversation(
        knowledge_base_id=kb_id,
        queries=[
            'What is hierarchical chunking?',
            'When should I use it?',
            'What are the configuration parameters?'
        ]
    )

    for turn in conversation:
        print(f"\nQ: {turn['query']}")
        print(f"A: {turn['answer']}")

Related Skills

Amazon Bedrock Core Skills

  • bedrock-guardrails: Content safety, PII redaction, hallucination detection
  • bedrock-agents: Agentic workflows with tool use and knowledge bases
  • bedrock-flows: Visual workflow builder for generative AI
  • bedrock-model-customization: Fine-tuning, reinforcement fine-tuning, distillation
  • bedrock-prompt-management: Prompt versioning and deployment

AWS Infrastructure Skills

  • opensearch-serverless: Vector database configuration and management
  • neptune-analytics: GraphRAG configuration and queries
  • s3-management: S3 bucket configuration for data sources and vectors
  • iam-bedrock: IAM roles and policies for Knowledge Bases

Observability Skills

  • cloudwatch-bedrock-monitoring: Monitor Knowledge Bases metrics and logs
  • bedrock-cost-optimization: Track and optimize Knowledge Bases costs

Additional Resources

Official Documentation

Best Practices

Research Document

  • /mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md - Section 2 (Complete Knowledge Bases research)
