pgvector-rag

pgvector setup, embeddings, similarity search, RAG pipelines, Ollama integration, and agentic-kit RAG. Use when asked to "set up vector database", "generate embeddings", "semantic search", "build RAG", "use Ollama", "run local LLM", "configure RAG", "create AI search", "embed documents", or when building any RAG or vector search application with PostgreSQL.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "pgvector-rag" with this command: npx skills add constructive-io/constructive-skills/constructive-io-constructive-skills-pgvector-rag

pgvector & RAG

Complete toolkit for building vector search and RAG (Retrieval-Augmented Generation) applications with PostgreSQL. Covers pgvector schema setup, embedding generation with Ollama, similarity search, full RAG pipelines, and agentic-kit integration.

When to Apply

Use this skill when:

  • Setting up pgvector: Creating tables, indexes, vector storage schema
  • Generating embeddings: Using Ollama to embed documents and chunks
  • Similarity search: Querying vectors with cosine/L2/inner product distance
  • Building RAG: Combining retrieval with LLM generation
  • Ollama integration: Local LLM inference, model management, streaming
  • Agentic-kit RAG: Wiring RAG into agentic-kit chat applications

Architecture

Document → Chunking → Embedding → pgvector Storage
                                        ↓
Query → Embedding → Similarity Search → Context Retrieval → LLM Response
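The chunking step in the pipeline above can be sketched as a fixed-size splitter with overlap. This is an illustration only: the chunkText name and its character-based size/overlap parameters are ours, not part of the skill (a token-aware splitter is a common upgrade).

```typescript
// Split a document into overlapping chunks before embedding.
// Sizes are in characters; overlap preserves context across boundaries.
export function chunkText(
  text: string,
  chunkSize = 1000,
  overlap = 200
): string[] {
  if (overlap >= chunkSize) throw new Error('overlap must be < chunkSize');
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```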

Quick Start

# 1. Start PostgreSQL with pgvector
pgpm docker start
eval "$(pgpm env)"

# 2. Pull Ollama models
ollama pull nomic-embed-text
ollama pull llama3.2

# 3. Create vector storage module
pgpm init my-vectors
cd my-vectors
pgpm add schemas/intelligence
pgpm add schemas/intelligence/tables/documents --requires schemas/intelligence
pgpm add schemas/intelligence/tables/chunks --requires schemas/intelligence/tables/documents

Schema Design

Documents Table

CREATE TABLE intelligence.documents (
    id SERIAL PRIMARY KEY,
    title TEXT,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    embedding VECTOR(768),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

Chunks Table

CREATE TABLE intelligence.chunks (
    id SERIAL PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES intelligence.documents(id) ON DELETE CASCADE,
    content TEXT NOT NULL,
    embedding VECTOR(768),
    chunk_index INTEGER NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_chunks_document_id ON intelligence.chunks(document_id);

Similarity Search Function

CREATE FUNCTION intelligence.find_similar_chunks(
    p_embedding VECTOR(768),
    p_limit INTEGER DEFAULT 5,
    p_similarity_threshold FLOAT DEFAULT 0.7
)
RETURNS TABLE (
    id INTEGER,
    content TEXT,
    similarity FLOAT
) AS $$
BEGIN
    RETURN QUERY
    SELECT
        c.id,
        c.content,
        1 - (c.embedding <=> p_embedding) AS similarity
    FROM intelligence.chunks c
    WHERE c.embedding IS NOT NULL
      AND 1 - (c.embedding <=> p_embedding) > p_similarity_threshold
    ORDER BY c.embedding <=> p_embedding
    LIMIT p_limit;
END;
$$ LANGUAGE plpgsql;
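From application code, this function can be called with a parameterized query. A minimal sketch, assuming node-postgres; the similarChunksQuery helper name is ours:

```typescript
// Build a node-postgres query object that calls the SQL function above.
// The embedding is serialized to pgvector's '[x,y,...]' text form and cast.
export function similarChunksQuery(
  embedding: number[],
  limit = 5,
  threshold = 0.7
): { text: string; values: (string | number)[] } {
  return {
    text: `SELECT id, content, similarity
           FROM intelligence.find_similar_chunks($1::vector, $2, $3)`,
    values: [`[${embedding.join(',')}]`, limit, threshold],
  };
}

// Usage with pg: const { rows } = await pool.query(similarChunksQuery(vec));
```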

Embedding Models

Model               Dimensions   Speed       Quality
nomic-embed-text    768          Fast        Good
mxbai-embed-large   1024         Medium      Better
all-minilm          384          Very fast   Acceptable

Distance Operators

Operator   Type             Use Case
<=>        Cosine           Most common; normalized vectors
<->        Euclidean (L2)   When magnitude matters
<#>        Inner product    Dot-product similarity (note: <#> returns the negative inner product)
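The relationship between the <=> operator and the similarity score used in the SQL function above can be reproduced in plain TypeScript. This is an illustration of the math, not pgvector's implementation:

```typescript
// Cosine distance as pgvector's <=> computes it: 1 - cos(theta).
// The SQL above then converts back with similarity = 1 - distance.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

export function cosineDistance(a: number[], b: number[]): number {
  return 1 - dot(a, b) / (norm(a) * norm(b));
}
```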

TypeScript: OllamaClient

import fetch from 'cross-fetch';

export class OllamaClient {
  private baseUrl: string;

  constructor(baseUrl?: string) {
    this.baseUrl = baseUrl || process.env.OLLAMA_HOST || 'http://localhost:11434';
  }

  async generateEmbedding(text: string, model = 'nomic-embed-text'): Promise<number[]> {
    const response = await fetch(`${this.baseUrl}/api/embeddings`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt: text }),
    });
    if (!response.ok) throw new Error(`Embedding failed: ${response.statusText}`);
    const data = await response.json();
    return data.embedding;
  }

  async generateResponse(prompt: string, context?: string, model = 'llama3.2'): Promise<string> {
    const fullPrompt = context
      ? `Context: ${context}\n\nQuestion: ${prompt}\n\nAnswer:`
      : prompt;
    const response = await fetch(`${this.baseUrl}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt: fullPrompt, stream: false }),
    });
    if (!response.ok) throw new Error(`Generation failed: ${response.statusText}`);
    const data = await response.json();
    return data.response;
  }
}

Vector Format

pgvector expects bracket notation:

const formatVector = (embedding: number[]): string => `[${embedding.join(',')}]`;
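Reading a vector back out, pg drivers return the column in that same bracket text form. A round-trip counterpart (parseVector is our own helper; formatVector is repeated so the snippet stands alone):

```typescript
// Serialize to pgvector's bracket notation (same as above).
export const formatVector = (embedding: number[]): string =>
  `[${embedding.join(',')}]`;

// Inverse of formatVector: pg drivers return vector columns as text.
export const parseVector = (value: string): number[] =>
  value.slice(1, -1).split(',').map(Number);
```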

RAG Query Pattern

// 1. Embed the question
const queryEmbedding = await ollama.generateEmbedding(question);

// 2. Retrieve context (pool is a pg Pool; order best matches first)
const result = await pool.query(
  `SELECT string_agg(content, E'\n\n' ORDER BY similarity DESC) AS context
   FROM intelligence.find_similar_chunks($1::vector, $2)`,
  [formatVector(queryEmbedding), 5]
);

// 3. Generate response with context
const response = await ollama.generateResponse(question, result.rows[0].context);
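One edge case worth handling in the pattern above: string_agg over zero rows yields SQL NULL, so the retrieved context can be null when no chunk clears the threshold. A small guard (the pickContext name is ours):

```typescript
// Fall back to an uncontexted answer rather than sending "Context: null".
export function pickContext(
  rows: Array<{ context: string | null }>
): string | undefined {
  return rows[0]?.context ?? undefined;
}
```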

Indexes for Performance

-- IVFFlat (good balance, add after initial data load)
CREATE INDEX idx_chunks_embedding ON intelligence.chunks
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- HNSW (better recall, more memory)
CREATE INDEX idx_chunks_embedding_hnsw ON intelligence.chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
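Choosing lists for IVFFlat depends on row count; pgvector's tuning guidance suggests roughly rows / 1000 up to about 1M rows and sqrt(rows) beyond that. A helper encoding that rule of thumb (the function is ours):

```typescript
// Rule-of-thumb IVFFlat 'lists' parameter from pgvector's tuning guidance.
export function ivfflatLists(rowCount: number): number {
  return rowCount <= 1_000_000
    ? Math.max(1, Math.round(rowCount / 1000))
    : Math.round(Math.sqrt(rowCount));
}
```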

Environment Variables

Variable                   Default                  Description
OLLAMA_HOST                http://localhost:11434   Ollama server URL
RAG_DATABASE_URL           (none)                   PostgreSQL connection string
RAG_EMBEDDING_MODEL        nomic-embed-text         Embedding model
RAG_CHAT_MODEL             llama3.2                 Chat model
RAG_SIMILARITY_THRESHOLD   0.5                      Minimum similarity score
RAG_CONTEXT_LIMIT          5                        Max chunks to retrieve
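These variables can be read once into a typed config object. A sketch: the names match the table, the defaults are the listed ones, and the RagConfig/loadRagConfig names are ours:

```typescript
export interface RagConfig {
  ollamaHost: string;
  databaseUrl: string | undefined;
  embeddingModel: string;
  chatModel: string;
  similarityThreshold: number;
  contextLimit: number;
}

// Call as loadRagConfig(process.env) at startup.
export function loadRagConfig(
  env: Record<string, string | undefined>
): RagConfig {
  return {
    ollamaHost: env.OLLAMA_HOST ?? 'http://localhost:11434',
    databaseUrl: env.RAG_DATABASE_URL,
    embeddingModel: env.RAG_EMBEDDING_MODEL ?? 'nomic-embed-text',
    chatModel: env.RAG_CHAT_MODEL ?? 'llama3.2',
    similarityThreshold: Number(env.RAG_SIMILARITY_THRESHOLD ?? '0.5'),
    contextLimit: Number(env.RAG_CONTEXT_LIMIT ?? '5'),
  };
}
```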

Troubleshooting Quick Reference

Issue                              Quick Fix
"type vector does not exist"       pgvector extension not installed; use a pgvector-enabled Docker image
"Connection refused" to Ollama     Start Ollama: ollama serve
"Model not found"                  Pull the model: ollama pull <model>
Dimension mismatch                 Ensure VECTOR(n) matches the model's output dimensions
No results returned                Lower the similarity threshold
Slow queries                       Add an IVFFlat or HNSW index

Reference Guide

Consult these reference files for detailed documentation on specific topics:

Reference                         Topic                               Consult When
references/setup.md               pgvector schema setup               Creating tables, indexes, vector dimensions, pgpm module structure
references/embeddings.md          Generating and storing embeddings   OllamaClient, document chunking, ingestion pipeline, batch processing
references/similarity-search.md   Similarity search queries           Distance operators, thresholds, metadata filtering, performance tuning
references/rag-pipeline.md        Complete RAG pipeline               RAGService implementation, streaming, chat history, prompt engineering
references/ollama.md              Ollama integration                  Installation, API endpoints, model selection, chat API, CI/CD setup
references/agentic-kit.md         Agentic-kit RAG                     RAGProvider, createRAGKit, useAgent hook, environment config, database schema

Cross-References

Related skills (separate from this skill):

  • graphile-pgvector — Integrate pgvector with PostGraphile v5 GraphQL
  • pgpm (references/docker.md) — PostgreSQL container management for pgvector
  • github-workflows-ollama — GitHub Actions for Ollama and pgvector testing
