RKnowledge - Knowledge Graph Builder

Build knowledge graphs from any text corpus using LLMs. This skill helps you extract concepts and relationships from documents and store them in a queryable graph database.

When to Use

Use this skill when you need to:

Extract knowledge from documents (PDF, Markdown, HTML, TXT)
Build a knowledge graph for Graph RAG applications
Analyze relationships between concepts in a corpus
Create visual representations of document content
Query extracted knowledge using natural language or Cypher

Quick Start

Initialize

Install and initialize rknowledge

rknowledge init

This creates a configuration file and starts Neo4j via Docker.

Configure API Keys

Use the auth command to configure your LLM provider:

Interactive setup

rknowledge auth

Or specify provider directly

rknowledge auth --provider anthropic

Or set directly with key

rknowledge auth --provider anthropic --key your-key-here

List configured providers

rknowledge auth --list

Alternatively, use environment variables:

export ANTHROPIC_API_KEY=your-key-here

or

export OPENAI_API_KEY=your-key-here

or

export GOOGLE_API_KEY=your-key-here # also accepts GEMINI_API_KEY

or use Ollama for local models (no API key needed)

Build Knowledge Graph

Process a single document

rknowledge build ./document.pdf

Process a directory of documents

rknowledge build ./docs/

Specify provider and model

rknowledge build ./docs/ --provider anthropic --model claude-sonnet-4-20250514

Query the Graph

Natural language search

rknowledge query "What concepts relate to authentication?"

Direct Cypher query

rknowledge query "cypher: MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 10"

Export

Export to JSON

rknowledge export --format json --output graph.json

Export to CSV (creates nodes.csv and edges.csv)

rknowledge export --format csv --output graph

Export to GraphML

rknowledge export --format graphml --output graph.graphml

Export to Cypher statements

rknowledge export --format cypher --output import.cypher
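After a CSV export, a quick sanity check is to look at the column names and row counts before loading the files elsewhere. The following is a minimal Python sketch, not part of rknowledge: summarize_csv is a hypothetical helper, and it assumes the exported files begin with a header row. It does not assume any particular column names, since the export schema is not described here.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file.

    Assumption: the first row of the file is a header row.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count
```

For example, calling summarize_csv on nodes.csv and then on edges.csv (adjusting the paths to wherever the export wrote them) shows at a glance whether the graph was exported with the expected number of nodes and edges.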

Visualize

Open interactive visualization in browser

rknowledge viz

The visualization opens a glassmorphism-style dashboard with:

Entity Filtering: Show/hide specific concept types.
Neighborhood Search: Highlight concepts and their relations.
Interactive Cards: Explore properties and connections.

Commands Reference

Command - Description

rknowledge init - Initialize config and start Neo4j
rknowledge auth - Configure API keys for LLM providers
rknowledge build <path> - Process documents and build graph
rknowledge add - Manually insert relations into the graph
rknowledge query <query> - Search or query the graph
rknowledge path <from> <to> - Find shortest path between concepts
rknowledge stats - Show graph statistics and analytics
rknowledge communities - List detected communities and members
rknowledge export - Export graph to various formats
rknowledge viz - Open interactive visualization in browser
rknowledge doctor - Check system health and diagnose problems

Build Options

Option - Description (Default)

--provider - LLM provider (anthropic, openai, ollama, google). Default: anthropic
--model - Model to use. Default: provider default
--output - Output destination (neo4j, json, csv). Default: neo4j
--tenant - Tenant namespace for knowledge isolation. Default: default
--domain - Domain name for specialized extraction. Default: none
--context - Custom context for extraction prompts. Default: none
--context-file - Path to file containing custom prompt. Default: none
--chunk-size - Text chunk size in characters. Default: 1500
--chunk-overlap - Overlap between chunks. Default: 150
--concurrency, -j - Number of concurrent LLM requests. Default: 4
--append - Append to existing graph (incremental). Default: false

Supported File Types

PDF (.pdf) - Extracts text from PDF documents
Markdown (.md) - Parses and extracts text from Markdown
HTML (.html, .htm) - Extracts text content from HTML
Plain Text (.txt) - Direct text processing

LLM Providers

Anthropic (Recommended)

export ANTHROPIC_API_KEY=your-key
rknowledge build ./docs --provider anthropic --model claude-sonnet-4-20250514

OpenAI

export OPENAI_API_KEY=your-key
rknowledge build ./docs --provider openai --model gpt-4o

Google (Gemini)

Accepts either GOOGLE_API_KEY or GEMINI_API_KEY

export GEMINI_API_KEY=your-key
rknowledge build ./docs --provider google --model gemini-2.0-flash

Ollama (Local - Free)

Start Ollama and pull a model first

ollama pull mistral
rknowledge build ./docs --provider ollama --model mistral

OpenAI-Compatible APIs (Groq, DeepSeek, Mistral, Together, etc.)

The OpenAI provider works with any OpenAI-compatible API. Set base_url in your config file to point to the service:

~/.config/rknowledge/config.toml

Groq (fast inference)

[providers.openai]
api_key = "${GROQ_API_KEY}"
base_url = "https://api.groq.com/openai/v1"
model = "llama-3.3-70b-versatile"

DeepSeek

[providers.openai]
api_key = "${DEEPSEEK_API_KEY}"
base_url = "https://api.deepseek.com/v1"
model = "deepseek-chat"

Mistral

[providers.openai]
api_key = "${MISTRAL_API_KEY}"
base_url = "https://api.mistral.ai/v1"
model = "mistral-large-latest"

Together AI

[providers.openai]
api_key = "${TOGETHER_API_KEY}"
base_url = "https://api.together.xyz/v1"
model = "meta-llama/Llama-3-70b-chat-hf"

OpenRouter (access many models through one API)

[providers.openai]
api_key = "${OPENROUTER_API_KEY}"
base_url = "https://openrouter.ai/api/v1"
model = "anthropic/claude-sonnet-4-20250514"

LM Studio or vLLM (local)

[providers.openai]
api_key = "not-needed"
base_url = "http://localhost:1234/v1"
model = "local-model"

Then build with:

rknowledge build ./docs --provider openai
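The config snippets above reference API keys as ${VAR} placeholders. Assuming these are filled in from the environment when rknowledge runs, a forgotten export is an easy mistake to make. The following Python sketch scans config text for such placeholders and lists the variables that are not set. The helper name missing_env_vars is hypothetical and not part of rknowledge, and the placeholder syntax it matches is inferred from the examples above.

```python
import os
import re

# Matches ${NAME} placeholders such as "${GROQ_API_KEY}".
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return placeholder names in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    missing = []
    for name in PLACEHOLDER.findall(config_text):
        if not env.get(name) and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    config_path = os.path.expanduser("~/.config/rknowledge/config.toml")
    if os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as handle:
            for name in missing_env_vars(handle.read()):
                print(f"environment variable {name} is not set")
```

Running the script before a build prints one line per unset variable and nothing when every placeholder can be resolved.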

Advanced Features

Tenant Isolation

Isolate multiple projects or users within the same Neo4j instance using the --tenant flag:

Add data to project-alpha

rknowledge build ./docs-alpha --tenant alpha

Add data to project-beta

rknowledge build ./docs-beta --tenant beta

Queries only see data from the specified tenant

rknowledge stats --tenant alpha
rknowledge query "Who is the lead dev?" --tenant beta

Domain-Aware Extraction

Customize LLM prompts for specialized domains (medical, legal, technical) using domain flags:

Specify domain-specific context directly

rknowledge build ./medical-docs --domain medical --context "Focus on clinical trials and drug interactions"

Or use a custom prompt file for complex instructions

rknowledge build ./codebase --context-file extraction_prompt.txt

Manual Relation Insertion

Add individual relations directly to the graph without document processing. This is useful for adding "ground truth" or linking concepts that weren't captured by the LLM:

Interactive mode (walks you through node names, types, and relation)

rknowledge add --interactive

Single relation with types

rknowledge add "Knowledge Graph" "contains" "Nodes" --type1 "concept" --type2 "structure"

Single relation for a specific tenant

rknowledge add "Project A" "depends on" "Library B" --tenant my-client --relation "dependency"

Batch import from JSON (see docs for schema)

rknowledge add --from-file relations.json --tenant alpha

Tenant-Aware Visualization

The redesigned visualization dashboard supports all v0.2.0 features:

Visualize a specific project

rknowledge viz --tenant project-x

Filter Sidebar: Toggle entire entity types (e.g., hide all "proximity" nodes) to clean up the graph.
Search & Focus: Use the search bar to find a node; it will automatically highlight and pull in its neighbors.
Deep Exploration: Click any node to see its full property list and all related nodes in a dedicated detail panel.

Example Workflows

Build a Knowledge Base from Documentation

Clone a repo's docs

git clone https://github.com/example/project docs

Build knowledge graph

rknowledge build ./docs --provider anthropic

Query for specific topics

rknowledge query "How does authentication work?"

Analyze Research Papers

Process PDF papers

rknowledge build ./papers/ --chunk-size 2000

Export for further analysis

rknowledge export --format json --output research-graph.json

Create Graph RAG Backend

Build comprehensive graph

rknowledge build ./knowledge-base/

Query programmatically via Neo4j

Connect to bolt://localhost:7687 with neo4j/rknowledge

Neo4j Access

After running rknowledge init, Neo4j is available at:

Browser: http://localhost:7474
Bolt: bolt://localhost:7687
Credentials: neo4j / rknowledge
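Besides Bolt, Neo4j also exposes a transactional HTTP API on the Browser port, which can be handy for a quick programmatic check without installing a driver. The sketch below uses only the Python standard library. It assumes the instance started by rknowledge init serves the /db/neo4j/tx/commit endpoint (available in Neo4j 4.x and 5.x) and uses the default database name neo4j. The helper names build_request and run_cypher are hypothetical.

```python
import base64
import json
import urllib.request

NEO4J_HTTP = "http://localhost:7474"    # Browser/HTTP endpoint listed above
DATABASE = "neo4j"                       # assumption: default database name
USER, PASSWORD = "neo4j", "rknowledge"   # credentials listed above


def build_request(statement, parameters=None):
    """Build a POST request for Neo4j's transactional HTTP endpoint."""
    payload = {
        "statements": [{"statement": statement, "parameters": parameters or {}}]
    }
    token = base64.b64encode(f"{USER}:{PASSWORD}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{NEO4J_HTTP}/db/{DATABASE}/tx/commit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_cypher(statement, parameters=None):
    """Send the statement and return the decoded JSON response.

    Requires a running Neo4j instance; nothing is sent until this is called.
    """
    request = build_request(statement, parameters)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Neo4j running, run_cypher("MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 10") returns a dictionary whose results and errors keys hold the returned rows and any error messages. For production use, the official Bolt drivers are usually the better choice.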

Troubleshooting

Run the built-in diagnostics first:

rknowledge doctor

This checks config, Docker, Neo4j connectivity, LLM providers, and graph status.

Neo4j Connection Failed

Check if Docker is running

docker ps

Restart Neo4j

cd ~/.config/rknowledge
docker compose up -d

API Key Issues

Verify API key is set

echo $ANTHROPIC_API_KEY

Or check config file

cat ~/.config/rknowledge/config.toml

Large Documents

For very large documents, increase chunk size:

rknowledge build ./large-doc.pdf --chunk-size 3000 --chunk-overlap 300
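To see why a larger chunk size helps, it is useful to count chunks: assuming each chunk is sent to the LLM as its own request, fewer chunks means fewer requests. The sketch below is a plain fixed-width character chunker in Python, written as an illustration only. rknowledge's real splitter is not documented here and may, for instance, respect sentence or paragraph boundaries; chunk_text is a hypothetical name.

```python
def chunk_text(text, chunk_size=1500, chunk_overlap=150):
    """Split text into fixed-size character windows that overlap their neighbors."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    document = "x" * 100_000  # stand-in for 100,000 characters of extracted text
    print(len(chunk_text(document)))             # 74 chunks with the defaults
    print(len(chunk_text(document, 3000, 300)))  # 37 chunks with the larger settings
```

Under these assumptions, doubling both values roughly halves the number of chunks while keeping the overlap at 10 percent of the chunk size.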
