insight-engine

Data-driven insights from operational logs: collect → stats → LLM interpretation → Notion.

Architecture

collect (Python stats only)
  ├── Langfuse OTEL traces/scores/observations
  ├── OpenClaw/gateway logs
  ├── Git activity
  └── Control plane scores
↓
build_*_data_packet()  ← all stats computed in Python before LLM call
↓
call_claude(system_prompt, structured_json)  ← LLM interprets, doesn't compute
↓
write_*_reflection() → Notion

See references/architecture.md for full design rationale.
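
The flow above can be sketched end to end in a few lines. This is a minimal illustration, not the code in engine.py: the collector, the packet fields, and the stubbed bodies of call_claude and write_daily_reflection are all assumptions made so the example runs without network access.

```python
import json
from statistics import mean


def collect_latencies():
    """Hypothetical collector standing in for a Langfuse, log, or Git fetch."""
    return [120.0, 130.0, 140.0, 150.0]


def build_daily_data_packet(latencies):
    """Compute every number in Python so the LLM only has to interpret."""
    return {
        "n": len(latencies),
        "latency_ms_mean": mean(latencies),
        "latency_ms_max": max(latencies),
    }


def call_claude(system_prompt, structured_json):
    """Stub for the LLM call; a real version would call the Anthropic API."""
    return f"[interpretation of {len(structured_json)} characters of stats]"


def write_daily_reflection(text, sink):
    """Stub for the Notion write; here the text is appended to a list."""
    sink.append(text)


def run_daily(sink):
    """collect -> build packet -> LLM interpretation -> write."""
    packet = build_daily_data_packet(collect_latencies())
    payload = json.dumps(packet, sort_keys=True)
    reflection = call_claude("Cite a specific number for every claim.", payload)
    write_daily_reflection(reflection, sink)
    return packet
```

The point of the shape is that the LLM receives only a serialized packet of finished statistics, so it has nothing left to compute.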

Quick start

# Install deps
pip install anthropic requests pyyaml

# Configure
cp scripts/config/analyst.yaml.example config/analyst.yaml
# Edit config/analyst.yaml — set langfuse URL, notion IDs, model choices

# Dry run (local Ollama, no Notion write)
python3 scripts/src/engine.py --mode daily --dry-run

# Print data packet + prompt to stdout (for agent consumption, no API calls)
python3 scripts/src/engine.py --mode daily --data-only

# Live run
python3 scripts/src/engine.py --mode daily
python3 scripts/src/engine.py --mode weekly
python3 scripts/src/engine.py --mode monthly
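
How the three invocation styles differ can be sketched with argparse. This is an assumption about how the flags might be wired, not the actual parser in engine.py; plan_run and its return shape are hypothetical, as is the choice to let --data-only take precedence over --dry-run.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags shown in the quick start commands."""
    parser = argparse.ArgumentParser(prog="engine.py")
    parser.add_argument("--mode", choices=["daily", "weekly", "monthly"], required=True)
    parser.add_argument("--dry-run", action="store_true",
                        help="use a local model and skip the Notion write")
    parser.add_argument("--data-only", action="store_true",
                        help="print the data packet and prompt, make no API calls")
    return parser.parse_args(argv)


def plan_run(args):
    """Hypothetical helper: decide which side effects a run performs."""
    if args.data_only:
        return {"llm": None, "write_notion": False}
    if args.dry_run:
        return {"llm": "ollama", "write_notion": False}
    return {"llm": "claude", "write_notion": True}
```

For example, plan_run(parse_args(["--mode", "daily", "--dry-run"])) selects the local model and disables the Notion write.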

Required env vars

ANTHROPIC_API_KEY=sk-ant-...    # Anthropic API key
NOTION_API_KEY=secret_...       # Notion integration token
LANGFUSE_BASE_URL=http://localhost:3100   # Langfuse server URL
LANGFUSE_PUBLIC_KEY=pk-lf-...   # Langfuse public key
LANGFUSE_SECRET_KEY=sk-lf-...   # Langfuse secret key
NOTION_ROOT_PAGE_ID=<uuid>      # Root Notion page for reports
NOTION_DAILY_DB_ID=<uuid>       # Notion database for daily entries

Or configure in config/analyst.yaml.
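
A sketch of resolving these settings from the environment with a file fallback follows. Two things are assumed here because the README does not state them: that environment variables take precedence over the YAML file, and that the file uses the same key names. The file contents are passed in as an already-parsed dict so the example stays independent of any YAML library.

```python
import os

REQUIRED = [
    "ANTHROPIC_API_KEY",
    "NOTION_API_KEY",
    "LANGFUSE_BASE_URL",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "NOTION_ROOT_PAGE_ID",
    "NOTION_DAILY_DB_ID",
]


def resolve_settings(file_config, env=None):
    """Return (resolved, missing); env values win over file values (assumed)."""
    env = os.environ if env is None else env
    resolved, missing = {}, []
    for name in REQUIRED:
        value = env.get(name) or file_config.get(name)
        if value:
            resolved[name] = value
        else:
            missing.append(name)
    return resolved, missing
```

Reporting the missing names up front lets a run fail before any collection or API call happens.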

Key design principles

Stats before LLM: Python computes all numbers. The LLM interprets them and does not aggregate.
Citation-enforcing prompts: system prompts require every claim to cite a specific number.
No hallucinated trends: with fewer than 7 data points, the report states "insufficient data (n=X)" instead of a trend.
Dry-run mode: uses local Ollama (free) to preview output and skips the Notion write.
Data-only mode: outputs the full data packet and prompts for agent/subagent use.
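
The minimum-sample rule can be enforced in Python before anything reaches the LLM. The sketch below is one possible guard: the threshold of 7 comes from the principle above, while summarize_trend and its half-versus-half comparison are assumptions chosen for illustration, not the project's actual trend statistic.

```python
from statistics import mean

MIN_POINTS = 7  # threshold from the "No hallucinated trends" principle


def summarize_trend(values):
    """Return trend stats, or an explicit marker when the sample is too small."""
    n = len(values)
    if n < MIN_POINTS:
        return {"n": n, "trend": f"insufficient data (n={n})"}
    half = n // 2
    first = mean(values[:half])    # mean of the earliest half
    second = mean(values[-half:])  # mean of the latest half
    return {
        "n": n,
        "first_half_mean": first,
        "second_half_mean": second,
        "delta": second - first,
    }
```

Because the marker string is placed in the data packet itself, the LLM has no number to build a trend claim on.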

Cron setup (LaunchAgent example)

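<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<!-- ~/Library/LaunchAgents/com.yourname.insight-engine-daily.plist -->
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.yourname.insight-engine-daily</string>
  <!-- Run every day at 23:00 local time -->
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key><integer>23</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/python3</string>
    <string>/path/to/insight-engine/scripts/src/engine.py</string>
    <string>--mode</string><string>daily</string>
  </array>
</dict>
</plist>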

Extending to new data sources

Add a collector in scripts/src/collectors/:

Create my_source.py with a fetch_*() function returning a plain dict
Import and call it in build_daily_data_packet() in engine.py
Reference the new key in prompts/daily_analyst.md under "Data sources"
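
The steps above can be sketched as follows. Everything here is hypothetical: the stand-in event data, the fields returned by fetch_my_source, and the collectors-dict signature given to build_daily_data_packet, whose real signature in engine.py is not shown in this README.

```python
def fetch_my_source():
    """Hypothetical collector: returns a plain dict of precomputed stats."""
    # Stand-in data; a real collector would read logs or call an API here.
    events = [{"status": "ok"}, {"status": "ok"}, {"status": "error"}]
    errors = sum(1 for event in events if event["status"] == "error")
    return {
        "n_events": len(events),
        "n_errors": errors,
        "error_rate": round(errors / len(events), 3),
    }


def build_daily_data_packet(collectors):
    """Assumed shape: call each collector and store its dict under a named key."""
    return {name: fetch() for name, fetch in collectors.items()}


packet = build_daily_data_packet({"my_source": fetch_my_source})
```

The key used in the packet ("my_source" here) is the name the prompt file would then reference under "Data sources".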
