insight-engine
Data-driven insights from operational logs: collect → stats → LLM interpretation → Notion.
Architecture
collect (Python stats only)
├── Langfuse OTEL traces/scores/observations
├── OpenClaw/gateway logs
├── Git activity
└── Control plane scores
↓
build_*_data_packet() ← all stats computed in Python before LLM call
↓
call_claude(system_prompt, structured_json) ← LLM interprets, doesn't compute
↓
write_*_reflection() → Notion
See references/architecture.md for full design rationale.
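The four stages above can be sketched as one function that wires them together. This is a minimal sketch, not the code in engine.py: run_pipeline and its parameters are hypothetical names, and the collector, LLM, and Notion steps are passed in as plain callables so the flow can be followed (and run) without any API access.

```python
import json
from typing import Any, Callable, Dict


def run_pipeline(
    collectors: Dict[str, Callable[[], Dict[str, Any]]],
    interpret: Callable[[str, str], str],
    write: Callable[[str], None],
    system_prompt: str,
) -> str:
    """Hypothetical wiring of collect -> packet -> LLM -> write."""
    # Stage 1: every collector returns a plain dict of already-computed stats.
    packet = {name: fetch() for name, fetch in collectors.items()}
    # Stage 2: the packet is serialized before the LLM sees anything.
    structured_json = json.dumps(packet, sort_keys=True)
    # Stage 3: the LLM only interprets the numbers it is given.
    reflection = interpret(system_prompt, structured_json)
    # Stage 4: hand the text to the writer (Notion in the real pipeline).
    write(reflection)
    return reflection
```

Passing stubs for interpret and write gives a local run that touches neither the LLM nor Notion.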
Quick start
# Install deps
pip install anthropic requests pyyaml
# Configure
cp scripts/config/analyst.yaml.example config/analyst.yaml
# Edit config/analyst.yaml — set langfuse URL, notion IDs, model choices
# Dry run (local Ollama, no Notion write)
python3 scripts/src/engine.py --mode daily --dry-run
# Print data packet + prompt to stdout (for agent consumption, no API calls)
python3 scripts/src/engine.py --mode daily --data-only
# Live run
python3 scripts/src/engine.py --mode daily
python3 scripts/src/engine.py --mode weekly
python3 scripts/src/engine.py --mode monthly
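How the flags used above might map to behavior can be sketched with argparse. This is an illustration under stated assumptions, not the actual parser in engine.py: parse_args and plan_run are hypothetical helpers, --mode is assumed to be required, and a live run is assumed to use the Anthropic API.

```python
import argparse
from typing import Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags shown in the quick start (hypothetical parser)."""
    parser = argparse.ArgumentParser(prog="engine.py")
    parser.add_argument("--mode", choices=["daily", "weekly", "monthly"], required=True)
    parser.add_argument("--dry-run", action="store_true",
                        help="use local Ollama and skip the Notion write")
    parser.add_argument("--data-only", action="store_true",
                        help="print data packet and prompt, make no API calls")
    return parser.parse_args(argv)


def plan_run(args: argparse.Namespace) -> Dict[str, object]:
    """Decide which steps run; the returned keys are assumptions for illustration."""
    if args.data_only:
        return {"llm": None, "write_notion": False, "print_packet": True}
    if args.dry_run:
        return {"llm": "ollama", "write_notion": False, "print_packet": False}
    return {"llm": "anthropic", "write_notion": True, "print_packet": False}
```

For example, plan_run(parse_args(["--mode", "daily", "--dry-run"])) selects the Ollama backend and disables the Notion write.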
Required env vars
ANTHROPIC_API_KEY=sk-ant-... # Anthropic API key
NOTION_API_KEY=secret_... # Notion integration token
LANGFUSE_BASE_URL=http://localhost:3100 # Langfuse server URL
LANGFUSE_PUBLIC_KEY=pk-lf-... # Langfuse public key
LANGFUSE_SECRET_KEY=sk-lf-... # Langfuse secret key
NOTION_ROOT_PAGE_ID=<uuid> # Root Notion page for reports
NOTION_DAILY_DB_ID=<uuid> # Notion database for daily entries
Or configure in config/analyst.yaml.
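One way to combine the two sources is to look up each setting in the environment first and fall back to the parsed config file. The sketch below assumes that precedence and assumes the file uses the same key names as the env vars. Neither is confirmed by this README, and resolve_settings is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_SETTINGS = (
    "ANTHROPIC_API_KEY",
    "NOTION_API_KEY",
    "LANGFUSE_BASE_URL",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "NOTION_ROOT_PAGE_ID",
    "NOTION_DAILY_DB_ID",
)


def resolve_settings(
    file_config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return all required settings, preferring env over file (assumed order)."""
    env = os.environ if env is None else env
    settings: Dict[str, str] = {}
    missing = []
    for name in REQUIRED_SETTINGS:
        # Empty strings count as unset so a blank export does not mask the file value.
        value = env.get(name) or file_config.get(name)
        if value:
            settings[name] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return settings
```

Failing fast with the full list of missing names avoids discovering a missing Notion ID only after the LLM call has already been paid for.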
Key design principles
- Stats before LLM: Python computes all numbers. The LLM interprets; it does not aggregate.
- Citation-enforcing prompts: system prompts require every claim to cite a specific number.
- No hallucinated trends: with fewer than 7 data points, report "insufficient data (n=X)".
- Dry-run mode: uses local Ollama (free) to preview output and skips the Notion write.
- Data-only mode: outputs the full data packet and prompts for agent/subagent use.
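The first and third principles can be illustrated together: the statistics are computed in Python, and a series with fewer than 7 points is flagged instead of summarized. This is a minimal sketch; summarize_series and the field names in its result are hypothetical, not the engine's actual packet schema.

```python
import statistics
from typing import Any, Dict, Sequence

MIN_POINTS = 7  # threshold taken from the "No hallucinated trends" rule


def summarize_series(name: str, values: Sequence[float]) -> Dict[str, Any]:
    """Compute stats in Python so the LLM never has to aggregate."""
    n = len(values)
    if n < MIN_POINTS:
        # Too few points for a trend: report the count and nothing else.
        return {"metric": name, "n": n, "status": f"insufficient data (n={n})"}
    return {
        "metric": name,
        "n": n,
        "status": "ok",
        "mean": round(statistics.fmean(values), 3),
        "min": min(values),
        "max": max(values),
    }
```

Because the guard lives in the data packet rather than only in the prompt, the LLM has no numbers to extrapolate from when the series is too short.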
Cron setup (LaunchAgent example)
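<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<!-- ~/Library/LaunchAgents/com.yourname.insight-engine-daily.plist -->
<plist version="1.0">
<dict>
  <!-- launchd requires a Label; this value mirrors the file name -->
  <key>Label</key>
  <string>com.yourname.insight-engine-daily</string>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key><integer>23</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/python3</string>
    <string>/path/to/insight-engine/scripts/src/engine.py</string>
    <string>--mode</string><string>daily</string>
  </array>
</dict>
</plist>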
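If separate weekly and monthly jobs are needed, the same plist could be generated rather than hand-edited. The sketch below uses the standard-library plistlib module. build_launch_agent is a hypothetical helper that is not part of this repo, and the label and script path in the usage note are the placeholders from the example above.

```python
import plistlib


def build_launch_agent(
    label: str,
    script_path: str,
    hour: int = 23,
    minute: int = 0,
    mode: str = "daily",
) -> bytes:
    """Return LaunchAgent XML for one scheduled engine run (illustrative helper)."""
    job = {
        "Label": label,
        "ProgramArguments": ["/usr/bin/python3", script_path, "--mode", mode],
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    # FMT_XML yields the same text format as a hand-written plist.
    return plistlib.dumps(job, fmt=plistlib.FMT_XML)
```

Calling build_launch_agent("com.yourname.insight-engine-daily", "/path/to/insight-engine/scripts/src/engine.py") returns bytes that can be written to the matching file under ~/Library/LaunchAgents/.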
Extending to new data sources
Add a collector in scripts/src/collectors/:
- Create my_source.py with a fetch_*() function returning a plain dict
- Import and call it in build_daily_data_packet() in engine.py
- Reference the new key in prompts/daily_analyst.md under "Data sources"
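The first two steps might look like the following. This is a sketch under assumptions: fetch_my_source, its event fields, and the "my_source" key are hypothetical, and the build_daily_data_packet shown here is a stand-in for the real function in engine.py, which already contains the other sources.

```python
from typing import Any, Dict, List, Optional


def fetch_my_source(raw_events: Optional[List[Dict[str, Any]]] = None) -> Dict[str, Any]:
    """Hypothetical collector: reduce raw events to a plain dict of stats."""
    if raw_events is None:
        # Stand-in data; a real collector would read from its own source here.
        raw_events = [
            {"kind": "deploy", "ok": True},
            {"kind": "deploy", "ok": False},
            {"kind": "backup", "ok": True},
        ]
    return {
        "event_count": len(raw_events),
        "failure_count": sum(1 for event in raw_events if not event["ok"]),
    }


def build_daily_data_packet() -> Dict[str, Any]:
    """Stand-in for the engine's packet builder, showing only the new key."""
    packet: Dict[str, Any] = {}
    # The key name chosen here is what the prompt's "Data sources" section should reference.
    packet["my_source"] = fetch_my_source()
    return packet
```

Returning only built-in types (dict, list, int, str) keeps the packet serializable with json.dumps without custom encoders.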
See also
- references/architecture.md: full design rationale and layer descriptions
- scripts/prompts/daily_analyst.md: system prompt with citation rules
- scripts/config/analyst.yaml.example: config template