Extracting Session Data Skill

Core Responsibility

Provide raw access to Claude Code session logs stored in ~/.claude/projects/{project-dir}/{session-id}.jsonl .

Key Principle: This skill extracts data only - return raw data to calling skills for analysis. Do not analyze or interpret within this skill.

Available Scripts

All scripts located in scripts/ subdirectory relative to this skill.

locate-logs.sh

Find log directory or specific session file path.

Get logs directory for current working directory

scripts/locate-logs.sh

Get logs directory for specific project

scripts/locate-logs.sh /path/to/project

Get specific session log file path

scripts/locate-logs.sh /path/to/project abc123-session-id

Use when: Building dynamic paths, verifying logs exist before processing.

list-sessions.sh

Enumerate all sessions with metadata (ID, size, lines, date, branch).

List all sessions (table format)

scripts/list-sessions.sh

JSON output

scripts/list-sessions.sh --format json

Sort by size or lines

scripts/list-sessions.sh --sort size scripts/list-sessions.sh --sort lines

Specific project

scripts/list-sessions.sh /path/to/project

Output formats: table , json , csv

Sort options: date , size , lines

Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.

extract-data.sh

Parse JSONL logs and extract specific data types.

Available extraction types:

metadata
Session info (ID, timestamps, branch, working dir)
user-prompts
All user messages
tool-usage
Tool call statistics
errors
Failed tool calls with timestamps
thinking
Thinking blocks (if extended thinking enabled)
text-responses
Assistant text responses only
statistics
Session metrics (message counts, tool calls, errors)
all
Combined extraction

Extract from specific session

scripts/extract-data.sh --type statistics --session SESSION_ID scripts/extract-data.sh --type errors --session SESSION_ID scripts/extract-data.sh --type tool-usage --session SESSION_ID

Extract from all sessions (omit --session)

scripts/extract-data.sh --type statistics

Limit output

scripts/extract-data.sh --type user-prompts --limit 10

Different project

scripts/extract-data.sh --type metadata --project /path/to/project

Use when: Need specific data without loading entire log, generating metrics, identifying errors.

filter-sessions.sh

Find sessions matching criteria.

Filter options:

--since DATE
Sessions modified since date ("2 days ago", "2025-10-20")
--until DATE
Sessions modified until date
--branch NAME
Sessions on specific git branch
--min-size SIZE
Minimum file size ("1M", "500K")
--max-size SIZE
Maximum file size
--min-lines N
Minimum line count
--max-lines N
Maximum line count
--has-errors
Only sessions with failed tool calls
--keyword WORD
Sessions containing keyword

Output formats: list , paths , json

Recent sessions

scripts/filter-sessions.sh --since "2 days ago"

Large sessions with errors

scripts/filter-sessions.sh --min-lines 500 --has-errors

Sessions on main branch in last week

scripts/filter-sessions.sh --branch main --since "7 days ago"

Sessions containing keyword

scripts/filter-sessions.sh --keyword "authentication"

Get paths only (for piping)

scripts/filter-sessions.sh --since "1 day ago" --format paths

Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.

Working Process

Single Session Analysis

1. Verify session exists and get metadata

scripts/extract-data.sh --type metadata --session SESSION_ID

2. Get session statistics (to determine size)

scripts/extract-data.sh --type statistics --session SESSION_ID

3. Extract specific data as needed

scripts/extract-data.sh --type errors --session SESSION_ID scripts/extract-data.sh --type tool-usage --session SESSION_ID

Multiple Session Analysis

1. Filter to find relevant sessions

scripts/filter-sessions.sh --since "7 days ago" --branch main

2. Extract data from all filtered sessions

scripts/extract-data.sh --type statistics

3. Or iterate through filtered subset

SESSIONS=$(scripts/filter-sessions.sh --has-errors --format paths) for session in $SESSIONS; do SESSION_ID=$(basename "$session" .jsonl) scripts/extract-data.sh --type errors --session $SESSION_ID done

Integration Pattern for Calling Skills

When another skill (like retrospecting) needs session data:

Discovery: Use list-sessions.sh or filter-sessions.sh to find relevant sessions
Size Check: Use extract-data.sh --type statistics to determine session complexity
Targeted Extraction: Use extract-data.sh with specific types for needed data
Return Raw Data: Return extracted data to caller for analysis

Example:

Get latest session ID

LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')

Check size before processing

STATS=$(scripts/extract-data.sh --type statistics --session $LATEST) LINE_COUNT=$(echo "$STATS" | grep "Total Lines:" | awk '{print $3}')

Extract based on size

if [ "$LINE_COUNT" -lt 500 ]; then # Small session: extract detail scripts/extract-data.sh --type errors --session $LATEST scripts/extract-data.sh --type tool-usage --session $LATEST else # Large session: summary only scripts/extract-data.sh --type statistics --session $LATEST fi

Context Budget Management

CRITICAL: This skill is designed for context efficiency

Use Bash Processing, Not Read Tool

GOOD: Extract via bash, stays in bash context

STATS=$(scripts/extract-data.sh --type statistics)

Process $STATS in bash

BAD: Reading full log files

Read ~/.claude/projects/-path/session.jsonl

Loads entire file into context unnecessarily

Check Session Size Before Loading

Never load full session logs into context without checking size first.

Always check statistics first

scripts/extract-data.sh --type statistics --session SESSION_ID

Shows total lines, message counts, etc.

Decision rules:

- Small (<500 lines): Can extract detail safely

- Medium (500-2000 lines): Use selective extraction

- Large (>2000 lines): Statistics only, offer targeted deep-dives

Return Raw Data to Caller

This skill should:

Execute bash scripts to extract data
Return raw text output to calling skill
Let calling skill manage context for analysis
Avoid interpretation or analysis within this skill

Output Format

Return raw extracted data with minimal formatting:

Statistics output

Session: abc123-def456-ghi789 Total Lines: 450 User Messages: 12 Assistant Messages: 23 Tool Calls: 45 Errors: 2

Tool usage output

=== Tool Usage: abc123-def456-ghi789 === Read 15 Bash 12 Edit 8 Grep 5 Write 3

No analysis, no interpretation - just data extraction.

Error Handling

All scripts exit with non-zero status on errors and output to stderr.

Check exit status before processing:

if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then # Handle: logs directory doesn't exist echo "Project has no session logs yet" fi

if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then # Handle: session doesn't exist echo "Session not found" fi

Common error messages:

Error: Logs directory not found: ~/.claude/projects/-path
Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl
Error: --type is required
Error: jq is required but not installed. Install with: brew install jq

Path Calculation

Claude Code stores sessions using this pattern:

~/.claude/projects/{project-identifier}/{session-id}.jsonl

Where {project-identifier} is calculated by replacing all / with - in the absolute working directory path:

Example: /Users/user/project → -Users-user-project

PROJECT_ID=$(echo "${PWD}" | sed 's///-/g') LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"

All scripts use locate-logs.sh internally for consistent path calculation.

Anti-Patterns to Avoid

Don't:

Load full session logs into context without checking size
Parse JSONL manually - use extract-data.sh
Hardcode log paths - use locate-logs.sh
Analyze or interpret data - return raw data to caller
Process large logs synchronously without user awareness

Do:

Check session size with --type statistics before processing
Use appropriate extraction type for specific needs
Filter sessions before extraction for efficiency
Stream/pipe data when processing multiple sessions
Return raw data for caller to analyze

Success Criteria

Effective use of this skill means:

Efficient Discovery: Quickly find relevant sessions without manual searching
Targeted Extraction: Get exactly the data needed, nothing more
Context Preservation: Avoid loading unnecessary data into context
Raw Data Focus: Return unprocessed data for caller to analyze
Multi-Session Support: Handle analysis across timeframes or branches efficiently

Dependencies

Required:

bash (v4.0+)
jq (JSON parser)

Scripts check for jq and provide installation instructions if missing:

Error: jq is required but not installed. Install with: brew install jq