# Bilibili Video Summary Tool

Extract the full content of a Bilibili video — transcript/subtitles, danmaku, comments, and description — then use your own LLM capabilities to produce a deep summary. No external AI API is required (no OpenAI / Gemini key needed).
## Capabilities

| Data Source | Method | Notes |
|---|---|---|
| CC Subtitles | Bilibili API | Fastest, used if available |
| Audio Transcription | whisper.cpp + Vulkan GPU | Automatic fallback when no subtitles |
| Video Description | yt-dlp | Always captured |
| Danmaku (scrolling comments) | yt-dlp | Parsed, analyzed for frequent content |
| Comments | Bilibili Comment API | Hot-sorted, deduplicated, top liked extracted |
## Workflow

When you receive a Bilibili video link and are asked to summarize it, follow these steps:

### Step 1: Extract all data

```bash
python bili-transcript.py "<video_url>"
```
The script automatically:
- Gets video title, uploader, duration, description
- Attempts Bilibili CC subtitles (fastest, used if available)
- Falls back to GPU transcription: download audio → convert to wav → whisper.cpp with Vulkan
- Downloads and analyzes danmaku (scrolling comments)
- Fetches video comments, sorted by likes
Output files are saved to `./bili-output/`:

- `transcript.txt` — full transcript/subtitle text
- `danmaku.json` — danmaku data with statistics
- `comments.json` — comment data with top-liked comments

The JSON output includes preview text, a danmaku summary, and top comments.
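Once Step 1 finishes, the three artifacts can also be loaded programmatically. A minimal sketch — the file names follow the layout above, but the JSON schemas inside are otherwise assumptions:

```python
import json
from pathlib import Path

def load_outputs(out_dir="./bili-output"):
    """Load the three files written by bili-transcript.py."""
    out = Path(out_dir)
    transcript = (out / "transcript.txt").read_text(encoding="utf-8")
    danmaku = json.loads((out / "danmaku.json").read_text(encoding="utf-8"))
    comments = json.loads((out / "comments.json").read_text(encoding="utf-8"))
    return transcript, danmaku, comments
```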
### Step 2: Read the full transcript

The JSON preview truncates at 2000 characters. Read the full file:

```bash
cat ./bili-output/transcript.txt
```
### Step 3: Read danmaku and comments

Review the community response data:

```bash
cat ./bili-output/danmaku.json
cat ./bili-output/comments.json
```
### Step 4: Compose your summary

Use your own LLM capabilities to produce a comprehensive summary. Suggested structure:
- **Video Overview** — title, uploader, duration, transcription source (subtitle / GPU), and key info from the description (project links, update notes, etc.).
- **Core Content** — what the video is about: a fluent paragraph summary of the main narrative.
- **Key Points** — notable arguments, data points, or information worth highlighting.
- **Community Response** (optional) — reactions from danmaku and comments. Skip if the content is insubstantial (spam, trolling, no valuable discussion).
  - Danmaku analysis: look for frequently repeated phrases (community memes/reactions), informative questions, technical discussions, and points of controversy.
  - Comment analysis: look for top-liked opinions, creator interactions, user-reported issues, and technical insights.
- **Assessment** (optional) — content quality, information density, notable strengths or weaknesses.
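The danmaku analysis above largely comes down to frequency counting. A hedged sketch using only the standard library — the input is assumed to be a list of danmaku text strings pulled out of `danmaku.json`:

```python
from collections import Counter

def top_danmaku_phrases(texts, min_count=3, top_n=10):
    """Return the most repeated danmaku lines; heavy repetition
    usually marks community memes or shared reactions worth noting."""
    counts = Counter(t.strip() for t in texts if t.strip())
    return [(text, n) for text, n in counts.most_common(top_n) if n >= min_count]
```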
## Available Actions

```bash
# Video metadata only
python bili-transcript.py "<URL>" --action info

# CC subtitles only (if available)
python bili-transcript.py "<URL>" --action subtitle

# Force GPU transcription (skip subtitle check)
python bili-transcript.py "<URL>" --action transcribe

# Danmaku only
python bili-transcript.py "<URL>" --action danmaku

# Comments only
python bili-transcript.py "<URL>" --action comments

# Custom output directory
python bili-transcript.py "<URL>" --output ./my-output
```
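If you drive the tool from Python rather than a shell, the invocations above can be assembled like this — a sketch, using only the script path and flags documented here:

```python
import subprocess
import sys

def build_command(url, action=None, output=None, script="bili-transcript.py"):
    """Assemble the argv for one bili-transcript.py invocation."""
    cmd = [sys.executable, script, url]
    if action:
        cmd += ["--action", action]
    if output:
        cmd += ["--output", output]
    return cmd

def run_action(url, **kwargs):
    """Run the script, raising if it exits non-zero."""
    return subprocess.run(build_command(url, **kwargs), check=True)
```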
## Environment Variables

| Variable | Purpose |
|---|---|
| `WHISPER_CPP_DIR` | Path to the whisper.cpp directory (containing `whisper-cli`) |
| `WHISPER_MODEL` | Path to the whisper model file (e.g., `ggml-large-v3-turbo.bin`) |
| `BILI_OUTPUT_DIR` | Default output directory (default: `./bili-output`) |
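A sketch of how a caller might resolve these variables — only `BILI_OUTPUT_DIR` has a documented default; leaving the unset whisper paths as `None` is an assumption, not the script's confirmed behavior:

```python
import os

def resolve_config(env=None):
    """Read the tool's environment variables, applying documented defaults."""
    env = os.environ if env is None else env
    return {
        "whisper_cpp_dir": env.get("WHISPER_CPP_DIR"),
        "whisper_model": env.get("WHISPER_MODEL"),
        "output_dir": env.get("BILI_OUTPUT_DIR", "./bili-output"),
    }
```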
## Performance Reference

| Workload | Total Time | Notes |
|---|---|---|
| 5 minutes | ~15s | GPU transcription is fast |
| 12 minutes | ~22s | Download + convert + transcribe |
| 1 hour | ~2-3 min | Depends on audio density |
| Danmaku/Comments | ~5-10s | Depends on comment volume |
## Dependencies

- Python packages: `yt-dlp`, `av` (PyAV)
- Transcription engine: whisper.cpp with Vulkan support (optional, only needed if no CC subtitles)
- Model: ggml-large-v3-turbo.bin (~1.6GB, download separately)
- GPU: Any Vulkan-compatible GPU (NVIDIA, AMD, Intel) — auto-detected
- No external AI API keys required
## Limitations
- Requires internet access to Bilibili
- Some content requires login (paid courses, restricted videos) — may fail
- Danmaku and comment APIs may be rate-limited
- whisper.cpp does not support m4a; script auto-converts via PyAV
- Very long videos (>2 hours) take significant transcription time; try `--action subtitle` first
- Comments are fetched from the first 3 pages (~60 comments); this may not fully cover very hot videos