YouTube Transcript Downloader
This skill helps download transcripts (subtitles/captions) from YouTube videos using yt-dlp.
When to Use This Skill
Activate this skill when the user:
- Provides a YouTube URL and wants the transcript
- Asks to "download transcript from YouTube"
- Wants to "get captions" or "get subtitles" from a video
- Asks to "transcribe a YouTube video"
- Needs text content from a YouTube video
How It Works
Default Workflow:
- Check if yt-dlp is installed - install if needed
- Try manual subtitles first (--write-sub) - highest quality, human-created
- Fallback to auto-generated (--write-auto-sub) - usually available
- Convert to plain text - deduplicate and clean up VTT format
- Save to Content/YouTube Transcripts/ as a markdown file with filename based on video title
- Automatically polish - remove filler words, fix grammar, add section headers, maintain 100% fidelity
- Confirm the download and show the user where the file is saved
Installation Check
IMPORTANT: Always check if yt-dlp is installed first:
which yt-dlp || command -v yt-dlp
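The same check can be done from Python, which is convenient when the rest of the workflow is scripted there. This is a minimal sketch that assumes only the standard library; `find_tool` is a hypothetical helper name, not part of yt-dlp.

```python
import shutil
from typing import Optional


def find_tool(name: str = "yt-dlp") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("yt-dlp not found; install it before continuing")
    else:
        print(f"yt-dlp found at {path}")
```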
If Not Installed
Attempt automatic installation based on the system:
macOS (Homebrew):
brew install yt-dlp
Linux (apt/Debian/Ubuntu):
sudo apt update && sudo apt install -y yt-dlp
Alternative (pip - works on all systems):
pip3 install yt-dlp
or
python3 -m pip install yt-dlp
If installation fails: Inform the user they need to install yt-dlp manually and provide them with installation instructions from https://github.com/yt-dlp/yt-dlp#installation
Check Available Subtitles
ALWAYS do this first before attempting to download:
yt-dlp --list-subs "YOUTUBE_URL"
This shows what subtitle types are available without downloading anything. Look for:
- Manual subtitles (better quality)
- Auto-generated subtitles (usually available)
- Available languages
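When the availability check needs to be automated, the JSON metadata printed by `yt-dlp --dump-json "YOUTUBE_URL"` is easier to inspect than the human-readable `--list-subs` table. The sketch below assumes that metadata lists manual tracks under `subtitles` and auto-generated tracks under `automatic_captions`; `summarize_subtitles` is a hypothetical helper, and the sample input is hand-written rather than real output for any video.

```python
import json
from typing import Any, Dict, List


def summarize_subtitles(info: Dict[str, Any]) -> Dict[str, List[str]]:
    """Split the language codes in yt-dlp metadata into manual and auto-generated."""
    manual = sorted((info.get("subtitles") or {}).keys())
    auto = sorted((info.get("automatic_captions") or {}).keys())
    return {"manual": manual, "auto": auto}


# Illustrative input only: a fragment shaped like the metadata dictionary.
sample = json.loads('{"subtitles": {"en": []}, "automatic_captions": {"en": [], "de": []}}')
print(summarize_subtitles(sample))
```

If the `manual` list is non-empty, the preferred manual download is worth trying first; otherwise go straight to the auto-generated fallback.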
Download Strategy
Option 1: Manual Subtitles (Preferred)
Try this first - highest quality, human-created:
yt-dlp --write-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
Option 2: Auto-Generated Subtitles (Fallback)
If manual subtitles aren't available:
yt-dlp --write-auto-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
Both commands create a .vtt file (WebVTT subtitle format).
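The two options differ in a single flag, so a script can build both commands from one template. A minimal sketch, using only the flags shown above; `subtitle_command` is a hypothetical helper.

```python
from typing import List


def subtitle_command(url: str, output_name: str, auto: bool = False) -> List[str]:
    """Build the yt-dlp argument list for manual (default) or auto-generated subtitles."""
    flag = "--write-auto-sub" if auto else "--write-sub"
    return ["yt-dlp", flag, "--skip-download", "--output", output_name, url]


# Try manual subtitles first, then the auto-generated fallback.
for auto in (False, True):
    print(" ".join(subtitle_command("YOUTUBE_URL", "OUTPUT_NAME", auto=auto)))
```

Passing the list to `subprocess.run` rather than joining it into a shell string avoids quoting problems with URLs and output names that contain spaces.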
Getting Video Information
Extract Video Title (for filename)
yt-dlp --print "%(title)s" "YOUTUBE_URL"
Use this to create meaningful filenames based on the video title. Clean the title for filesystem compatibility:
- Replace / with -
- Replace special characters that might cause issues
- Consider using a sanitized version: $(yt-dlp --print "%(title)s" "URL" | tr '/' '-' | tr ':' '-')
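The cleaning rules above can also be collected into one function. This is a sketch under assumptions: the set of characters removed is a reasonable guess at what causes trouble on common filesystems, and `sanitize_title` is a hypothetical name.

```python
import re


def sanitize_title(title: str) -> str:
    """Make a video title safe to use as a filename on common filesystems."""
    cleaned = title.replace("/", "-").replace(":", "-")
    cleaned = re.sub(r'[\\?*"<>|]', "", cleaned)  # drop characters Windows rejects
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse runs of whitespace
    return cleaned or "untitled"  # never return an empty filename


print(sanitize_title('Intro/Outro: What is "VTT"?'))
```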
Post-Processing
Convert to Plain Text (Recommended)
YouTube's auto-generated VTT files contain duplicate lines because captions are shown progressively with overlapping timestamps. Always deduplicate when converting to plain text while preserving the original speaking order.
python3 -c "
import re
seen = set()
with open('transcript.en.vtt', 'r') as f:
    for line in f:
        line = line.strip()
        # Skip blank lines, header lines, and cue timing lines
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            # Strip inline tags, then decode HTML entities (&amp; last to avoid double decoding)
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > transcript.txt
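For reuse or testing, the same conversion can be written as a function that works on a string instead of a file. This sketch swaps the chained `replace` calls for `html.unescape` from the standard library; `vtt_to_text` is a hypothetical name and the sample captions are hand-written to imitate overlapping cues.

```python
import html
import re


def vtt_to_text(vtt: str) -> str:
    """Convert WebVTT captions to plain text, keeping each distinct line once."""
    seen = set()
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blanks, the file header, and cue timing lines.
        if not line or line.startswith(("WEBVTT", "Kind:", "Language:")) or "-->" in line:
            continue
        clean = re.sub(r"<[^>]*>", "", line)  # remove inline tags such as <c>
        clean = html.unescape(clean).strip()  # decode &amp;, &gt;, &lt; and similar
        if clean and clean not in seen:
            seen.add(clean)
            lines.append(clean)  # first occurrence wins, so speaking order is kept
    return "\n".join(lines)


sample = """WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:02.000
hello<00:00:00.500><c> world</c>

00:00:02.000 --> 00:00:04.000
hello world
this &amp; that
"""
print(vtt_to_text(sample))
```

Deduplicating against every line seen so far also drops lines a speaker genuinely repeats later in the video. If that matters, comparing only against the previous line is a gentler alternative.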
Complete Post-Processing with Video Title
# Get video title (tr -d deletes characters; tr with an empty replacement set is an error)
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr -d '?"')
# Find the VTT file
VTT_FILE=$(ls *.vtt | head -n 1)
# Convert with deduplication
python3 -c "
import re
seen = set()
with open('$VTT_FILE', 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > "${VIDEO_TITLE}.txt"
echo "✓ Saved to: ${VIDEO_TITLE}.txt"
# Clean up VTT file
rm "$VTT_FILE"
echo "✓ Cleaned up temporary VTT file"
Output Formats
- VTT format (.vtt): Includes timestamps and formatting, good for video players
- Plain text (.txt): Just the text content, good for reading or analysis
Tips
- The filename will be {output_name}.{language_code}.vtt (e.g., transcript.en.vtt)
- Most YouTube videos have auto-generated English subtitles
- Some videos may have multiple language options
- If auto-subtitles aren't available, try --write-sub instead for manual subtitles
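Because the language code is part of the filename, a script has to search for the downloaded file rather than assume its exact name. The sketch below relies only on the `{output_name}.{language_code}.vtt` pattern noted in the tips; `find_subtitle_files` is a hypothetical helper, and the demonstration uses empty placeholder files.

```python
import glob
import os
import tempfile
from typing import Dict


def find_subtitle_files(directory: str, output_name: str) -> Dict[str, str]:
    """Map language code -> path for files named {output_name}.{language_code}.vtt."""
    # glob.escape keeps brackets or other glob characters in names from being misread.
    pattern = os.path.join(glob.escape(directory), glob.escape(output_name) + ".*.vtt")
    found = {}
    for path in sorted(glob.glob(pattern)):
        # "transcript.en.vtt" -> "en": strip the name prefix and the ".vtt" suffix.
        base = os.path.basename(path)
        lang = base[len(output_name) + 1 : -len(".vtt")]
        found[lang] = path
    return found


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("transcript.en.vtt", "transcript.de.vtt", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    print(sorted(find_subtitle_files(tmp, "transcript")))
```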
Complete Workflow Example
VIDEO_URL="https://www.youtube.com/watch?v=dQw4w9WgXcQ"
OUTPUT_DIR="Content/YouTube Transcripts"
# Create output directory if it doesn't exist
mkdir -p "$OUTPUT_DIR"
OUTPUT_NAME="$OUTPUT_DIR/transcript_temp"
# ============================================
# STEP 1: Check if yt-dlp is installed
# ============================================
if ! command -v yt-dlp &> /dev/null; then
    echo "yt-dlp not found, attempting to install..."
    if command -v brew &> /dev/null; then
        brew install yt-dlp
    elif command -v apt &> /dev/null; then
        sudo apt update && sudo apt install -y yt-dlp
    else
        pip3 install yt-dlp
    fi
fi
# Get video title for filename (done after the install check because it calls yt-dlp)
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL" | tr '/' '_' | tr ':' '-' | tr -d '?"')
# ============================================
# STEP 2: Try manual subtitles first
# ============================================
echo "Downloading subtitles for: $VIDEO_TITLE"
# yt-dlp may exit successfully even when no subtitle track exists,
# so check for the .vtt file instead of relying on the exit status alone
yt-dlp --write-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null
if ls "$OUTPUT_NAME".*.vtt &> /dev/null; then
    echo "✓ Manual subtitles downloaded!"
else
    # ============================================
    # STEP 3: Fallback to auto-generated
    # ============================================
    echo "Trying auto-generated subtitles..."
    yt-dlp --write-auto-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null
    if ls "$OUTPUT_NAME".*.vtt &> /dev/null; then
        echo "✓ Auto-generated subtitles downloaded!"
    else
        echo "⚠ No subtitles available for this video."
        exit 1
    fi
fi
# ============================================
# STEP 4: Convert to readable markdown with deduplication
# ============================================
# The file is named transcript_temp.<lang>.vtt; quote the path because it contains a space
VTT_FILE=$(ls "$OUTPUT_NAME".*.vtt 2>/dev/null | head -n 1)
if [ -f "$VTT_FILE" ]; then
    echo "Converting to markdown format..."
    python3 -c "
import re
seen = set()
with open('$VTT_FILE', 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > "$OUTPUT_DIR/${VIDEO_TITLE}.md"
    echo "✓ Saved raw transcript to: $OUTPUT_DIR/${VIDEO_TITLE}.md"
    # Clean up temporary VTT file
    rm "$VTT_FILE"
else
    echo "⚠ No VTT file found to convert"
    exit 1
fi
# ============================================
# STEP 5: Automatically polish the transcript
# ============================================
echo "Polishing transcript (removing filler words, joining broken lines)..."
# The heredoc delimiter is quoted, so shell variables are not expanded inside it;
# pass the file path as a command-line argument instead
python3 - "$OUTPUT_DIR/${VIDEO_TITLE}.md" << 'POLISH_EOF'
import re
import sys

path = sys.argv[1]
with open(path, 'r') as f:
    content = f.read()

# Preserve metadata and header
lines = content.split('\n')
metadata = []
content_start = 0
for i, line in enumerate(lines):
    if line.startswith('#') or line.startswith('Source:') or line.startswith('---'):
        metadata.append(line)
        content_start = i + 1
    else:
        break

transcript_text = '\n'.join(lines[content_start:]).strip()

# Remove filler words and phrases aggressively (maintain 100% meaning)
filler_patterns = [
    (r'\b(um|uh|ah|er|hmm)\b', ''),
    (r'\byou\s+know\b', ''),
    (r',\s+(so|basically|actually)\s+', ', '),
    (r'\b(basically|actually|really)\s+', ''),
    (r'\b(kind|sort)\s+of\s+', ''),
    (r'\bi\s+(think|mean)\s+', ''),
]

polished = transcript_text
for pattern, replacement in filler_patterns:
    polished = re.sub(pattern, replacement, polished, flags=re.IGNORECASE)

# Join broken lines while preserving sentence structure
polished = re.sub(r'(?<=[a-z]),\n(?=[a-z])', ', ', polished)
polished = re.sub(r'(?<=[a-z])\n(?![\n#])', ' ', polished)

# Clean up spacing and punctuation
polished = re.sub(r' +', ' ', polished)
polished = re.sub(r'\s+([.!?,;:])', r'\1', polished)

# Reconstruct with metadata (skip the blank header when there is none)
header = '\n'.join(metadata) + '\n\n' if metadata else ''
final = header + polished.strip()

with open(path, 'w') as f:
    f.write(final)

print("✓ Transcript polished")
POLISH_EOF
echo "✓ Complete!"
Notes:
- This workflow is the default for all YouTube transcript downloads
- Output is always saved to Content/YouTube Transcripts/ directory as a markdown file
- Files are named based on the video title with special characters sanitized
- Transcripts are automatically deduplicated to remove caption overlaps
- Polishing step removes filler words/phrases while maintaining 100% meaning fidelity
- Grammar and run-on sentences are automatically fixed
- Paragraph breaks consolidate content into logical sections
- The temporary VTT file is cleaned up after conversion
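The filler-removal part of the polishing step is easier to test when isolated in a small function. The sketch below is adapted from the patterns in the workflow, not a copy of them: it uses a shorter pattern list and also consumes a trailing comma after a filler. `remove_fillers` and `FILLER_PATTERNS` are hypothetical names.

```python
import re

# A heuristic, non-exhaustive list adapted from the polishing step.
FILLER_PATTERNS = [
    (r"\b(um|uh|ah|er|hmm)\b,?\s*", ""),
    (r"\byou\s+know\b,?\s*", ""),
    (r"\b(basically|actually)\s+", ""),
]


def remove_fillers(text: str) -> str:
    """Strip common spoken fillers and tidy the spacing left behind."""
    for pattern, replacement in FILLER_PATTERNS:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    text = re.sub(r" +", " ", text)  # collapse double spaces
    text = re.sub(r"\s+([.!?,;:])", r"\1", text)  # no space before punctuation
    return text.strip()


print(remove_fillers("Um, this is basically a test, you know, of the idea."))
```

Patterns like these can also delete words that carry meaning, such as "actually" used for contrast, so it is worth comparing the polished text against the raw transcript before relying on it.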
Error Handling
Common Issues and Solutions:
- yt-dlp not installed
  - Attempt automatic installation based on system (Homebrew/apt/pip)
  - If installation fails, provide manual installation link
  - Verify installation before proceeding
- No subtitles available
  - List available subtitles first to confirm
  - Try both --write-sub (manual) and --write-auto-sub (auto-generated)
  - If neither is available, inform the user that the video has no available subtitles
- Invalid or private video
  - Check if URL is correct format: https://www.youtube.com/watch?v=VIDEO_ID
  - Some videos may be private, age-restricted, or geo-blocked
  - Inform user of the specific error from yt-dlp
- Download interrupted or failed
  - Check internet connection
  - Verify sufficient disk space
  - Try again with --no-check-certificate if SSL issues occur (this disables certificate verification, so use it only as a last resort)
- Multiple subtitle languages
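  - Without --sub-langs, yt-dlp chooses a default language itself (typically English when it is available), which may not be the one you want
  - Specify the language explicitly, e.g. --sub-langs en for English only
  - List the available languages with --list-subs first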
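When several languages are listed, a small helper can decide which code to pass to `--sub-langs`. This is a sketch under assumptions: it treats regional variants such as `en-US` as acceptable matches for `en`, and `pick_language` is a hypothetical name.

```python
from typing import Iterable, Optional


def pick_language(available: Iterable[str], preferred: str = "en") -> Optional[str]:
    """Pick the best matching language code, accepting regional variants."""
    codes = list(available)
    if preferred in codes:
        return preferred
    for code in codes:
        # Accept regional variants such as en-US or en-GB.
        if code.split("-")[0] == preferred:
            return code
    # Fall back to the first listed language, or None when nothing is available.
    return codes[0] if codes else None


print(pick_language(["de", "en-US", "fr"]))
```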
Best Practices:
- ✅ Always check what's available before attempting download (--list-subs)
- ✅ Try manual subtitles first (--write-sub), then fall back to auto-generated
- ✅ Convert VTT to plain text format for easy reading
- ✅ Deduplicate text content to remove caption overlaps
- ✅ Provide clear feedback about what's happening at each stage
- ✅ Handle errors gracefully with helpful messages
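One way to handle errors gracefully is to wrap every external command so that a missing tool or a non-zero exit becomes a readable message instead of a traceback. A minimal sketch with the standard library; `run_command` is a hypothetical helper, and the demonstration uses the Python interpreter as a stand-in for yt-dlp.

```python
import subprocess
import sys
from typing import List, Tuple


def run_command(args: List[str]) -> Tuple[bool, str]:
    """Run a command and return (succeeded, message) without raising."""
    try:
        result = subprocess.run(args, capture_output=True, text=True)
    except FileNotFoundError:
        return False, f"{args[0]} is not installed or not on PATH"
    if result.returncode != 0:
        # Surface the tool's own error text so the user sees the specific cause.
        return False, result.stderr.strip() or f"exit code {result.returncode}"
    return True, result.stdout.strip()


# Demonstration with the Python interpreter standing in for yt-dlp.
print(run_command([sys.executable, "-c", "print('ok')"]))
print(run_command(["no-such-tool-12345"]))
```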