tiktok-scraping-yt-dlp

Use for TikTok crawling, content retrieval, and analysis

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "tiktok-scraping-yt-dlp" with this command: npx skills add romneyda/tiktok-crawling

TikTok Scraping with yt-dlp

yt-dlp is a CLI for downloading video/audio from TikTok and many other sites.

Setup

# macOS
brew install yt-dlp ffmpeg

# pip (any platform)
pip install yt-dlp
# Also install ffmpeg separately for merging/post-processing

Download Patterns

Single Video

yt-dlp "https://www.tiktok.com/@handle/video/1234567890"

Entire Profile

yt-dlp "https://www.tiktok.com/@handle" \
  -P "./tiktok/data" \
  -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
  --write-info-json

Creates:

tiktok/data/
  handle/
    20260220-7331234567890/
      video.mp4
      video.info.json
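
For later analysis, the per-video metadata files in that layout can be located by looking two directory levels below the data root. The following is a minimal sketch, assuming the output template shown above; `find_info_files` is a hypothetical helper name, not part of yt-dlp.

```python
from pathlib import Path


def find_info_files(data_root):
    """Return sorted paths to the .info.json files under data_root.

    Assumes the layout <data_root>/<uploader>/<upload_date>-<id>/video.info.json
    produced by the output template above.
    """
    return sorted(Path(data_root).glob("*/*/*.info.json"))
```

Calling `find_info_files("./tiktok/data")` would return one path per downloaded video, which can then be opened with `json.load`.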

Multiple Profiles

for handle in handle1 handle2 handle3; do
  yt-dlp "https://www.tiktok.com/@$handle" \
    -P "./tiktok/data" \
    -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
    --write-info-json \
    --download-archive "./tiktok/downloaded.txt"
done

Search, Hashtags & Sounds

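# Search by keyword (assumes your yt-dlp version provides a tiktoksearch
# extractor; verify with: yt-dlp --list-extractors)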
yt-dlp "tiktoksearch:cooking recipes" --playlist-end 20

# Hashtag page
yt-dlp "https://www.tiktok.com/tag/booktok" --playlist-end 50

# Videos using a specific sound
yt-dlp "https://www.tiktok.com/music/original-sound-1234567890" --playlist-end 30

Format Selection

# List available formats
yt-dlp -F "https://www.tiktok.com/@handle/video/1234567890"

# Download a specific format: "best" picks the best single-file format.
# To choose another (e.g., one without a watermark, if listed), pass a format ID from -F.
yt-dlp -f "best" "https://www.tiktok.com/@handle/video/1234567890"

Filtering

By Date

# On or after a date
--dateafter 20260215

# On or before a date
--datebefore 20260220

# Exact date
--date 20260215

# Date range
--dateafter 20260210 --datebefore 20260220

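# Relative dates (built into yt-dlp, works on any platform)
--dateafter today-7days                          # last 7 days

# Relative dates computed by the shell
--dateafter "$(date -u -v-7d +%Y%m%d)"           # macOS (BSD date): last 7 days
--dateafter "$(date -u -d '7 days ago' +%Y%m%d)" # Linux (GNU date): last 7 days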
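
When yt-dlp is driven from a Python script rather than a shell, the same YYYYMMDD cutoff can be computed without the platform-specific `date` syntax. This is a small sketch; `cutoff_date` and `date_args` are illustrative names, not part of yt-dlp.

```python
from datetime import date, datetime, timedelta, timezone


def cutoff_date(days_back, today=None):
    """Return the YYYYMMDD string for the day `days_back` days before `today`.

    `today` defaults to the current UTC date, matching `date -u` above.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    return (today - timedelta(days=days_back)).strftime("%Y%m%d")


# Flag pair for a "last 7 days" run, ready to append to a yt-dlp argument list.
date_args = ["--dateafter", cutoff_date(7)]
```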

By Metrics & Content

# 100k+ views
--match-filters "view_count >= 100000"

# Duration between 30-60 seconds
--match-filters "duration >= 30 & duration <= 60"

# Title contains "recipe" (case-insensitive)
--match-filters "title ~= (?i)recipe"

# Combine: 50k+ views, uploaded on or after Feb 1, 2026
yt-dlp "https://www.tiktok.com/@handle" \
  --match-filters "view_count >= 50000" \
  --dateafter 20260201
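
The same kinds of filters can also be applied offline to metadata that has already been exported, which avoids re-querying TikTok while experimenting with thresholds. The sketch below is an approximation of the filters above, not yt-dlp's own filter logic; `matches` is a hypothetical helper, and treating a missing field as a non-match is an assumption made here.

```python
import re


def matches(info, min_views=None, min_duration=None, max_duration=None,
            title_pattern=None):
    """Return True if a metadata dict passes every filter that was given.

    A video whose field is missing (absent or None) fails any filter on that field.
    """
    views = info.get("view_count")
    duration = info.get("duration")
    title = info.get("title")

    if min_views is not None and (views is None or views < min_views):
        return False
    if min_duration is not None and (duration is None or duration < min_duration):
        return False
    if max_duration is not None and (duration is None or duration > max_duration):
        return False
    if title_pattern is not None:
        # Case-insensitive search, standing in for the (?i) prefix used above.
        if title is None or not re.search(title_pattern, title, re.IGNORECASE):
            return False
    return True
```

For example, `[v for v in videos if matches(v, min_views=50000)]` keeps only the videos with at least 50k views from a list of metadata dicts.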

Metadata Only (No Download)

Preview What Would Download

yt-dlp "https://www.tiktok.com/@handle" \
  --simulate \
  --print "%(upload_date)s | %(view_count)s views | %(title)s"

Export to JSON

# Single JSON document (one object for the whole profile, videos under "entries")
yt-dlp "https://www.tiktok.com/@handle" --simulate --dump-single-json > handle_videos.json

# JSONL (one object per line, better for large datasets); -j is short for --dump-json
yt-dlp "https://www.tiktok.com/@handle" --simulate -j > handle_videos.jsonl
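
A JSONL export like handle_videos.jsonl can be read back one line at a time. The following is a minimal sketch using only the standard library; `load_jsonl` is a hypothetical helper name.

```python
import json


def load_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of dicts, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


# Usage (file name taken from the command above):
# with open("handle_videos.jsonl", encoding="utf-8") as f:
#     videos = load_jsonl(f)
```

Because the function accepts any iterable of lines, it works on an open file handle without reading the whole export into memory first.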

Export to CSV

yt-dlp "https://www.tiktok.com/@handle" \
  --simulate \
  --print-to-file "%(uploader)s,%(id)s,%(upload_date)s,%(view_count)s,%(like_count)s,%(webpage_url)s" \
  "./tiktok/analysis/metadata.csv"

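The CSV written by the command above has no header row, and missing fields are written as a placeholder (yt-dlp's default is "NA"). A reader therefore has to supply the column names and cope with non-numeric counts. This is a sketch under those assumptions; `COLUMNS`, `NUMERIC`, and `read_metadata_csv` are illustrative names.

```python
import csv

# Column order matches the --print-to-file template above.
COLUMNS = ["uploader", "id", "upload_date", "view_count", "like_count", "webpage_url"]
NUMERIC = {"view_count", "like_count"}


def read_metadata_csv(lines):
    """Parse header-less CSV rows into dicts, converting counts to int.

    Count fields that are not plain digits (for example "NA") become None.
    """
    rows = []
    for values in csv.reader(lines):
        if len(values) != len(COLUMNS):
            continue  # skip blank or malformed rows
        row = dict(zip(COLUMNS, values))
        for key in NUMERIC:
            row[key] = int(row[key]) if row[key].isdigit() else None
        rows.append(row)
    return rows
```

The template does not quote its fields, so a value containing a comma would produce a row with the wrong number of columns; this sketch drops such rows rather than guessing.
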
Analyze with jq

# Top 10 videos by views from downloaded .info.json files
# (files live at tiktok/data/<uploader>/<upload_date>-<id>/video.info.json)
jq -s 'sort_by(.view_count) | reverse | .[:10] | .[] | {title, view_count, url: .webpage_url}' \
  tiktok/data/*/*/*.info.json

# Total views across all videos
jq -s 'map(.view_count) | add' tiktok/data/*/*/*.info.json

# Videos grouped by upload date
jq -s 'group_by(.upload_date) | map({date: .[0].upload_date, count: length})' \
  tiktok/data/*/*/*.info.json

Tip: For deeper analysis and visualization, load JSONL/CSV exports into Python with pandas. Useful for engagement scatter plots, posting frequency charts, or comparing metrics across creators.
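
As a starting point for comparing creators, the sketch below summarizes a list of metadata dicts per uploader using only the standard library; pandas would express the same thing as a group-by. `summarize_by_creator` is a hypothetical helper, and counting a missing view_count as 0 is an assumption made here.

```python
from collections import defaultdict


def summarize_by_creator(videos):
    """Return {uploader: {"videos": n, "total_views": t, "avg_views": a}}.

    Videos without a numeric view_count are counted but contribute 0 views.
    """
    totals = defaultdict(lambda: {"videos": 0, "total_views": 0})
    for video in videos:
        entry = totals[video.get("uploader", "unknown")]
        entry["videos"] += 1
        entry["total_views"] += video.get("view_count") or 0
    return {
        name: {**entry, "avg_views": entry["total_views"] / entry["videos"]}
        for name, entry in totals.items()
    }
```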


Ongoing Scraping

Archive (Skip Already Downloaded)

The --download-archive flag tracks downloaded videos, enabling incremental updates:

yt-dlp "https://www.tiktok.com/@handle" \
  -P "./tiktok/data" \
  -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
  --write-info-json \
  --download-archive "./tiktok/downloaded.txt"

Run the same command later; it skips any video already listed in downloaded.txt.
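
To check what the archive already covers, the file can be read directly. This sketch assumes each line holds an extractor name and a video ID separated by a space (for example `tiktok 7331234567890`), which is how yt-dlp writes it at the time of writing; `load_archive_ids` is a hypothetical helper name.

```python
def load_archive_ids(lines):
    """Return the set of video IDs recorded in a download-archive file.

    Assumes each non-empty line looks like "<extractor> <video_id>".
    """
    ids = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            ids.add(parts[1])
    return ids


# Usage (path taken from the command above):
# with open("./tiktok/downloaded.txt", encoding="utf-8") as f:
#     already_downloaded = load_archive_ids(f)
```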

Authentication (Private/Restricted Content)

# Use cookies from browser (recommended)
yt-dlp --cookies-from-browser chrome "https://www.tiktok.com/@handle"

# Or export cookies to a file first
yt-dlp --cookies tiktok_cookies.txt "https://www.tiktok.com/@handle"

Scheduled Scraping (Cron)

# crontab -e
# Run daily at 2 AM, log output
0 2 * * * cd /path/to/project && ./scripts/scrape-tiktok.sh >> ./tiktok/logs/cron.log 2>&1

Example scripts/scrape-tiktok.sh:

#!/bin/bash
set -e

HANDLES="handle1 handle2 handle3"
DATA_DIR="./tiktok/data"
ARCHIVE="./tiktok/downloaded.txt"

for handle in $HANDLES; do
  echo "[$(date)] Scraping @$handle"
  yt-dlp "https://www.tiktok.com/@$handle" \
    -P "$DATA_DIR" \
    -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
    --write-info-json \
    --download-archive "$ARCHIVE" \
    --cookies-from-browser chrome \
    --dateafter today-7days \
    --sleep-interval 2 \
    --max-sleep-interval 5
done
echo "[$(date)] Done"
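
The same loop can be driven from Python when the results need to feed directly into analysis code. The sketch below builds a subset of the flags used in the script (cookies and the date cutoff are left out for brevity) and reports which handles failed instead of stopping at the first error; `build_command` and `scrape` are hypothetical names, and the injectable `run` parameter exists only so the logic can be exercised without calling yt-dlp.

```python
import subprocess


def build_command(handle, data_dir="./tiktok/data", archive="./tiktok/downloaded.txt"):
    """Return the yt-dlp argument list for one profile."""
    return [
        "yt-dlp",
        f"https://www.tiktok.com/@{handle}",
        "-P", data_dir,
        "-o", "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s",
        "--write-info-json",
        "--download-archive", archive,
        "--sleep-interval", "2",
        "--max-sleep-interval", "5",
    ]


def scrape(handles, run=subprocess.run):
    """Run yt-dlp once per handle; return the handles whose run failed."""
    failed = []
    for handle in handles:
        result = run(build_command(handle))
        if result.returncode != 0:
            failed.append(handle)
    return failed
```

Passing the arguments as a list avoids shell quoting issues with the `%(...)s` template and with unusual handles.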

Troubleshooting

| Problem | Solution |
| --- | --- |
| Empty results / no videos found | Add --cookies-from-browser chrome; TikTok rate-limits anonymous requests |
| 403 Forbidden errors | Rate limited. Wait 10-15 min, or use cookies/different IP |
| "Video unavailable" | Region-locked. Try --geo-bypass or a VPN |
| Watermarked videos | Check -F for alternative formats; some may lack watermark |
| Slow downloads | Add --concurrent-fragments 4 for faster downloads |
| Profile shows fewer videos than expected | TikTok API limits. Use --playlist-end N explicitly, try with cookies |

Debug Mode

# Verbose output to diagnose issues
yt-dlp -v "https://www.tiktok.com/@handle" 2>&1 | tee debug.log

Reference

Key Options

| Option | Description |
| --- | --- |
| -o TEMPLATE | Output filename template |
| -P PATH | Base download directory |
| --dateafter DATE | Videos on/after date (YYYYMMDD) |
| --datebefore DATE | Videos on/before date |
| --playlist-end N | Stop after N videos |
| --match-filters EXPR | Filter by metadata (views, duration, title) |
| --write-info-json | Save metadata JSON per video |
| --download-archive FILE | Track downloads, skip duplicates |
| --simulate / -s | Dry run, no download |
| -j / --dump-json | Output metadata as JSON |
| --cookies-from-browser NAME | Use cookies from browser |
| --sleep-interval SEC | Wait between downloads (avoid rate limits) |

Output Template Variables

| Variable | Example Output |
| --- | --- |
| %(id)s | 7331234567890 |
| %(uploader)s | handle |
| %(upload_date)s | 20260215 |
| %(title).50s | First 50 chars of title |
| %(view_count)s | 1500000 |
| %(like_count)s | 250000 |
| %(ext)s | mp4 |

Full template reference: see the "Output Template" section of the yt-dlp README.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
