apify-scrapers

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "apify-scrapers" with this command: npx skills add casper-studios/casper-marketplace/casper-studios-casper-marketplace-apify-scrapers

Apify Scrapers

Overview

Scrape content from major social platforms using Apify actors. Each platform has optimized settings for cost and quality.

Quick Decision Tree

What do you want to scrape?
│
├── Social Media Posts
│   ├── Twitter/X → references/twitter.md
│   │   └── Script: scripts/scrape_twitter_ai_trends.py
│   │
│   ├── Reddit → references/reddit.md
│   │   └── Script: scripts/scrape_reddit_ai_tech.py
│   │
│   ├── LinkedIn → references/linkedin.md
│   │   └── Script: scripts/scrape_linkedin_posts.py
│   │
│   ├── Instagram → references/instagram.md
│   │   └── Script: scripts/scrape_instagram.py
│   │   └── Modes: profile, posts, hashtag, reels, comments
│   │
│   ├── Facebook → references/facebook.md
│   │   └── Script: scripts/scrape_facebook.py
│   │   └── Modes: page, posts, reviews, groups, marketplace
│   │
│   ├── TikTok → references/multi-platform.md
│   │   └── Script: scripts/scrape_multi_platform.py
│   │
│   └── YouTube → references/multi-platform.md
│       └── Script: scripts/scrape_multi_platform.py
│
├── Business/Places
│   ├── Google Maps businesses → references/google-maps.md
│   │   └── Script: scripts/scrape_google_maps.py
│   │   └── Modes: search, place, reviews
│   │
│   └── Contact info from websites → references/contact-enrichment.md
│       └── Script: scripts/scrape_contact_info.py
│       └── Extract: emails, phone numbers, social profiles
│
├── Auto-detect URL type → references/url-detect.md
│   └── Script: scripts/scrape_content_by_url.py
│
├── Trend Analysis (NEW)
│   └── Enriched trend analysis → workflows/trend-analysis.md
│       └── Script: scripts/analyze_trends.py
│       └── Features: velocity scoring, lifecycle staging, opportunity scoring
│
└── Workflows (multi-step)
    ├── Lead generation → workflows/lead-generation.md
    ├── Influencer discovery → workflows/influencer-discovery.md
    ├── Competitor analysis → workflows/competitor-intel.md
    ├── Trend analysis → workflows/trend-analysis.md
    └── Competitor Ads Intelligence (NEW) → workflows/competitor-ads.md
        └── Script: scripts/scrape_competitor_ads.py
        └── Platforms: Facebook Ads Library, Google Ads Transparency
        └── Features: Spend estimates, creative analysis, benchmarking

Environment Setup

Required in .env

APIFY_TOKEN=apify_api_xxxxx

Get your API token: https://console.apify.com/account/integrations
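As a rough illustration of how a script might pick up the token, the sketch below reads APIFY_TOKEN from the process environment and falls back to a simple KEY=VALUE parse of .env. The helper names parse_env and load_apify_token are hypothetical; the bundled scripts may load the token differently (for example with python-dotenv).

```python
import os


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop stray whitespace and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_apify_token(env_path: str = ".env") -> str:
    """Hypothetical helper: prefer the environment, fall back to the .env file."""
    token = os.environ.get("APIFY_TOKEN", "").strip()
    if not token and os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as fh:
            token = parse_env(fh.read()).get("APIFY_TOKEN", "")
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set")
    return token
```

Failing fast with a clear error when the token is missing tends to be easier to debug than an authentication error surfacing later in an actor call.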

Common Usage Patterns

Scrape Twitter Trends

python scripts/scrape_twitter_ai_trends.py --query "AI agents" --max-tweets 50

Scrape Reddit Discussions

python scripts/scrape_reddit_ai_tech.py --subreddits "MachineLearning,LocalLLaMA" --max-posts 100

Scrape LinkedIn Author

python scripts/scrape_linkedin_posts.py author "https://linkedin.com/in/username" --max-posts 30

Auto-detect and Scrape URL

python scripts/scrape_content_by_url.py "https://x.com/user/status/123456"
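One plausible way URL auto-detection could work is a hostname lookup, sketched below. The PLATFORM_HOSTS table and detect_platform function are assumptions made for illustration; the actual rules used by scripts/scrape_content_by_url.py are described in references/url-detect.md and may differ.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical hostname-to-platform table, not taken from the real script.
PLATFORM_HOSTS = {
    "x.com": "twitter",
    "twitter.com": "twitter",
    "reddit.com": "reddit",
    "linkedin.com": "linkedin",
    "instagram.com": "instagram",
    "facebook.com": "facebook",
    "tiktok.com": "tiktok",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
}


def detect_platform(url: str) -> Optional[str]:
    """Return a platform name for a URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    for domain, platform in PLATFORM_HOSTS.items():
        # Match the bare domain or any subdomain (www., m., old., ...).
        if host == domain or host.endswith("." + domain):
            return platform
    return None
```

Matching on "." + domain rather than a plain suffix keeps hosts such as netflix.com from being mistaken for x.com.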

Scrape Instagram Profile

python scripts/scrape_instagram.py profile "https://instagram.com/username" --max-posts 20

Scrape Instagram Hashtag

python scripts/scrape_instagram.py hashtag "#artificialintelligence" --max-posts 50

Scrape Instagram Reels

python scripts/scrape_instagram.py reels "https://instagram.com/username" --max-reels 30

Scrape Facebook Page

python scripts/scrape_facebook.py page "https://facebook.com/pagename" --max-posts 50

Scrape Facebook Reviews

python scripts/scrape_facebook.py reviews "https://facebook.com/pagename" --max-reviews 100

Scrape Facebook Marketplace

python scripts/scrape_facebook.py marketplace "laptops in san francisco" --max-items 30

Scrape Google Maps Businesses

python scripts/scrape_google_maps.py search "AI consulting firms in New York" --max-results 50

Scrape Google Maps Reviews

python scripts/scrape_google_maps.py reviews "ChIJN1t_tDeuEmsRUsoyG83frY4" --max-reviews 100

Extract Contact Info from Websites

python scripts/scrape_contact_info.py "https://example.com" --depth 2

Bulk Contact Enrichment

python scripts/scrape_contact_info.py --urls-file companies.txt --output contacts.json

Scrape Competitor Ads (Single Competitor)

python scripts/scrape_competitor_ads.py "Nike" --platforms facebook google --country US --days 30

Compare Multiple Competitors' Ads

python scripts/scrape_competitor_ads.py "Nike" "Adidas" "Puma" --compare --output comparison.json

Discover Advertisers by Keyword

python scripts/scrape_competitor_ads.py --search "running shoes" --country US --max-ads 200

Filter Competitor Ads by Media Type

python scripts/scrape_competitor_ads.py "Netflix" "Disney+" --platforms facebook --media-types video --days 7

Analyze Trends (NEW)

Analyze specific topic with enrichments

python scripts/analyze_trends.py "artificial intelligence" --sources google instagram tiktok --days 90

Discover trending topics in category

python scripts/analyze_trends.py --category technology --discover --top 50

Compare multiple trends

python scripts/analyze_trends.py "AI" "blockchain" "metaverse" --compare

Export HTML trend report

python scripts/analyze_trends.py "sustainable fashion" --format html --output trend_report.html

Cost Estimates

| Platform | Actor | Cost per Item |
| --- | --- | --- |
| Twitter | kaitoeasyapi/twitter-x-data-tweet-scraper | ~$0.00025 |
| Reddit | trudax/reddit-scraper | ~$0.001-0.005 |
| LinkedIn | harvestapi/linkedin-post-search | ~$0.01-0.05 |
| YouTube | streamers/youtube-scraper | ~$0.01-0.05 |
| TikTok | clockworks/tiktok-scraper | ~$0.005 |
| Instagram (profile) | apify/instagram-profile-scraper | ~$0.005 |
| Instagram (posts) | apify/instagram-post-scraper | ~$0.002-0.005 |
| Instagram (hashtag) | apify/instagram-hashtag-scraper | ~$0.002-0.005 |
| Instagram (reels) | apify/instagram-reel-scraper | ~$0.005-0.01 |
| Instagram (comments) | apify/instagram-comment-scraper | ~$0.001-0.003 |
| Facebook (page) | apify/facebook-pages-scraper | ~$0.005-0.01 |
| Facebook (posts) | apify/facebook-posts-scraper | ~$0.003-0.005 |
| Facebook (reviews) | apify/facebook-reviews-scraper | ~$0.002-0.005 |
| Facebook (groups) | apify/facebook-groups-scraper | ~$0.005-0.01 |
| Facebook (marketplace) | apify/facebook-marketplace-scraper | ~$0.005-0.01 |
| Google Maps (search) | compass/crawler-google-places | ~$0.01-0.02 |
| Google Maps (place) | compass/google-maps-business-scraper | ~$0.01 |
| Google Maps (reviews) | compass/google-maps-reviews-scraper | ~$0.003-0.005 |
| Contact Enrichment | lukaskrivka/contact-info-scraper | ~$0.01-0.03 |
| Google Trends | apify/google-trends-scraper | ~$0.01 |
| Trend Analysis (multi) | Multiple actors | ~$0.50-1.50/run |
| Facebook Ads Library | apify/facebook-ads-scraper | ~$0.75/1K ads |
| Facebook Ads (alt) | curious_coder/facebook-ads-library-scraper | ~$0.50/1K ads |
| Google Ads Transparency | lexis-solutions/google-ads-scraper | ~$1.00/1K ads |
| Google Ads (alt) | xtech/google-ad-transparency-scraper | ~$0.80/1K ads |
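To budget a run before launching it, the per-item figures above can be multiplied out. The sketch below is only arithmetic on the approximate rates listed in this section; the function names are hypothetical and actual billing depends on the actor's current pricing.

```python
from typing import Optional, Tuple


def estimate_cost(items: int, low: float, high: Optional[float] = None) -> Tuple[float, float]:
    """Return a (min, max) cost estimate in USD for a per-item price range."""
    if high is None:
        high = low
    return items * low, items * high


def estimate_per_thousand(items: int, price_per_1k: float) -> float:
    """Cost in USD for actors priced per 1,000 results (the ads scrapers)."""
    return items / 1000 * price_per_1k


# 50 tweets at ~$0.00025 each
tweets_low, tweets_high = estimate_cost(50, 0.00025)
# 100 Reddit posts at ~$0.001-0.005 each
reddit_low, reddit_high = estimate_cost(100, 0.001, 0.005)
# 200 ads from the Facebook Ads Library at ~$0.75 per 1K ads
ads_cost = estimate_per_thousand(200, 0.75)
```

With these rates, 50 tweets come to roughly $0.0125, 100 Reddit posts to between $0.10 and $0.50, and 200 Facebook ads to about $0.15.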

Output Location

All scraped data saves to .tmp/ with timestamped filenames:

  • .tmp/twitter_ai_trends_YYYYMMDD.json

  • .tmp/reddit_ai_tech_YYYYMMDD.json

  • .tmp/linkedin_posts_YYYYMMDD_HHMMSS.json
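The naming patterns above can be reproduced with strftime. This is a minimal sketch, assuming the prefix and the date-only versus date-and-time choice are the only variables; output_path is a hypothetical helper, not a function from the scripts.

```python
from datetime import datetime
from pathlib import Path


def output_path(prefix: str, now: datetime, with_time: bool = False, root: str = ".tmp") -> Path:
    """Build a path such as .tmp/<prefix>_YYYYMMDD.json or ..._YYYYMMDD_HHMMSS.json."""
    stamp = now.strftime("%Y%m%d_%H%M%S" if with_time else "%Y%m%d")
    return Path(root) / f"{prefix}_{stamp}.json"


# Example with a fixed moment in time so the result is reproducible.
moment = datetime(2025, 1, 31, 9, 5, 7)
daily = output_path("twitter_ai_trends", moment)
precise = output_path("linkedin_posts", moment, with_time=True)
```

Note that a date-only stamp means a second run on the same day would target the same filename, so a caller that needs every run preserved would want the HHMMSS variant.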

Security Notes

Credential Handling

  • Store APIFY_TOKEN in .env file (never commit to git)

  • Rotate API tokens periodically via Apify Console

  • Never log or print API tokens in script output

  • Use environment variables, not hardcoded values
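If a script ever needs to mention the token in a log line (for example to confirm which credential is in use), a masked form is safer than the raw value. The mask_token helper below is a hypothetical sketch of that idea.

```python
def mask_token(token: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret: only the last few characters survive."""
    if len(token) <= visible:
        # Too short to reveal anything safely, so mask everything.
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


# The value below is a made-up placeholder, not a real token.
safe = mask_token("apify_api_abcd1234")
```

Keeping only a short suffix is usually enough to tell two tokens apart in logs without exposing either one.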

Data Privacy

  • Scraped data contains only publicly available content

  • Social media posts may include PII (names, handles, profile info)

  • Data is stored locally in .tmp/ directory

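  • Apify keeps each run's dataset and logs on its platform after the run completes (unnamed storages expire after your plan's data-retention period); delete them in Apify Console if you need them removed sooner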

  • Consider data minimization - only scrape what you need
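Data minimization can also be applied after scraping, by dropping fields before anything is written to disk. The sketch below assumes scraped items are plain dictionaries; the field names in the example are invented for illustration and will differ per actor.

```python
def minimize(items, keep):
    """Return copies of the items containing only the whitelisted fields."""
    return [{key: item[key] for key in keep if key in item} for item in items]


# Hypothetical scraped items; real actor output has different fields.
raw_items = [
    {"text": "New model released", "likes": 12, "author_name": "Ada", "author_email": "ada@example.com"},
    {"text": "Benchmarks look good", "likes": 3},
]
trimmed = minimize(raw_items, keep=["text", "likes"])
```

A whitelist is generally safer than a blacklist here, since new PII fields added by an actor update are excluded by default.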

Access Scopes

  • The default Apify API token has full account access; where possible, create a scoped token with limited permissions in Apify Console

  • Use separate Apify accounts for different projects if needed

  • Monitor usage via Apify Console dashboard

Compliance Considerations

  • Terms of Service: Respect each platform's ToS (Twitter, Reddit, LinkedIn)

  • Rate Limiting: Actors have built-in rate limiting to avoid bans

  • Robots.txt: Some actors may bypass robots.txt - use responsibly

  • GDPR: Scraped PII may be subject to GDPR if it relates to EU residents

  • Ethical Use: Only scrape public data; never bypass authentication

  • Proxy Ethics: Residential proxies should be used ethically

Troubleshooting

Common Issues

Issue: Actor run failed

Symptoms: Script terminates with "Actor run failed" or timeout error

Cause: Invalid actor ID, insufficient proxy credits, or actor configuration issue

Solution:

  • Verify the actor ID is correct in the script

  • Check Apify Console for actor run logs

  • Ensure proxy settings match actor requirements

  • Try running with default proxy settings first

Issue: Empty results returned

Symptoms: Script completes but returns 0 items

Cause: Content blocked by platform, invalid query, or proxy being detected

Solution:

  • Try a different proxy type (residential vs datacenter)

  • Simplify the search query

  • Reduce the number of results requested

  • Check if the platform is blocking scraping attempts

Issue: Rate limited by platform

Symptoms: Script fails with 429 errors or "rate limited" messages

Cause: Too many requests in a short time period

Solution:

  • Add delays between requests (actor settings)

  • Reduce concurrent requests

  • Use proxy rotation

  • Wait and retry after a cooldown period
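The "wait and retry after a cooldown" step is commonly implemented as exponential backoff. The following is a minimal sketch under stated assumptions: RateLimited is a hypothetical exception standing in for whatever error a scraping call raises on a 429, and retry_with_backoff is not part of the bundled scripts.

```python
import time


class RateLimited(Exception):
    """Hypothetical error representing a 429 / 'rate limited' response."""


def retry_with_backoff(call, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 2, 4, and 8 seconds. Passing sleep as a parameter keeps the helper testable without real delays.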

Issue: Invalid API token

Symptoms: Authentication error or "invalid token" message

Cause: Token expired, revoked, or incorrectly set

Solution:

  • Regenerate API token in Apify Console

  • Verify token is correctly set in .env file

  • Check for leading/trailing whitespace in token

  • Ensure APIFY_TOKEN environment variable is loaded
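The local checks in this list (token loaded, no stray whitespace) can be run before contacting Apify at all. The diagnose_token function below is a hypothetical sketch; it inspects only the string and cannot tell whether a token has expired or been revoked.

```python
import os
from typing import List, Optional


def diagnose_token(raw: Optional[str]) -> List[str]:
    """Return human-readable problems found in a raw APIFY_TOKEN value."""
    if raw is None or not raw.strip():
        return ["APIFY_TOKEN is not set or is empty"]
    problems = []
    stripped = raw.strip()
    if raw != stripped:
        problems.append("leading or trailing whitespace")
    if stripped[0] in "'\"" or stripped[-1] in "'\"":
        problems.append("surrounding quotes were kept as part of the value")
    return problems


# Check whatever the current process actually sees.
current_problems = diagnose_token(os.environ.get("APIFY_TOKEN"))
```

An empty list means the value looks well-formed locally; an authentication error after that points to the token itself, which is when regenerating it in Apify Console makes sense.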

Issue: Proxy connection errors

Symptoms: Connection timeout or proxy errors

Cause: Proxy pool exhausted or geo-restriction issues

Solution:

  • Switch proxy type (basic, residential, or datacenter)

  • Verify proxy credit balance in Apify Console

  • Try a different proxy country/region

  • Disable proxy to test if that's the root cause

Resources

Platform References

  • references/twitter.md - Twitter/X scraping details

  • references/reddit.md - Reddit scraping with subreddit targeting

  • references/linkedin.md - LinkedIn post scraping (author or search mode)

  • references/instagram.md - Instagram profile, posts, hashtag, reels, and comments scraping

  • references/facebook.md - Facebook page, posts, reviews, groups, and marketplace scraping

  • references/multi-platform.md - TikTok and YouTube scraping

  • references/url-detect.md - Auto-detect URL type and scrape

Business/Places References

  • references/google-maps.md - Google Maps business search, place details, and reviews

  • references/contact-enrichment.md - Extract emails, phone numbers, and social profiles from websites

Workflow References

  • workflows/lead-generation.md - Multi-step lead generation workflow

  • workflows/influencer-discovery.md - Find and analyze influencers across platforms

  • workflows/competitor-intel.md - Competitive intelligence gathering workflow

  • workflows/trend-analysis.md - Enriched multi-platform trend analysis with scoring

Integration Patterns

Scrape and Enrich

Skills: apify-scrapers → parallel-research

Use case: Scrape social media posts, then enrich with deep research

Flow:

  • Scrape Twitter/Reddit for mentions of a topic

  • Extract company names or URLs from posts

  • Use parallel-research to get detailed info on each company
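The extraction step in this flow can be approximated with a regular expression over the post text. The sketch below assumes each scraped post is a dictionary with a "text" field, which is an illustrative assumption; real actor output may use other field names, and extract_domains is a hypothetical helper.

```python
import re
from urllib.parse import urlparse

# Deliberately loose pattern: stop at whitespace, angle brackets, or brackets.
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+")


def extract_domains(posts, text_field="text"):
    """Return unique domains mentioned in the posts, in first-seen order."""
    seen = []
    for post in posts:
        for url in URL_RE.findall(post.get(text_field, "")):
            # Trailing punctuation usually belongs to the sentence, not the URL.
            host = (urlparse(url.rstrip(".,;:!?'\"")).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if host and host not in seen:
                seen.append(host)
    return seen


sample_posts = [
    {"text": "Check https://www.example.com/pricing, it's great"},
    {"text": "See http://example.com and https://docs.example.org/a."},
    {"title": "a post without a text field"},
]
domains = extract_domains(sample_posts)
```

The resulting domain list is the kind of input that could then be handed to parallel-research, one company per entry.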

Scrape and Summarize

Skills: apify-scrapers → content-generation

Use case: Create newsletter content from social media trends

Flow:

  • Scrape trending AI posts from Twitter

  • Pass scraped data to content-generation summarize

  • Generate a formatted newsletter section

Scrape and Archive

Skills: apify-scrapers → google-workspace

Use case: Save scraped data to Google Drive for team access

Flow:

  • Scrape LinkedIn posts from target accounts

  • Format data as CSV or JSON

  • Upload to Google Drive client folder via google-workspace
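For the formatting step, the standard csv module is enough to turn scraped JSON items into a CSV string. This is a minimal sketch; items_to_csv and the column names in the example are assumptions, not part of the skill.

```python
import csv
import io


def items_to_csv(items, columns):
    """Serialize a list of dicts to CSV text, keeping only the given columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells instead of raising.
        writer.writerow({col: item.get(col, "") for col in columns})
    return buf.getvalue()


# Hypothetical LinkedIn-style items; real field names depend on the actor.
posts = [
    {"author": "Ada", "text": "Hello, world", "likes": 3},
    {"author": "Grace", "likes": 5},
]
csv_text = items_to_csv(posts, ["author", "text"])
```

Choosing the columns explicitly keeps the shared file stable even when the actor adds or renames fields.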

Trend Analysis + Content Strategy

Skills: apify-scrapers (trend-analysis) → content-generation

Use case: Identify trending topics and create content strategy

Flow:

  • Run trend analysis: python scripts/analyze_trends.py "AI productivity" --sources all

  • Review lifecycle stage and opportunity score

  • Use content-generation to create content for high-opportunity trends

  • Focus on emerging trends with high velocity scores

Competitive Trend Monitoring

Skills: apify-scrapers (trend-analysis) → parallel-research

Use case: Monitor competitor visibility in trending topics

Flow:

  • Analyze industry trends: python scripts/analyze_trends.py --category "your-industry" --discover

  • Compare your brand vs competitors in those trends

  • Use parallel-research for deep dive on gaps

  • Generate competitive intelligence report
