local-ocr

Local OCR Pipeline Skill

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "local-ocr" with this command: npx skills add baphomet480/claude-skills/baphomet480-claude-skills-local-ocr

Local OCR Pipeline Skill

Robust Optical Character Recognition (OCR) pipeline driven by ocrmypdf and tesseract . Handles scanned PDFs, rotated image inputs, and raw text extraction securely and locally without external APIs.

Why not GPU via PyTorch/EasyOCR? The ocrmypdf tool is the industry standard for producing searchable PDFs. It leverages tesseract for pixel-accurate text placement. A pure-CPU pipeline is leaner (avoids a 1.5GB PyTorch payload) and reliably embeds text exactly where it appears in the scanned image.

Capabilities

  • Searchable PDF Generation: Converts rasterized/scanned PDFs or raw images (.jpg , .png , etc.) into PDFs with a selectable, searchable text layer.

  • Auto-Rotation & Deskew: Automatically detects incorrectly rotated text and straightens crooked scans.

  • Idempotent In-Place Processing: Safely processes files in-place using --skip-text , preventing double-processing of a PDF that already has embedded text.

  • Structured JSON Output: All commands output structured JSON, making failure states (like missing dependencies) parseable by agents.

  • Raw Text Extraction: Raw string extraction fallback for when agents need text directly in-memory instead of a PDF file.

Setup

Installs system dependencies (tesseract, ocrmypdf, ghostscript) and sets up isolated venv

bash skills/ocr/scripts/setup.sh

Usage

uv run --project ~/.local-ocr scripts/ocr.py <command>

  1. Generate a Searchable PDF (pdf )

Produces a standard, layered PDF. If you give it an image, it wraps it in a PDF. If you give it a scanned PDF, it adds the invisible text layer.

Overwrites the file in-place, skipping it safely if it already contains text

uv run --project ~/.local-ocr scripts/ocr.py pdf ./scanned_invoice.pdf

Output to a different file

uv run --project ~/.local-ocr scripts/ocr.py pdf ./scan_001.png -o ./contract.pdf

Force reprocessing (ignore existing text layer)

uv run --project ~/.local-ocr scripts/ocr.py pdf ./scanned_invoice.pdf --force

Note: By default, auto-rotate and deskew are enabled. Disable with --no-rotate or --no-deskew .

  1. Batch Process a Directory (batch )

Recursively scans a directory for images and PDFs, applying OCR.

Process all files. Skips already-OCRed PDFs.

uv run --project ~/.local-ocr scripts/ocr.py batch ./archives/

  1. Extract Raw Text (text )

Does not create a PDF. Just reads the words off the page and returns them as a JSON string. Good for agents reading documents on the fly.

uv run --project ~/.local-ocr scripts/ocr.py text ./han_solo_invoice.png

Franchise Examples (Star Wars)

  • Process the Death Star blueprints: uv run --project ~/.local-ocr scripts/ocr.py pdf ./ds-1_schematics.pdf

  • Extract raw orders: uv run --project ~/.local-ocr scripts/ocr.py text ./order_66_memo.jpg

  • Archive run: uv run --project ~/.local-ocr scripts/ocr.py batch /archives/jedi_temple

Troubleshooting

  • File already contains text: This is the most common "error", but it isn't an error. ocrmypdf returns exit code 6 when it skips a file that already has text. The wrapper script catches this and reports a JSON "status": "success" with a message noting the side-step.

  • Dependencies Missing: Run the setup.sh script again if the agent complains about missing tesseract or Python modules.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

kitchen-sink-design-system

No summary provided by upstream source.

Repository SourceNeeds Review
General

design-lookup

No summary provided by upstream source.

Repository SourceNeeds Review
General

nextjs-tinacms

No summary provided by upstream source.

Repository SourceNeeds Review
General

cloudflare-pages

No summary provided by upstream source.

Repository SourceNeeds Review