ocr

OCR Image Text Extraction Skill

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "ocr" with this command: npx skills add trpc-group/trpc-agent-go/trpc-group-trpc-agent-go-ocr

Extract text from images using the Tesseract OCR engine.

Capabilities

  • Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP)

  • Support for 100+ languages

  • Optional image preprocessing for better accuracy

  • Output in plain text or JSON format with confidence scores

Usage

Basic OCR

python3 scripts/ocr.py <image_file> <output_file>

With Options

Specify language (default: eng)

python3 scripts/ocr.py image.png text.txt --lang eng

Chinese text

python3 scripts/ocr.py image.png text.txt --lang chi_sim

Multiple languages

python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim

With image preprocessing (improves accuracy)

python3 scripts/ocr.py image.png text.txt --preprocess
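
The source names grayscale conversion and thresholding as the preprocessing steps but does not show how scripts/ocr.py implements them. As a minimal sketch, assuming Pillow and a fixed global threshold (the value 128 and both helper names are illustrative, not taken from the script), the two steps could look like this:

```python
def threshold_table(threshold=128):
    """Build a 256-entry lookup table: values above `threshold` map to 255, the rest to 0."""
    return [255 if value > threshold else 0 for value in range(256)]


def preprocess(image_path, threshold=128):
    """Load an image, convert it to grayscale, and binarize it (hypothetical helper)."""
    # Imported lazily so threshold_table() works even without Pillow installed.
    from PIL import Image

    with Image.open(image_path) as img:
        gray = img.convert("L")  # 8-bit grayscale
        # Image.point accepts a 256-entry lookup table for single-band images.
        return gray.point(threshold_table(threshold))
```

A fixed threshold suits clean scans; unevenly lit photos usually need an adaptive method, which this sketch does not attempt.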

JSON output with confidence scores

python3 scripts/ocr.py image.png output.json --format json
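
The JSON schema the script writes is not shown in the source. One plausible way to obtain per-word confidence is pytesseract's `image_to_data` with `Output.DICT`, which returns parallel lists keyed by `text`, `conf`, and others. The sketch below assumes that approach; the output field names (`text`, `mean_confidence`, `words`) are hypothetical.

```python
import json


def words_with_confidence(data):
    """Turn pytesseract's parallel-list dict into a list of word records.

    Entries with blank text or a negative confidence (layout rows) are skipped.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # conf may be str, int, or float depending on the version
        if text.strip() and conf >= 0:
            words.append({"text": text, "confidence": conf})
    return words


def to_json(data):
    """Serialize word records plus a mean confidence (illustrative schema)."""
    words = words_with_confidence(data)
    mean = sum(w["confidence"] for w in words) / len(words) if words else 0.0
    result = {
        "text": " ".join(w["text"] for w in words),
        "mean_confidence": mean,
        "words": words,
    }
    return json.dumps(result, ensure_ascii=False, indent=2)


def ocr_to_json(image_path, lang="eng"):
    """Run Tesseract on an image and return a JSON string (needs pytesseract and Pillow)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        data = pytesseract.image_to_data(
            img, lang=lang, output_type=pytesseract.Output.DICT
        )
    return to_json(data)
```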

Download and OCR from URL

OCR from remote image

python3 scripts/ocr_url.py <image_url> <output_file>

With options

python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess
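
How scripts/ocr_url.py fetches the image is not described in the source. A minimal sketch of the download step, using only the standard library, might look like the following. The helper names, the 30-second timeout, and the restriction to http/https are assumptions made for illustration.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse

ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def suffix_from_url(url):
    """Return the image extension found in the URL path, or '.img' if none is recognized."""
    suffix = os.path.splitext(urlparse(url).path)[1].lower()
    return suffix if suffix in ALLOWED_SUFFIXES else ".img"


def download_image(url, timeout=30):
    """Download `url` to a temporary file and return its path; the caller deletes it."""
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {url!r}")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = response.read()
    with tempfile.NamedTemporaryFile(suffix=suffix_from_url(url), delete=False) as tmp:
        tmp.write(payload)
        return tmp.name
```

The returned path can then be passed to scripts/ocr.py as the image_file argument.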

Parameters

  • image_file / image_url (required): Path to a local image (scripts/ocr.py) or an image URL (scripts/ocr_url.py)

  • output_file (required): Path to output text/JSON file

  • --lang : Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng

  • --preprocess : Apply image preprocessing (grayscale, thresholding) for better accuracy

  • --format : Output format (text/json, default: text)
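
The parameters above map naturally onto an `argparse` definition. The script's real argument handling is not shown in the source, so the following is a sketch of an equivalent interface rather than the actual code; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Command-line interface mirroring the documented parameters (a sketch)."""
    parser = argparse.ArgumentParser(
        description="Extract text from an image with Tesseract OCR."
    )
    parser.add_argument("image_file", help="path to the local image")
    parser.add_argument("output_file", help="path to the output text/JSON file")
    parser.add_argument(
        "--lang", default="eng", help="Tesseract language code(s), e.g. eng+chi_sim"
    )
    parser.add_argument(
        "--preprocess", action="store_true", help="apply grayscale and thresholding"
    )
    parser.add_argument(
        "--format", choices=["text", "json"], default="text", help="output format"
    )
    return parser
```

For example, `build_parser().parse_args(["image.png", "output.json", "--format", "json"])` yields a namespace with `lang` set to its default `eng` and `preprocess` set to `False`.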

Common Languages

Language                Code
English                 eng
Chinese (Simplified)    chi_sim
Chinese (Traditional)   chi_tra
Japanese                jpn
Korean                  kor
French                  fra
German                  deu
Spanish                 spa
Russian                 rus
Arabic                  ara
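
Each code only works if the matching Tesseract language data is installed on the system. As a hedged sketch, a combined value such as eng+chi_sim can be checked against the list reported by `pytesseract.get_languages()` before running OCR. The helper names here are illustrative and not part of the skill's scripts.

```python
def missing_languages(lang, installed):
    """Return the codes in a '+'-joined --lang value that are not in `installed`."""
    requested = [code for code in lang.split("+") if code]
    return [code for code in requested if code not in installed]


def installed_languages():
    """Ask Tesseract which language packs are available (needs pytesseract)."""
    import pytesseract

    return pytesseract.get_languages(config="")
```

Calling `missing_languages("eng+chi_sim", installed_languages())` returns an empty list when both packs are present.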

Supported Image Formats

PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP

Dependencies

  • Python 3.8+

  • pytesseract

  • Pillow (PIL)

  • tesseract-ocr (system package)

Installation

Python packages

pip install pytesseract Pillow

Tesseract OCR engine

sudo apt-get install tesseract-ocr   # Ubuntu/Debian
sudo yum install tesseract           # CentOS/RHEL
brew install tesseract               # macOS
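
After installation, a quick check can confirm that both the Python packages and the `tesseract` executable are discoverable. This is a minimal sketch using only the standard library; `missing_dependencies` is a hypothetical helper, and `PIL` is the import name of the Pillow package.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("pytesseract", "PIL"), binaries=("tesseract",)):
    """Return names of Python modules and executables that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [name for name in binaries if shutil.which(name) is None]
    return missing
```

An empty list from `missing_dependencies()` means the listed dependencies were found; it does not verify that any particular language pack is installed.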

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
