claw-text-and-pics

Extract text and images from documents via Mistral OCR

Give your OpenClaw agent the ability to read scanned documents, PDFs, and images — extracting clean Markdown text and cropping out embedded images. Powered by Mistral's OCR API.

When to use

Extract text from scanned documents, invoices, receipts, contracts
Pull embedded images from PDFs or scans
Convert handwritten notes or photos to searchable text
Send extracted images directly to Telegram

Usage

# Extract text only
python3 ocr.py --input scan.jpg

# Extract text from PDF (3 pages)
python3 ocr.py --input document.pdf --pages 3

# Extract embedded images
python3 ocr.py --input scan.jpg --extract-images --output-dir ./images/

# Extract images and send to Telegram
python3 ocr.py --input scan.jpg --extract-images --send --target 123456789

# Works with URLs too
python3 ocr.py --input https://example.com/document.pdf
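Since `--input` accepts both local paths and URLs, the script has to tell the two apart before calling the OCR API. The sketch below shows one way that could be done. The helper names `is_url` and `build_document_payload` are hypothetical, and the `document_url` / `image_url` dictionary shapes reflect Mistral's published OCR request format as best understood here; check them against the current API reference. This is not necessarily how `ocr.py` is implemented.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse


def is_url(ref: str) -> bool:
    """Treat only http(s) references as remote documents."""
    return urlparse(ref).scheme in ("http", "https")


def build_document_payload(ref: str) -> dict:
    """Build the `document` part of an OCR request (assumed shape)."""
    if is_url(ref):
        url = ref
        is_pdf = urlparse(ref).path.lower().endswith(".pdf")
    else:
        # Local files are inlined as a base64 data URI.
        path = Path(ref)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        data = base64.b64encode(path.read_bytes()).decode("ascii")
        url = f"data:{mime};base64,{data}"
        is_pdf = mime == "application/pdf"

    if is_pdf:
        return {"type": "document_url", "document_url": url}
    return {"type": "image_url", "image_url": url}
```

Guessing the document type from the file extension is a simplification; a URL without an extension would be treated as an image here.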

Output

stdout: Extracted text as Markdown
Files: Cropped images saved to --output-dir (only with --extract-images)
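Because the Markdown goes to stdout and the images go to disk, a caller can collect both with a thin wrapper. The following is a minimal sketch, assuming `ocr.py` is in the current working directory and exits with a non-zero status on failure; `build_command` and `run_ocr` are hypothetical helper names, not part of the skill.

```python
import subprocess
import sys
from pathlib import Path


def build_command(input_ref, extract_images=False,
                  output_dir="./extracted-images", script="ocr.py"):
    """Assemble the argument list using the flags documented in this README."""
    cmd = [sys.executable, script, "--input", input_ref]
    if extract_images:
        cmd += ["--extract-images", "--output-dir", output_dir]
    return cmd


def run_ocr(input_ref, extract_images=False, output_dir="./extracted-images"):
    """Return (markdown_text, list_of_image_paths)."""
    cmd = build_command(input_ref, extract_images, output_dir)
    # check=True raises CalledProcessError if the script fails (assumption).
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    images = sorted(Path(output_dir).glob("*")) if extract_images else []
    return result.stdout, images
```

Calling `run_ocr("scan.jpg", extract_images=True)` would then return the extracted Markdown together with whatever files ended up in the output directory.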

Configuration

Set in ~/.openclaw/.env or as environment variables:

Variable	Required	Description
`MISTRAL_API_KEY`	Yes	Your Mistral API key
`TELEGRAM_BOT_TOKEN`	Only for `--send`	Your Telegram bot token
`TELEGRAM_CHAT_ID`	Optional	Default chat ID (overridable with `--target`)

Environment Variables

MISTRAL_API_KEY=required        # Mistral API key — get one at console.mistral.ai
TELEGRAM_BOT_TOKEN=optional     # Required only when using --send
TELEGRAM_CHAT_ID=optional       # Default target chat ID (overridable with --target)

This skill reads ~/.openclaw/.env as a fallback for credentials. Ensure the file has restricted permissions: chmod 600 ~/.openclaw/.env
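The fallback described above can be pictured as: use the process environment first, and consult `~/.openclaw/.env` only if the variable is missing. The sketch below illustrates that lookup order along with a permission check matching the `chmod 600` advice. The function names are hypothetical and the `.env` parsing is deliberately simple; the skill's actual loader may differ.

```python
import os
import stat
from pathlib import Path

DEFAULT_ENV_FILE = Path.home() / ".openclaw" / ".env"


def parse_env_text(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0]  # drop a trailing inline comment
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_credential(name, environ=None, env_file=DEFAULT_ENV_FILE):
    """Prefer the environment; fall back to the .env file if present."""
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return environ[name]
    path = Path(env_file)
    if not path.is_file():
        return None
    return parse_env_text(path.read_text(encoding="utf-8")).get(name)


def is_private(path) -> bool:
    """True if group and others have no access (e.g. mode 600)."""
    mode = stat.S_IMODE(Path(path).stat().st_mode)
    return mode & 0o077 == 0
```

A caller might warn the user when `is_private(DEFAULT_ENV_FILE)` returns `False` before reading any keys from the file.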

Requirements

Python 3.11+
Mistral API key (console.mistral.ai)
Optional (only for --extract-images): pip install pillow

Parameters

Parameter	Required	Description
`--input`	Yes	Local path or URL to image/PDF
`--extract-images`	No	Crop and save embedded images
`--output-dir`	No	Output directory (default: `./extracted-images`)
`--send`	No	Send extracted images via Telegram
`--target`	No	Telegram chat ID (or `TELEGRAM_CHAT_ID` env var)
`--pages`	No	Number of PDF pages to process
`--debug`	No	Print raw API response
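For reference, the parameter table maps naturally onto an `argparse` definition. This is a sketch reconstructed from the table alone, not the skill's source: the `--pages` type and the `resolve_target` helper are assumptions.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="ocr.py",
        description="Extract text and images from documents via Mistral OCR",
    )
    parser.add_argument("--input", required=True,
                        help="Local path or URL to image/PDF")
    parser.add_argument("--extract-images", action="store_true",
                        help="Crop and save embedded images")
    parser.add_argument("--output-dir", default="./extracted-images",
                        help="Output directory")
    parser.add_argument("--send", action="store_true",
                        help="Send extracted images via Telegram")
    parser.add_argument("--target", default=None,
                        help="Telegram chat ID")
    # Assumed to be an integer count; the README only says "number of pages".
    parser.add_argument("--pages", type=int, default=None,
                        help="Number of PDF pages to process")
    parser.add_argument("--debug", action="store_true",
                        help="Print raw API response")
    return parser


def resolve_target(args, environ=None):
    """--target wins; otherwise fall back to TELEGRAM_CHAT_ID."""
    environ = os.environ if environ is None else environ
    return args.target or environ.get("TELEGRAM_CHAT_ID")
```

For example, `build_parser().parse_args(["--input", "scan.jpg", "--extract-images"])` leaves `output_dir` at its documented default of `./extracted-images`.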
