doc2x-ocr-markdown

Convert PDF or image files to Markdown with Doc2X OCR and extract embedded images to local files. Use when tasks mention Doc2X, OCR, PDF/image-to-Markdown conversion, formula-aware document parsing, or when only DOC2X_APIKEY is provided and a local conversion wrapper is needed.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "doc2x-ocr-markdown" with this command: npx skills add jysd-ai/skills/jysd-ai-skills-doc2x-ocr-markdown

Doc2X OCR Markdown

Overview

Convert a single PDF or image into Markdown and extract image assets with one local script:

  • scripts/doc2x_ocr.py

Only one credential is required:

  • DOC2X_APIKEY

Quick Start

Set API key:

export DOC2X_APIKEY='sk-...'

Run PDF OCR to Markdown + images:

python scripts/doc2x_ocr.py pdf ./input.pdf --outdir ./output

Run image OCR to Markdown + images:

python scripts/doc2x_ocr.py image ./page.png --outdir ./output

Workflow

  1. Validate DOC2X_APIKEY.
  2. Choose conversion mode from input file type.
  3. Run scripts/doc2x_ocr.py.
  4. Return output folder and generated Markdown path.
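The workflow above can be sketched as a small wrapper. This is a minimal sketch, not part of the skill: the helper names (choose_mode, build_command, run) and the list of image extensions are assumptions for illustration; only the script path, the pdf and image subcommands, --outdir, and DOC2X_APIKEY come from this document.

```python
import os
import subprocess
import sys
from pathlib import Path

# Assumed extension list; the skill only names "pdf" and "image" as modes.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def choose_mode(path: str) -> str:
    """Pick the script subcommand from the input file's extension."""
    ext = Path(path).suffix.lower()
    if ext == ".pdf":
        return "pdf"
    if ext in IMAGE_EXTS:
        return "image"
    raise ValueError(f"unsupported input type: {ext or '(none)'}")


def build_command(path: str, outdir: str) -> list[str]:
    """Build the argv for scripts/doc2x_ocr.py as shown in Quick Start."""
    mode = choose_mode(path)
    return [sys.executable, "scripts/doc2x_ocr.py", mode, path, "--outdir", outdir]


def run(path: str, outdir: str) -> str:
    """Validate the key, run the script, and return its stdout."""
    if not os.environ.get("DOC2X_APIKEY"):
        raise RuntimeError("DOC2X_APIKEY is not set")
    result = subprocess.run(
        build_command(path, outdir), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Checking the key before launching the script lets a missing credential fail fast, before any upload is attempted.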

Modes

PDF Mode

Use the asynchronous Doc2X PDF flow:

  1. POST /api/v2/parse/preupload
  2. PUT file bytes to returned upload URL
  3. Poll GET /api/v2/parse/status
  4. Trigger export POST /api/v2/convert/parse (to=md)
  5. Poll GET /api/v2/convert/parse/result
  6. Download zip, extract files, locate Markdown
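Steps 3 and 5 are both polling loops. The sketch below shows one way such a loop could look, with the HTTP call injected as a callable so that no endpoint details are assumed. The "status" key and the values "success" and "failed" are assumptions for illustration; see references/api-quick-reference.md for the actual response fields.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until it reports success or failure, or time runs out.

    The "status" field and its values are hypothetical placeholders.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        status = payload.get("status")
        if status == "success":
            return payload
        if status == "failed":
            raise RuntimeError(f"task failed: {payload}")
        if time.monotonic() >= deadline:
            raise TimeoutError("polling timed out")
        # Wait before the next request to avoid hammering the status API.
        sleep(interval)
```

The interval and timeout parameters here play the same role as the script's --poll-interval and --timeout options.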

Useful options:

  • --formula-mode dollar|normal (default dollar)
  • --merge-cross-page-forms
  • --poll-interval
  • --timeout
  • --keep-zip

Image Mode

Use synchronous image layout OCR:

  1. POST /api/v2/parse/img/layout with binary image body
  2. Write page Markdown from response
  3. If convert_zip exists, decode and extract image resources
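Step 3 can be sketched as follows, assuming convert_zip arrives as a base64-encoded zip archive. That encoding is an assumption, since this document only says "decode"; the function name extract_convert_zip is hypothetical.

```python
import base64
import io
import zipfile
from pathlib import Path


def extract_convert_zip(convert_zip_b64: str, dest: Path) -> list[str]:
    """Decode a base64 zip payload (assumed encoding) and extract it under dest.

    Returns the archive member names that were extracted.
    """
    raw = base64.b64decode(convert_zip_b64)
    dest.mkdir(parents=True, exist_ok=True)
    # Read the archive from memory so no temporary zip file is needed.
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        archive.extractall(dest)
        return archive.namelist()
```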

Output Contract

For input <name>.pdf or <name>.png, the script writes:

  • <outdir>/<name>/... extracted files
  • <outdir>/<name>/<name>.md, written only if the extracted content contains no Markdown file

The script prints a JSON summary with:

  • mode
  • uid
  • output_dir
  • markdown
  • zip (only when --keep-zip)
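A caller could consume the summary as sketched below. This assumes the script's stdout contains only the JSON object, which this document does not confirm; the function name read_summary is hypothetical, and the sample values in the usage lines are made up.

```python
import json

# Fields the summary always carries, per the list above; "zip" is optional.
REQUIRED_FIELDS = ("mode", "uid", "output_dir", "markdown")


def read_summary(stdout: str) -> dict:
    """Parse the JSON summary and check that the documented fields are present."""
    summary = json.loads(stdout)
    missing = [key for key in REQUIRED_FIELDS if key not in summary]
    if missing:
        raise KeyError(f"summary is missing fields: {missing}")
    return summary


# Usage with an illustrative summary (values are invented for the example).
sample = '{"mode": "pdf", "uid": "abc", "output_dir": "out/input", "markdown": "out/input/input.md"}'
summary = read_summary(sample)
print(summary["markdown"])
```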

References

Read these files when you need deeper context:

  • references/api-quick-reference.md for endpoint behavior and limits
  • references/implementation-notes.md for relation to the copied official doc2x.py

Troubleshooting

  • Handle parse_task_limit_exceeded or parse_concurrency_limit by reducing concurrent jobs and retrying later.
  • Split huge PDFs if parse timeout or page-limit errors occur.
  • Keep poll interval between 1 and 3 seconds for status APIs unless there is a strong reason to change.
  • Save outputs promptly because official docs state cloud parse results are temporary.
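The "retry later" advice for the two limit errors can be sketched as exponential backoff. The Doc2XError class, the retry_on_limit helper, and the delay values are hypothetical; only the two error codes come from this document.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error codes named in the troubleshooting list above.
RETRYABLE_CODES = {"parse_task_limit_exceeded", "parse_concurrency_limit"}


class Doc2XError(Exception):
    """Hypothetical exception carrying the API error code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def retry_on_limit(
    job: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run job, retrying with doubling delays when a limit error is raised."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return job()
        except Doc2XError as err:
            last_try = attempt == attempts - 1
            if err.code not in RETRYABLE_CODES or last_try:
                raise
            # Delays grow as base_delay, 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Errors with any other code are re-raised immediately, since waiting does not help with, for example, a page-limit failure.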
