mineru-pdf

Parse PDFs locally on CPU into Markdown/JSON using MinerU. Assumes MinerU creates a per-document output folder; table and image extraction are available as options.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "mineru-pdf" with this command: npx skills add kesslerio/mineru-pdf-parser-clawdbot-skill/kesslerio-mineru-pdf-parser-clawdbot-skill-mineru-pdf

MinerU PDF

Overview

Parse a PDF locally with MinerU on CPU. The default output is Markdown + JSON. Extract tables and images only when requested.

Quick start (single PDF)

# Run from the skill directory
./scripts/mineru_parse.sh /path/to/file.pdf

Optional examples:

./scripts/mineru_parse.sh /path/to/file.pdf --format json
./scripts/mineru_parse.sh /path/to/file.pdf --tables --images

When to read references

If the MinerU CLI flags differ from the ones the wrapper script passes, or you need to change advanced defaults (backend, method, device, threads, format mapping), read:

references/mineru-cli.md

Output conventions

Output root defaults to ./mineru-output/.
MinerU creates the per-document subfolder under the output root (e.g., ./mineru-output/<basename>/...).
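
As a rough sketch of the convention above, the helper below computes where a document's output is expected to land. The function name `mineru_output_dir` and the `MINERU_OUTPUT_ROOT` variable are hypothetical and not part of the skill. It also assumes `<basename>` means the PDF file name without its `.pdf` extension, which should be checked against what MinerU actually writes.

```shell
# Hypothetical helper: print the expected per-document output folder.
# Assumption: <basename> is the file name with the .pdf extension removed.
mineru_output_dir() {
  local pdf root name
  pdf="$1"
  # Second argument or MINERU_OUTPUT_ROOT (both assumed) override the default root.
  root="${2:-${MINERU_OUTPUT_ROOT:-./mineru-output}}"
  name="$(basename "$pdf")"
  name="${name%.pdf}"
  # Strip one trailing slash from the root before joining.
  printf '%s/%s\n' "${root%/}" "$name"
}
```

For example, `ls "$(mineru_output_dir /path/to/file.pdf)"` would list that document's files, provided the folder-name assumption holds.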

Batching

Default is single-PDF parsing. Only implement batch folder parsing if explicitly requested.
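
If batch parsing is explicitly requested, one minimal approach is to loop over a folder and call the single-PDF wrapper once per file. The sketch below is an assumption, not part of the skill: `mineru_parse_dir` and the `MINERU_PARSE` variable are hypothetical names, and it relies only on the wrapper accepting a PDF path followed by optional flags, as in the quick start.

```shell
# Hypothetical batch helper: run the single-PDF wrapper on each *.pdf in a folder.
# MINERU_PARSE is an assumed override for the wrapper path.
mineru_parse_dir() {
  local dir parser failed pdf
  dir="$1"
  shift
  parser="${MINERU_PARSE:-./scripts/mineru_parse.sh}"
  failed=0
  for pdf in "$dir"/*.pdf; do
    # Skip the unexpanded glob when the folder has no PDFs.
    [ -e "$pdf" ] || continue
    # Remaining arguments (e.g. --format json) are passed through unchanged.
    if ! "$parser" "$pdf" "$@"; then
      echo "failed: $pdf" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every file parsed without error.
  [ "$failed" -eq 0 ]
}
```

Run from the skill directory, `mineru_parse_dir /path/to/folder --format json` would parse each PDF in turn and report any failures on stderr.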
