PDF Agent
Summary
- Use `pdfagent` to perform PDF operations (merge, split, compress, convert, OCR, etc.) with detailed usage metering in the output.
- Best for local, self-hosted processing where inputs/outputs must stay on disk.
- This skill ships source code in `pdfagent/` and runs via `uv run` from `scripts/pdfagent_cli.py` (no PyPI publish required).
Requirements
- `uv` installed and on PATH.
- System tools as needed by specific commands: `qpdf`, `ghostscript`, `poppler` (`pdftoppm`), `libreoffice`, `chromium` (for HTML -> PDF), and `ocrmypdf`.
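A quick preflight for the tools listed above can be sketched in Python. This is a minimal sketch, not part of the skill: the executable names in `CANDIDATE_BINARIES` are assumptions (for example, Ghostscript is commonly installed as `gs` and LibreOffice sometimes as `soffice`), and `find_tools` is a hypothetical helper. The skill's own `doctor` command remains the authoritative check.

```python
import shutil

# Assumed executable names per tool; these vary by platform and package
# manager, so adjust the lists to match your system.
CANDIDATE_BINARIES = {
    "uv": ["uv"],
    "qpdf": ["qpdf"],
    "ghostscript": ["gs", "gswin64c"],
    "poppler": ["pdftoppm"],
    "libreoffice": ["libreoffice", "soffice"],
    "chromium": ["chromium", "chromium-browser"],
    "ocrmypdf": ["ocrmypdf"],
}


def find_tools(candidates=CANDIDATE_BINARIES, which=shutil.which):
    """Map each tool to the first executable found on PATH, or None."""
    found = {}
    for tool, names in candidates.items():
        # Try each candidate name in order and keep the first hit.
        found[tool] = next((p for p in map(which, names) if p), None)
    return found
```

Calling `find_tools()` returns a dictionary such as `{"qpdf": "/usr/bin/qpdf", "ocrmypdf": None, ...}`, where `None` marks a tool that was not found under any of the assumed names. Only the tools needed by the commands you plan to run have to be present.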
Core Usage
- Merge PDFs with usage metrics:
  `uv run {baseDir}/scripts/pdfagent_cli.py merge file1.pdf file2.pdf --out merged.pdf --json`
- Split a PDF by ranges:
  `uv run {baseDir}/scripts/pdfagent_cli.py split input.pdf --range "1-3,5" --out-dir out_dir --json`
- Compress a PDF with a preset:
  `uv run {baseDir}/scripts/pdfagent_cli.py compress input.pdf --preset ebook --out compressed.pdf --json`
- Convert images to PDF:
  `uv run {baseDir}/scripts/pdfagent_cli.py jpg-to-pdf image1.jpg image2.png --out output.pdf --json`
- OCR a scanned PDF:
  `uv run {baseDir}/scripts/pdfagent_cli.py ocr scan.pdf --lang eng --out scan_ocr.pdf --json`
- Agent mode for multi-step instructions:
  `uv run {baseDir}/scripts/pdfagent_cli.py agent "merge then rotate 90 degrees every other page" -i file1.pdf -i file2.pdf --out out.pdf --json`
- Dependency/binary check:
  `uv run {baseDir}/scripts/pdfagent_cli.py doctor --json`
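When these commands are driven from a script rather than typed by hand, a thin wrapper can build the argument list and parse the `--json` report. The following is a minimal sketch under stated assumptions: `build_command`, `run_pdfagent`, and `split_report` are hypothetical helper names, `base_dir` stands in for the `{baseDir}` placeholder, and the code assumes the CLI prints a single JSON document on stdout. The only report keys it touches are `usage` and `outputs`; their inner structure is not assumed.

```python
import json
import subprocess


def build_command(base_dir, subcommand, *args):
    """Build the argv list for one pdfagent call, always requesting JSON."""
    script = f"{base_dir}/scripts/pdfagent_cli.py"
    return ["uv", "run", script, subcommand, *map(str, args), "--json"]


def run_pdfagent(base_dir, subcommand, *args):
    """Run the CLI and return the parsed JSON report.

    Assumption: stdout contains exactly one JSON document.
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    proc = subprocess.run(
        build_command(base_dir, subcommand, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


def split_report(report):
    """Return the (usage, outputs) parts of a report; missing keys give None."""
    return report.get("usage"), report.get("outputs")
```

For example, `run_pdfagent(base_dir, "merge", "file1.pdf", "file2.pdf", "--out", "merged.pdf")` would issue the merge command shown above. Passing arguments as a list avoids shell quoting problems with values such as `"1-3,5"` or the free-text instruction given to `agent`.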
Notes
- Use `--json` for machine-readable outputs (includes `usage` and `outputs`).
- For encrypted PDFs, pass `--password` or per-file `--passwords`.
- If a conversion tool is missing, `pdfagent` may use a fallback path and will note it in output or logs.
- Optional Python deps are still command-specific:
  `uv run --with pdf2docx --with camelot-py[cv] --with pdfplumber --with pyhanko {baseDir}/scripts/pdfagent_cli.py <command> ...`
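Because the optional dependencies are command-specific, a script may want to add only the `--with` packages a given command needs. The sketch below shows one way to assemble such a call; `with_optional_deps` is a hypothetical helper, and which package belongs to which command is not stated here, so the caller supplies the list.

```python
def with_optional_deps(base_dir, deps, subcommand, *args):
    """Build a `uv run` argv list that adds one --with flag per package."""
    cmd = ["uv", "run"]
    for dep in deps:
        # Each optional package gets its own --with flag, as in the note above.
        cmd += ["--with", dep]
    cmd += [f"{base_dir}/scripts/pdfagent_cli.py", subcommand, *map(str, args)]
    return cmd
```

Handing this list to `subprocess.run` bypasses the shell, so an extras specifier such as `camelot-py[cv]` is passed through literally. When typing the command in a shell that treats square brackets as glob characters (zsh, for instance), quote that argument.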