Docling Convert
Use this skill to run document conversion through a local Docling service instead of ad-hoc parsing.
Quick Start
- Assume the Docling service is already deployed locally and reachable at http://localhost:5001.
- Prefer scripts/docling_gradio_convert.py for repeatable work. It wraps the documented Gradio API and handles submission, waiting, and archive extraction.
- Install the required client before running the script:
pip install gradio_client
- If URL jobs need placeholder image repair and beautifulsoup4 is missing, install it:
pip install beautifulsoup4 lxml
- Read references/gradio-api-workflow.md only when changing endpoints, tuning advanced options, or debugging output layouts.
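The install steps above can be checked before a run with a small preflight. This is a minimal sketch, not part of the wrapper script: the helper name `missing_packages` is hypothetical, and it only relies on the fact that beautifulsoup4 is imported as `bs4`.

```python
import importlib.util

# Maps import module name -> pip package name.
# beautifulsoup4 is imported as "bs4"; the other two share one name.
REQUIRED = {"gradio_client": "gradio_client"}
OPTIONAL = {"bs4": "beautifulsoup4", "lxml": "lxml"}


def missing_packages(modules):
    """Return pip names for every module that cannot be located (hypothetical helper)."""
    return [
        pip_name
        for module, pip_name in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    for label, modules in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_packages(modules)
        if missing:
            print(f"Missing {label} packages: pip install {' '.join(missing)}")
```

Only the optional group matters for URL jobs that need placeholder image repair.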
Workflow
- Classify the inputs. Use the file flow for local paths and the URL flow for web pages. Do not mix files and URLs in one API request; if the user gives both, run two jobs.
- Choose the outputs. Default to md. Add json when the user also needs structured output. Add html, text, or doctags only when the task explicitly needs them.
- Choose the processing options. Keep pipeline=standard, ocr=true, force_ocr=false, pdf_backend=dlparse_v4, and table_mode=accurate unless the task calls for a change. Keep image_export_mode=embedded when the goal is to preserve extracted images. The wrapper post-processes embedded Markdown images into real files under images/. Turn on enrichment flags only when the user explicitly wants code, formulas, picture classification, or picture descriptions.
- Run the wrapper script.
# Single file
python scripts/docling_gradio_convert.py report.pdf
# Batch files with Markdown + JSON
python scripts/docling_gradio_convert.py "*.pdf" --to-format md --to-format json
# Single URL
python scripts/docling_gradio_convert.py https://example.com/article --output-dir ./article
# Alternate service URL
python scripts/docling_gradio_convert.py slides.pptx --service-url http://localhost:5001
- Verify the extracted results. The script always requests return_as_file=true, downloads the returned artifact, extracts it into the chosen output directory, and rewrites embedded Markdown images into local files when needed. For URL conversions, it can also backfill Docling image placeholders from the source page. Inspect the produced Markdown plus any extracted image assets before presenting the result to the user.
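The step that rewrites embedded Markdown images into files under images/ can be sketched as follows. This is an illustration under assumptions, not the wrapper's actual code: it assumes embedded images appear as base64 `data:image/...` URIs inside standard Markdown image syntax, and the function name `externalize_images` and the `image_001.png` naming scheme are hypothetical.

```python
import base64
import re
from pathlib import Path

# Assumed shape of an embedded image: ![alt](data:image/<ext>;base64,<payload>)
# Subtypes such as svg+xml do not match this pattern and are left untouched.
DATA_URI_IMAGE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]"
    r"\(data:image/(?P<ext>[A-Za-z0-9]+);base64,(?P<data>[A-Za-z0-9+/=]+)\)"
)


def externalize_images(markdown, output_dir):
    """Write each embedded image to <output_dir>/images/ and link to it instead."""
    images_dir = Path(output_dir) / "images"
    counter = 0

    def save(match):
        nonlocal counter
        counter += 1
        images_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical naming scheme: sequential, zero-padded.
        name = f"image_{counter:03d}.{match.group('ext').lower()}"
        (images_dir / name).write_bytes(base64.b64decode(match.group("data")))
        # Relative link, so the Markdown stays valid if the directory moves.
        return f"![{match.group('alt')}](images/{name})"

    return DATA_URI_IMAGE.sub(save, markdown)
```

When inspecting results, a quick check is that no `data:image/` strings remain in the Markdown and that every `images/...` link resolves to a file.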
Output Conventions
- Prefer the script defaults unless the user asks for a different layout.
- For a single local file, extract into a sibling directory named after the input stem.
- For a single URL, extract into docling-<slug> under the current working directory.
- For multiple inputs, extract into docling-files-batch or docling-urls-batch under the current working directory, unless --output-dir is supplied.
- If the user supplies --output-dir and both file and URL jobs are needed, the script creates files/ and urls/ subdirectories to keep the results separate.
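The grouping rule (files and URLs never share a request) and the layout rules above can be sketched together. This is a rough model of the behavior described here, not the script's implementation: `classify_inputs`, `slugify`, and `plan_jobs` are hypothetical names, the slug rule is an assumption because the document does not define how `<slug>` is derived, and wildcards are assumed to be expanded before planning.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def classify_inputs(inputs):
    """Split inputs into (files, urls); only http(s) strings count as URLs."""
    files, urls = [], []
    for item in inputs:
        if urlparse(item).scheme.lower() in ("http", "https"):
            urls.append(item)
        else:
            files.append(item)
    return files, urls


def slugify(url):
    """Assumed slug rule: host plus path, lowercased, non-alphanumerics to hyphens."""
    parsed = urlparse(url)
    return re.sub(r"[^a-z0-9]+", "-", (parsed.netloc + parsed.path).lower()).strip("-")


def plan_jobs(inputs, output_dir=None, cwd=Path(".")):
    """Return one (kind, items, destination) tuple per job, at most two jobs."""
    files, urls = classify_inputs(inputs)
    mixed = bool(files) and bool(urls)
    jobs = []
    for kind, items in (("file", files), ("url", urls)):
        if not items:
            continue
        if output_dir is not None:
            # files/ and urls/ subdirectories only when both job kinds exist.
            dest = (Path(output_dir) / f"{kind}s") if mixed else Path(output_dir)
        elif len(items) > 1:
            dest = Path(cwd) / f"docling-{kind}s-batch"
        elif kind == "file":
            source = Path(items[0])
            dest = source.parent / source.stem  # sibling directory named after the stem
        else:
            dest = Path(cwd) / f"docling-{slugify(items[0])}"
        jobs.append((kind, items, dest))
    return jobs
```

For example, `plan_jobs(["a.pdf", "https://example.com/x"], output_dir="out")` yields a file job targeting `out/files` and a URL job targeting `out/urls`. The real destinations can be confirmed with the script's --dry-run flag.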
Script Notes
- Use scripts/docling_gradio_convert.py --dry-run ... to verify grouping, endpoint selection, and destination paths without contacting the service.
- Let the script infer the Gradio UI URL from the service root. http://localhost:5001 becomes http://localhost:5001/ui/.
- Let the script ask /change_ocr_lang for the default OCR language set when --ocr-lang is not provided. Fall back to en,fr,de,es if the endpoint is unavailable.
- Treat a missing gradio_client installation as an environment issue and fix it with pip install gradio_client instead of rewriting the workflow.
- If a URL conversion returns <!-- 🖼️❌ Image not available ... -->, let the wrapper fetch the source page, collect article images, download them into images/, and replace placeholders in order.
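Three of the notes above are small enough to sketch directly: the UI URL inference, the OCR language fallback, and in-order placeholder replacement. The function names are hypothetical, `fetch_default` stands in for whatever call queries /change_ocr_lang, and the placeholder pattern assumes the comment contains the literal text "Image not available" and no angle brackets. Treat this as a rough model of the described behavior.

```python
import re

FALLBACK_OCR_LANGS = "en,fr,de,es"

# Assumed placeholder shape: an HTML comment containing "Image not available".
PLACEHOLDER = re.compile(r"<!--[^<>]*Image not available[^<>]*-->")


def infer_ui_url(service_url):
    """http://localhost:5001 -> http://localhost:5001/ui/"""
    return service_url.rstrip("/") + "/ui/"


def resolve_ocr_langs(requested, fetch_default):
    """Prefer the user's value, then the service default, then the fallback."""
    if requested:
        return requested
    try:
        return fetch_default()  # hypothetical callable that asks the service
    except Exception:  # endpoint unavailable for any reason
        return FALLBACK_OCR_LANGS


def backfill_placeholders(markdown, image_paths):
    """Replace placeholders in document order with the given image paths."""
    remaining = iter(image_paths)

    def substitute(match):
        path = next(remaining, None)
        # Keep the placeholder when there are more placeholders than images.
        return match.group(0) if path is None else f"![]({path})"

    return PLACEHOLDER.sub(substitute, markdown)
```

Leaving surplus placeholders in place, rather than deleting them, keeps the gaps visible when the output is inspected.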
Resources
scripts/docling_gradio_convert.py
Use this wrapper for deterministic Docling conversions. It supports:
- local files, URLs, and wildcard expansion
- batch conversion
- OCR and enrichment flags
- archive download and extraction
- output directory planning
- dry-run validation
references/gradio-api-workflow.md
Read this reference when you need:
- the endpoint mapping for file versus URL jobs
- the argument names expected by the Gradio client
- the wait_task_finish tuple layout
- the defaults adopted by this skill