liteparse

Use this skill when the user asks to parse an unstructured file (PDF, DOCX, PPTX, XLSX, images, etc.), convert documents across formats, or extract text together with its spatial layout, all locally and without cloud dependencies.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install the "liteparse" skill with this command: `npx skills add run-llama/llamaparse-agent-skills/run-llama-llamaparse-agent-skills-liteparse`

LiteParse Skill

Parse unstructured documents (PDF, DOCX, PPTX, XLSX, images, and more) locally with LiteParse: fast, lightweight, no cloud dependencies or LLM required.

Initial Setup

When this skill is invoked, respond with:

I'm ready to use LiteParse to parse files locally. Before we begin, please confirm that:

- `@llamaindex/liteparse` is installed globally (`npm i -g @llamaindex/liteparse`)
- The `lit` CLI command is available in your terminal

If both are set, please provide:

1. One or more files to parse (PDF, DOCX, PPTX, XLSX, images, etc.)
2. Any specific options: output format (json/text), page ranges, OCR preferences, DPI, etc.
3. What you'd like to do with the parsed content.

I will produce the appropriate `lit` CLI command or TypeScript script, and once approved, report the results.

Then wait for the user's input.


Step 0 — Install LiteParse (if needed)

If liteparse is not yet installed, install it globally:

npm i -g @llamaindex/liteparse

Verify installation:

lit --version

For Office document support (DOCX, PPTX, XLSX), LibreOffice is required:

# macOS
brew install --cask libreoffice

# Ubuntu/Debian
apt-get install libreoffice

For image parsing, ImageMagick is required:

# macOS
brew install imagemagick

# Ubuntu/Debian
apt-get install imagemagick

Step 1 — Produce the CLI Command or Script

Parse a Single File

# Basic text extraction
lit parse document.pdf

# JSON output saved to a file
lit parse document.pdf --format json -o output.json

# Specific page range
lit parse document.pdf --target-pages "1-5,10,15-20"

# Disable OCR (faster, text-only PDFs)
lit parse document.pdf --no-ocr

# Use an external HTTP OCR server for higher accuracy
lit parse document.pdf --ocr-server-url http://localhost:8828/ocr

# Higher DPI for better quality
lit parse document.pdf --dpi 300
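
When an agent assembles these commands programmatically, building an argument array is less error-prone than concatenating a shell string. The following is a minimal TypeScript sketch that uses only the flags shown above. `ParseOptions` and `buildParseArgs` are hypothetical names introduced for illustration; they are not part of LiteParse.

```typescript
// Hypothetical helper: assembles arguments for `lit parse` from the flags listed above.
// The option names are illustrative and are not a LiteParse API.
interface ParseOptions {
  format?: "json" | "text";
  output?: string;       // passed to -o
  targetPages?: string;  // e.g. "1-5,10,15-20"
  ocr?: boolean;         // false adds --no-ocr
  ocrServerUrl?: string; // passed to --ocr-server-url
  dpi?: number;          // passed to --dpi
}

function buildParseArgs(file: string, opts: ParseOptions = {}): string[] {
  const args: string[] = ["parse", file];
  if (opts.format !== undefined) args.push("--format", opts.format);
  if (opts.output !== undefined) args.push("-o", opts.output);
  if (opts.targetPages !== undefined) args.push("--target-pages", opts.targetPages);
  if (opts.ocr === false) args.push("--no-ocr");
  if (opts.ocrServerUrl !== undefined) args.push("--ocr-server-url", opts.ocrServerUrl);
  if (opts.dpi !== undefined) args.push("--dpi", String(opts.dpi));
  return args;
}

// Equivalent to: lit parse document.pdf --format json -o output.json
const exampleArgs = buildParseArgs("document.pdf", { format: "json", output: "output.json" });
```

Passing the array to a process spawner such as Node's `child_process.spawn("lit", exampleArgs)` avoids having to shell-quote values like the page-range string.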

Batch Parse a Directory

lit batch-parse ./input-directory ./output-directory

# Only process PDFs, recursively
lit batch-parse ./input ./output --extension .pdf --recursive

Generate Page Screenshots

Screenshots are useful for LLM agents that need to see visual layout.

# All pages
lit screenshot document.pdf -o ./screenshots

# Specific pages
lit screenshot document.pdf --pages "1,3,5" -o ./screenshots

# High-DPI PNG
lit screenshot document.pdf --dpi 300 --format png -o ./screenshots

# Page range
lit screenshot document.pdf --pages "1-10" -o ./screenshots
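
The DPI setting determines how large each screenshot is in pixels. As a rough guide, and assuming the renderer scales the page size linearly (PDF page sizes are expressed in points, 72 per inch), the output dimensions can be estimated as below. `renderedPixels` is a hypothetical helper, and LiteParse's own rounding may differ.

```typescript
// Estimates rendered image size from a PDF page size in points (1 pt = 1/72 inch).
// Assumes linear scaling by DPI; the actual renderer may round differently.
function renderedPixels(
  widthPt: number,
  heightPt: number,
  dpi: number
): { width: number; height: number } {
  const scale = dpi / 72;
  return {
    width: Math.round(widthPt * scale),
    height: Math.round(heightPt * scale),
  };
}

// US Letter is 612 x 792 pt (8.5 x 11 in).
const letterAt150 = renderedPixels(612, 792, 150); // { width: 1275, height: 1650 }
const letterAt300 = renderedPixels(612, 792, 300); // { width: 2550, height: 3300 }
```

Doubling the DPI quadruples the pixel count, so 300 DPI screenshots hold about four times as many pixels as the 150 DPI default.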

Step 3 — Key Options Reference

OCR Options

| Option | Description |
| --- | --- |
| (default) | Tesseract.js (zero setup, built-in) |
| `--ocr-language fra` | Set OCR language (ISO code) |
| `--ocr-server-url <url>` | Use external HTTP OCR server (EasyOCR, PaddleOCR, custom) |
| `--no-ocr` | Disable OCR entirely |

Output Options

| Option | Description |
| --- | --- |
| `--format json` | Structured JSON with bounding boxes |
| `--format text` | Plain text (default) |
| `-o <file>` | Save output to file |

Performance / Quality Options

| Option | Description |
| --- | --- |
| `--dpi <n>` | Rendering DPI (default: 150; use 300 for high quality) |
| `--max-pages <n>` | Limit pages parsed |
| `--target-pages <pages>` | Parse specific pages (e.g. "1-5,10") |
| `--no-precise-bbox` | Disable precise bounding boxes (faster) |
| `--skip-diagonal-text` | Ignore rotated/diagonal text |
| `--preserve-small-text` | Keep very small text that would otherwise be dropped |
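
The `--target-pages` value mixes single pages and ranges in one comma-separated string. The sketch below shows how such a string can be expanded into page numbers, which is handy for validating user input before running `lit`. It assumes 1-based page numbers and inclusive ranges, which the examples suggest but the source does not state. `expandPageSpec` is a hypothetical helper, not LiteParse's own parser.

```typescript
// Expands a page spec such as "1-5,10" into a sorted list of unique page numbers.
// Assumption: pages are 1-based and ranges are inclusive on both ends.
function expandPageSpec(spec: string): number[] {
  const pages: number[] = [];
  for (const part of spec.split(",")) {
    const token = part.trim();
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (match === null) {
      throw new Error(`Invalid page token: "${token}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    for (let p = start; p <= end; p++) {
      if (pages.indexOf(p) === -1) pages.push(p); // skip duplicates from overlapping ranges
    }
  }
  return pages.sort((a, b) => a - b);
}

const selectedPages = expandPageSpec("1-5,10,15-20");
// [1, 2, 3, 4, 5, 10, 15, 16, 17, 18, 19, 20]
```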

Step 4 — Using a Config File

For repeated use with consistent options, create a `liteparse.config.json` file:

{
  "ocrLanguage": "en",
  "ocrEnabled": true,
  "maxPages": 1000,
  "dpi": 150,
  "outputFormat": "json",
  "preciseBoundingBox": true,
  "skipDiagonalText": false,
  "preserveVerySmallText": false
}

For an HTTP OCR server:

{
  "ocrServerUrl": "http://localhost:8828/ocr",
  "ocrLanguage": "en",
  "outputFormat": "json"
}

Use with:

lit parse document.pdf --config liteparse.config.json
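
A quick check of the config object can catch misspelled keys or wrong value types before the file is handed to `lit`. The sketch below uses only the keys that appear in the two example configs above. The real schema may accept more keys, so the `KNOWN_KEYS` table is an assumption, and `checkConfig` is a hypothetical helper rather than a LiteParse function.

```typescript
// Keys and value types taken from the example configs above.
// Assumption: the actual LiteParse schema may define additional keys.
const KNOWN_KEYS: { [key: string]: "string" | "number" | "boolean" } = {
  ocrLanguage: "string",
  ocrEnabled: "boolean",
  ocrServerUrl: "string",
  maxPages: "number",
  dpi: "number",
  outputFormat: "string",
  preciseBoundingBox: "boolean",
  skipDiagonalText: "boolean",
  preserveVerySmallText: "boolean",
};

// Returns a list of problems; an empty list means every key was recognized and well-typed.
function checkConfig(config: { [key: string]: unknown }): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(config)) {
    if (!Object.prototype.hasOwnProperty.call(KNOWN_KEYS, key)) {
      problems.push(`unknown key "${key}"`);
    } else if (typeof config[key] !== KNOWN_KEYS[key]) {
      problems.push(`"${key}" should be a ${KNOWN_KEYS[key]}`);
    }
  }
  return problems;
}

// The HTTP OCR server config from above passes the check.
const configText =
  '{ "ocrServerUrl": "http://localhost:8828/ocr", "ocrLanguage": "en", "outputFormat": "json" }';
const configProblems = checkConfig(JSON.parse(configText));
```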

Step 5 — HTTP OCR Server API (Advanced)

If the user wants to plug in a custom OCR backend, the server must implement:

  • Endpoint: POST /ocr
  • Accepts: file (multipart) and language (string) parameters
  • Returns:
{
  "results": [
    { "text": "Hello", "bbox": [x1, y1, x2, y2], "confidence": 0.98 }
  ]
}

Ready-to-use wrappers exist for EasyOCR and PaddleOCR in the LiteParse repo.
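
When building or testing a custom backend, it helps to check a response body against the documented shape before pointing `lit` at the server. The sketch below is one way to do that in TypeScript. It assumes `bbox` always holds exactly four numbers and that `confidence` is a number. The type and function names are hypothetical, and the coordinates in the sample are made up for illustration.

```typescript
// Shape of one entry in "results", following the JSON example above.
interface OcrResult {
  text: string;
  bbox: [number, number, number, number]; // [x1, y1, x2, y2]
  confidence: number;
}

function isOcrResult(value: unknown): value is OcrResult {
  if (typeof value !== "object" || value === null) return false;
  const item = value as { [key: string]: unknown };
  const bbox = item["bbox"];
  return (
    typeof item["text"] === "string" &&
    typeof item["confidence"] === "number" &&
    Array.isArray(bbox) &&
    bbox.length === 4 &&
    bbox.every((n) => typeof n === "number")
  );
}

// Parses a response body and throws if it does not match the documented shape.
function parseOcrResponse(body: string): OcrResult[] {
  const parsed: unknown = JSON.parse(body);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("response is not a JSON object");
  }
  const results = (parsed as { [key: string]: unknown })["results"];
  if (!Array.isArray(results)) {
    throw new Error('missing "results" array');
  }
  const checked: OcrResult[] = [];
  results.forEach((r, i) => {
    if (!isOcrResult(r)) {
      throw new Error(`results[${i}] does not match the expected shape`);
    }
    checked.push(r);
  });
  return checked;
}

// Illustrative sample; the bbox values are invented.
const sampleBody = '{"results":[{"text":"Hello","bbox":[10,20,110,45],"confidence":0.98}]}';
const ocrItems = parseOcrResponse(sampleBody);
```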


Supported Input Formats

| Category | Formats |
| --- | --- |
| PDF | .pdf |
| Word | .doc, .docx, .docm, .odt, .rtf |
| PowerPoint | .ppt, .pptx, .pptm, .odp |
| Spreadsheets | .xls, .xlsx, .xlsm, .ods, .csv, .tsv |
| Images | .jpg, .jpeg, .png, .gif, .bmp, .tiff, .webp, .svg |

Office documents require LibreOffice; images require ImageMagick. LiteParse auto-converts these formats to PDF before parsing.
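
Before parsing, an agent can look up which external tool a given file will need. The sketch below encodes the table above as a lookup. It assumes every entry in the Word, PowerPoint, and Spreadsheets rows (including .csv and .tsv) goes through LibreOffice, and that extensions are matched case-insensitively. Neither point is stated in the source, and `requiredDependency` is a hypothetical helper.

```typescript
type Dependency = "none" | "LibreOffice" | "ImageMagick" | "unsupported";

// Extensions copied from the Supported Input Formats table.
const OFFICE_EXTENSIONS: string[] = [
  ".doc", ".docx", ".docm", ".odt", ".rtf",
  ".ppt", ".pptx", ".pptm", ".odp",
  ".xls", ".xlsx", ".xlsm", ".ods", ".csv", ".tsv",
];
const IMAGE_EXTENSIONS: string[] = [
  ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp", ".svg",
];

// Maps a file name to the external tool it needs, based on its extension.
function requiredDependency(fileName: string): Dependency {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (ext === ".pdf") return "none";
  if (OFFICE_EXTENSIONS.indexOf(ext) !== -1) return "LibreOffice";
  if (IMAGE_EXTENSIONS.indexOf(ext) !== -1) return "ImageMagick";
  return "unsupported";
}

const neededForReport = requiredDependency("Report.DOCX"); // "LibreOffice"
```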

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
