File to Markdown Converter


Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "File to Markdown Converter" with this command: npx skills add alaminrifat/file-to-markdown

File to Markdown — Skill

Overview

Convert files into clean, structured, AI-ready Markdown using the markdown.new API powered by Cloudflare Workers AI toMarkdown().

Supports 20+ formats including documents, spreadsheets, images, and structured data.

No authentication required (500 requests/day per IP).


When to Use This Skill

Use this skill whenever you need to:

  • Extract text from files for LLM processing
  • Convert PDFs or Office files into Markdown
  • Normalize data into structured text
  • Process uploaded user files
  • Scrape webpage content into Markdown
  • Convert images into AI-generated descriptions + content

Common AI workflows:

  • RAG ingestion pipelines
  • Knowledge base creation
  • Document summarization
  • Dataset extraction
  • Spreadsheet analysis
  • OCR-like extraction from images

Supported Formats

Documents

  • .pdf
  • .docx
  • .odt

Spreadsheets

  • .xlsx
  • .xls
  • .xlsm
  • .xlsb
  • .et
  • .ods
  • .numbers

Images

  • .jpg
  • .jpeg
  • .png
  • .webp
  • .svg

Text & Structured Data

  • .txt
  • .md
  • .csv
  • .json
  • .xml
  • .html
  • .htm

Notes:

  • Image conversion uses AI object detection + summarization.
  • HTML URL conversion uses a web page pipeline.
  • Uploaded HTML uses Workers AI conversion.

API Base URL

https://markdown.new

Endpoints

1️⃣ Convert Remote File (Simple GET)

Returns plain Markdown text.

GET /:file-url

Example:

curl -s "https://markdown.new/https://example.com/report.pdf"

2️⃣ Convert Remote File (JSON Response)

Returns metadata + Markdown.

GET /:file-url?format=json

Example:

curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
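The two GET forms above differ only in the trailing `?format=json`. A minimal Python sketch of building either URL follows; the helper name `build_get_url` and the `as_json` flag are hypothetical and not part of the markdown.new API. It appends the target URL unchanged, as the curl examples do. How the service treats a target URL that already carries its own query string is not stated here, so this sketch does not try to handle that case.

```python
BASE_URL = "https://markdown.new"


def build_get_url(file_url: str, as_json: bool = False) -> str:
    """Return the GET endpoint URL for a remote file (hypothetical helper).

    The target URL is appended to the base URL as-is, mirroring the curl
    examples; "?format=json" is added only when metadata is wanted.
    """
    url = f"{BASE_URL}/{file_url}"
    if as_json:
        url += "?format=json"
    return url


if __name__ == "__main__":
    print(build_get_url("https://example.com/report.pdf"))
    print(build_get_url("https://example.com/report.pdf", as_json=True))
```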

3️⃣ Convert Remote File via POST

Use when you want a structured JSON response.

POST /
Content-Type: application/json

Body:

{
  "url": "https://example.com/report.pdf"
}

Example:

curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'

4️⃣ Upload Local File

Use when the file is not publicly accessible.

POST /convert
multipart/form-data

Example:

curl -s https://markdown.new/convert \
  -F "file=@document.pdf"
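For agents that cannot shell out to curl, the same upload can be sketched with the Python standard library. The function name `build_upload_request` is hypothetical. The form field name `file` comes from the curl example above; the per-part `application/octet-stream` content type is an assumption, since this document does not say whether the service inspects it or relies on the filename extension. The function only builds the request, so it can be inspected before anything is sent.

```python
import uuid
import urllib.request

UPLOAD_URL = "https://markdown.new/convert"


def build_upload_request(filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for /convert."""
    boundary = uuid.uuid4().hex
    # One form part named "file", matching `-F "file=@document.pdf"`.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed part type
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        UPLOAD_URL,
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req, timeout=60)` and parse the response body as JSON.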

Response Formats

URL Conversion Response

{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}

Upload Conversion Response

{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
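The two shapes above differ in nesting (`data` wrapper or not) and in the name of the timing field. A pipeline that accepts both may want to flatten them first. The sketch below assumes only the fields shown in the two samples; the function name `extract_markdown` and the output key `elapsed_ms` are hypothetical.

```python
def extract_markdown(payload: dict) -> dict:
    """Flatten either documented response shape into one dict."""
    if not payload.get("success"):
        raise ValueError("conversion did not report success")
    # Upload responses nest the fields under "data"; URL responses do not.
    nested = payload.get("data")
    body = nested if isinstance(nested, dict) else payload
    return {
        "title": body.get("title"),
        "content": body.get("content", ""),
        "tokens": body.get("tokens"),
        # The two shapes name the timing field differently.
        "elapsed_ms": body.get("duration_ms", body.get("processing_time_ms")),
    }
```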

Best Practices for AI Agents

Prefer GET for Simple Workflows

Use:

GET /:url

When:

  • You only need Markdown text
  • Speed is important
  • No metadata required

Prefer POST for Structured Pipelines

Use POST when:

  • Metadata is needed
  • Token counts are required
  • Monitoring or logging is implemented
  • Building automation workflows

File Upload Strategy

Use /convert only if:

  • File is local
  • File is private
  • File requires authentication to access

Otherwise always prefer URL conversion.


Error Handling Strategy

Agents should:

  1. Check that the response contains "success": true
  2. Retry once on network failure
  3. Validate that the content length is greater than 0
  4. Fall back to an alternate extraction method if needed
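The four checks above can be sketched as one wrapper. This is an illustration under assumptions, not the skill's own implementation: `convert_with_checks`, `fetch`, and `fallback` are hypothetical names, and `fetch` is assumed to raise `OSError` (or a subclass such as `ConnectionError`, `TimeoutError`, or `urllib.error.URLError`) when the network fails.

```python
from typing import Callable, Optional


def convert_with_checks(
    fetch: Callable[[], dict],
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Run one conversion with a single retry, then validate the result."""
    payload = None
    for _ in range(2):  # first attempt plus one retry
        try:
            payload = fetch()
            break
        except OSError:
            continue  # network failure: try once more, then give up

    if payload and payload.get("success"):
        # Upload responses nest fields under "data"; URL responses do not.
        nested = payload.get("data")
        body = nested if isinstance(nested, dict) else payload
        content = body.get("content") or ""
        if len(content) > 0:
            return content

    # Unsuccessful, empty, or unreachable: use the alternate extractor.
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback was provided")
```

Passing the request as a callable keeps the retry logic independent of the HTTP client, so the same wrapper works for the GET, POST, and upload endpoints.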

Rate Limits

  • 500 requests/day per IP without API key
  • No signup required

Agents should:

  • Cache results when possible
  • Avoid duplicate conversions
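One way to follow both recommendations is a small in-memory cache keyed by source URL, with a local counter that stops before the 500-request quota is reached. The class below is a hypothetical sketch: `ConversionCache` and its `convert` callable are not part of the API, the counter covers the lifetime of the process rather than a calendar day, and it only tracks requests made by this process, not others sharing the same IP.

```python
from typing import Callable, Dict


class ConversionCache:
    """In-memory cache keyed by source URL, with a simple request budget."""

    def __init__(self, convert: Callable[[str], str], daily_limit: int = 500):
        self._convert = convert          # function that performs the real request
        self._daily_limit = daily_limit  # local budget, not enforced by the server
        self._results: Dict[str, str] = {}
        self.requests_made = 0

    def get(self, url: str) -> str:
        # Serve repeated URLs from memory so they cost no extra request.
        if url in self._results:
            return self._results[url]
        if self.requests_made >= self._daily_limit:
            raise RuntimeError("local request budget exhausted")
        self.requests_made += 1
        result = self._convert(url)
        self._results[url] = result
        return result
```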

Integration Examples

JavaScript (Node.js)

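// POST the file URL and read the JSON response (Node.js 18+, ES module).
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});

// Stop on HTTP-level errors before trying to parse the body.
if (!res.ok) throw new Error(`Request failed: ${res.status}`);

const data = await res.json();
// URL conversions return the Markdown in the top-level "content" field.
console.log(data.content);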

Python

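import requests

# POST the file URL; the timeout keeps the call from hanging indefinitely.
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=60,
)
res.raise_for_status()  # raise on HTTP 4xx/5xx responses

data = res.json()
# URL conversions return the Markdown in the top-level "content" field.
print(data["content"])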

Agent Decision Tree

If user provides:

| Input Type      | Action                 |
| --------------- | ---------------------- |
| Public file URL | Use GET or POST        |
| Local file      | Use POST /convert      |
| Image           | Convert then summarize |
| Spreadsheet     | Convert then analyze   |
| Webpage         | Convert URL HTML       |
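The routing part of this decision (public URL versus local file) can be sketched in a few lines. The function name `choose_endpoint` and the `need_metadata` flag are hypothetical; the rule that anything without an http or https scheme counts as a local file is a simplifying assumption, and the follow-up steps (summarize, analyze) are left to the calling agent.

```python
from urllib.parse import urlparse


def choose_endpoint(source: str, need_metadata: bool = False) -> str:
    """Pick one of the documented endpoints for a user-supplied source."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Public URL: GET for plain Markdown, POST when metadata is wanted.
        return "POST /" if need_metadata else "GET /:file-url"
    # Anything else is treated as a local path and uploaded.
    return "POST /convert"


if __name__ == "__main__":
    for src in ("https://example.com/report.pdf", "./document.pdf"):
        print(src, "->", choose_endpoint(src))
```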

Output Expectations

The Markdown should be:

  • Clean
  • Structured
  • AI-friendly
  • Minimal noise
  • Ready for LLM ingestion

Limitations

  • Complex PDF layouts may lose formatting
  • Large spreadsheets may be truncated
  • Images rely on AI interpretation accuracy
  • Token limits may apply

Summary

This skill provides a universal file-to-Markdown conversion layer for AI systems with:

  • No authentication
  • Simple HTTP interface
  • Multi-format support
  • Structured output
  • Fast processing

Ideal for document ingestion, RAG pipelines, and automation agents.


Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
