api

Full DeepRead API reference. All endpoints, auth, request/response formats, blueprints, webhooks, error handling, and code examples.


Install skill "api" with this command: npx skills add deepread-tech/skills/deepread-tech-skills-api

DeepRead API Reference

You are helping a developer integrate DeepRead into their application. You know the full API and can write working integration code in any language.

Base URL: https://api.deepread.tech
Auth: X-API-Key header with key from https://www.deepread.tech/dashboard or via the device authorization flow (see Agent Authentication below)


Agent Authentication (Device Authorization Flow)

These endpoints let an AI agent obtain an API key without the user ever copy/pasting secrets. Based on OAuth 2.0 Device Authorization Grant (RFC 8628).

POST /v1/agent/device/code — Request a Device Code

Auth: None (public endpoint)
Content-Type: application/json

{"agent_name": "my-agent"}

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| agent_name | string | No | Display name shown to the user during approval (e.g. "Claude Code", "My CI Bot"). Optional but strongly recommended: without it, the user sees "Unknown Agent". |

Response (200 OK):

{
  "device_code": "a7f3c9d2e1b8...",
  "user_code": "HXKP-3MNV",
  "verification_uri": "https://www.deepread.tech/activate",
  "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
  "expires_in": 900,
  "interval": 5
}

| Field | Description |
| --- | --- |
| device_code | Secret code for polling; never show this to the user |
| user_code | Short code the user enters in their browser (format: XXXX-XXXX) |
| verification_uri | Base URL for manual code entry |
| verification_uri_complete | URL with code pre-filled; open this to skip manual entry (preferred) |
| expires_in | Seconds until the code expires (default: 900 = 15 minutes) |
| interval | Minimum seconds between poll requests |
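
As a minimal sketch of how an agent might turn this response into the message it shows the user: the function name `approval_prompt` is hypothetical, while the field names come from the table above. It prefers the pre-filled URL and never echoes `device_code`.

```python
def approval_prompt(device_response: dict) -> str:
    """Build the text shown to the user; device_code is never included."""
    url = device_response.get("verification_uri_complete")
    if url:
        # Preferred path: the code is pre-filled, so the user only clicks Approve.
        return f"Open {url} and click Approve."
    # Fallback: manual entry of the short user code.
    return (
        f"Open {device_response['verification_uri']} "
        f"and enter the code {device_response['user_code']}."
    )


# Sample values copied from the example response above.
sample = {
    "device_code": "a7f3c9d2e1b8...",
    "user_code": "HXKP-3MNV",
    "verification_uri": "https://www.deepread.tech/activate",
    "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
    "expires_in": 900,
    "interval": 5,
}
print(approval_prompt(sample))
```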

POST /v1/agent/device/token — Poll for API Key

Auth: None (public endpoint)
Content-Type: application/json

{"device_code": "a7f3c9d2e1b8..."}

Poll this endpoint every interval seconds after the user has been shown the code.

Responses:

| Scenario | error field | api_key field | Action |
| --- | --- | --- | --- |
| User hasn't acted yet | "authorization_pending" | null | Wait interval seconds, poll again |
| User approved | null | "sk_live_..." | Save the key, stop polling |
| User denied | "access_denied" | null | Stop polling, inform user |
| Code expired | "expired_token" | null | Start over with a new device code |

The response always includes all three fields (error, api_key, key_prefix). Check api_key != null to detect success — don't rely on key presence alone.

Important:

  • The api_key is returned exactly once. After you retrieve it, the server clears it. Store it immediately.
  • The key_prefix is a non-secret identifier for the key (useful for display/logging).
  • Never show device_code or api_key to the user.
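
The response handling described above can be sketched as a small pure function. The name `next_action` and its return labels are hypothetical; the error strings are the ones listed in the scenarios. An `'unknown'` branch is included as an assumption, for any error value not documented here.

```python
def next_action(token_response: dict) -> str:
    """Map a /v1/agent/device/token body to a polling decision.

    Returns one of: 'save_key', 'wait', 'denied', 'expired', 'unknown'.
    """
    # Success is a non-null api_key; the key is always present in the body.
    if token_response.get("api_key") is not None:
        return "save_key"
    error = token_response.get("error")
    if error == "authorization_pending":
        return "wait"
    if error == "access_denied":
        return "denied"
    if error == "expired_token":
        return "expired"
    # Assumption: anything else is treated as a reason to stop and report.
    return "unknown"
```

A polling loop would sleep for `interval` seconds on `'wait'`, store the key immediately on `'save_key'`, and stop in every other case.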

What happens on the user's side (you don't need to call these):

  • User opens verification_uri_complete — the code is pre-filled, no typing needed
  • User logs in (or signs up + confirms email for new users)
  • User sees your agent name and clicks Approve → redirected to dashboard
  • Once approved, the next poll to /v1/agent/device/token returns the api_key

Processing

POST /v1/process — Submit a Document

Uploads a document for async processing. Returns immediately with a job ID.

Auth: X-API-Key: YOUR_KEY
Content-Type: multipart/form-data

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | File | Yes | | PDF, PNG, JPG, or JPEG |
| pipeline | string | No | "standard" | "standard" or "searchable" |
| schema | string | No | | JSON Schema for structured extraction |
| blueprint_id | string | No | | Blueprint UUID (mutually exclusive with schema) |
| include_images | string | No | "true" | Generate preview images and page data |
| include_pages | string | No | "false" | Per-page breakdown (auto-enabled when include_images=true) |
| webhook_url | string | No | | HTTPS URL to notify on completion |
| version | string | No | | Pipeline version for reproducibility |

Note: Provide schema OR blueprint_id, not both. Without either, only OCR text is returned.
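
One way to enforce these parameter rules on the client before uploading is sketched below. `build_process_fields` is a hypothetical helper, not part of the API; it only assembles the non-file form fields and rejects combinations the table above rules out.

```python
import json
from typing import Optional


def build_process_fields(
    schema: Optional[dict] = None,
    blueprint_id: Optional[str] = None,
    pipeline: str = "standard",
    webhook_url: Optional[str] = None,
) -> dict:
    """Return the non-file form fields for POST /v1/process."""
    if schema is not None and blueprint_id is not None:
        raise ValueError("Provide schema or blueprint_id, not both")
    if pipeline not in ("standard", "searchable"):
        raise ValueError(f"Unknown pipeline: {pipeline!r}")
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must be an HTTPS URL")

    fields = {"pipeline": pipeline}
    if schema is not None:
        # The schema parameter is a string, so the JSON Schema is serialized.
        fields["schema"] = json.dumps(schema)
    if blueprint_id is not None:
        fields["blueprint_id"] = blueprint_id
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields
```

With the requests library, the returned dict could be passed as `data=` alongside `files={"file": f}`.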

Response (200 OK):

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

Errors:

| Status | Meaning |
| --- | --- |
| 400 | Invalid schema, unsupported file type, both schema and blueprint_id provided |
| 401 | Invalid or missing API key |
| 413 | File exceeds plan limit (15MB free, 50MB paid) |
| 429 | Monthly page quota exceeded or rate limit hit |

GET /v1/jobs/{job_id} — Get Results

Poll until status is completed or failed. Recommended: wait 5s, then poll every 5-10s with exponential backoff, max 5 minutes.

Auth: X-API-Key: YOUR_KEY
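
The polling recommendation above can be expressed as a delay schedule. This is a sketch under assumptions: the generator name `poll_delays` and the growth factor of 1.5 are illustrative choices, while the 5 second start, the 10 second ceiling, and the 5 minute budget follow the recommendation.

```python
from typing import Iterator


def poll_delays(
    initial: float = 5.0,
    factor: float = 1.5,
    cap: float = 10.0,
    max_total: float = 300.0,
) -> Iterator[float]:
    """Yield sleep durations until the next one would exceed max_total."""
    delay, elapsed = initial, 0.0
    while elapsed + delay <= max_total:
        yield delay
        elapsed += delay
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)
```

A caller would iterate over `poll_delays()`, sleep for each value, fetch the job, and stop early once the status is completed or failed. If the generator is exhausted first, the job did not finish within the time budget.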

Response (completed):

{
  "id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "completed_at": "2025-01-18T10:32:15Z",
  "result": {
    "text": "Full extracted text in markdown",
    "text_preview": "First 500 characters...",
    "text_url": "https://...",
    "data": {
      "vendor": {"value": "Acme Inc", "hil_flag": false, "found_on_page": 1},
      "total": {"value": 1250.00, "hil_flag": true, "reason": "Outside typical range", "found_on_page": 1}
    },
    "pages": [
      {
        "page_number": 1,
        "text": "Page 1 text...",
        "hil_flag": false,
        "review_reason": null,
        "data": {}
      }
    ]
  },
  "metadata": {
    "page_count": 3,
    "pipeline": "standard",
    "review_percentage": 5.0,
    "fields_requiring_review": 1,
    "total_fields": 20,
    "step_timings": {}
  },
  "preview_url": "https://preview.deepread.tech/token123...",
  "webhook_url": "https://yourapp.com/webhook",
  "webhook_delivered": true
}

Notes:

  • text_url is provided when full text exceeds 1MB — fetch from this URL instead
  • text_preview is always the first 500 characters
  • data is only present if schema or blueprint_id was provided
  • pages is present when include_pages=true or include_images=true
  • preview_url is a shareable link (no auth needed) to the HIL review interface
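
A sketch of reading a completed job according to the notes above. Both helper names are hypothetical; the field names (`text`, `text_url`, `data`, `hil_flag`, `reason`) are taken from the example response.

```python
from typing import Tuple


def text_source(job: dict) -> Tuple[str, str]:
    """Return ('url', text_url) when a text_url is given, else ('inline', text)."""
    result = job.get("result") or {}
    if result.get("text_url"):
        # Large documents: the full text should be fetched from this URL.
        return ("url", result["text_url"])
    return ("inline", result.get("text", ""))


def fields_needing_review(job: dict) -> dict:
    """Map each field flagged with hil_flag to its review reason (may be None)."""
    # data is absent when neither schema nor blueprint_id was provided.
    data = (job.get("result") or {}).get("data") or {}
    return {
        name: info.get("reason")
        for name, info in data.items()
        if info.get("hil_flag")
    }
```

Flagged fields can then be routed to human review, with the rest accepted automatically.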

Response (failed):

{
  "id": "550e8400-...",
  "status": "failed",
  "error": "PDF parsing failed: file may be corrupted"
}

Statuses: queued → processing → completed or failed


GET /v1/preview/{token} — Public Preview (No Auth)

Returns document preview data. Anyone with the token can view — no API key needed. Use for sharing results with stakeholders.

{
  "file_name": "invoice.pdf",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "pages": [
    {
      "page_number": 1,
      "image_url": "https://...",
      "text": "Page text...",
      "hil_flag": false,
      "data": {}
    }
  ],
  "data": {},
  "metadata": {"page_count": 1, "pipeline": "standard", "review_percentage": 0}
}

GET /v1/pipelines — List Pipelines (No Auth)

  • standard — Multi-model consensus (GPT + Gemini), dual OCR with LLM judge, ~2-3 minutes
  • searchable — Creates searchable PDF with embedded OCR text layer, ~3-4 minutes

Blueprints & Optimizer

Blueprints are optimized, versioned schemas. The optimizer takes your sample documents + expected values and enhances field descriptions for 20-30% accuracy improvement.

GET /v1/blueprints/ — List Blueprints

Auth: X-API-Key: YOUR_KEY

Returns all blueprints with active version and accuracy metrics.

GET /v1/blueprints/{blueprint_id} — Get Blueprint Details

Auth: X-API-Key: YOUR_KEY

Returns blueprint with all versions, active version schema, and accuracy metrics.

POST /v1/optimize — Start Optimization

Auth: X-API-Key: YOUR_KEY

{
  "name": "utility_invoice",
  "description": "Utility bill extraction",
  "document_type": "invoice",
  "initial_schema": {"type": "object", "properties": {...}},
  "training_documents": ["path1.pdf", "path2.pdf"],
  "ground_truth_data": [{"vendor": "Electric Co", "total": 150.00}, ...],
  "target_accuracy": 95.0,
  "max_iterations": 5,
  "max_cost_usd": 10.0
}
  • initial_schema is optional — auto-generated from ground truth if omitted
  • Minimum 2 training documents
  • validation_split (default 0.3) — fraction held out for validation

Response:

{
  "job_id": "...",
  "blueprint_id": "...",
  "status": "pending"
}
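
The request constraints above could be checked client-side before calling the optimizer. In this sketch `build_optimize_request` is a hypothetical helper. The minimum of two training documents comes from the text; the check that there is one ground-truth record per document is an assumption, not something the API is documented to require.

```python
from typing import Optional


def build_optimize_request(
    name: str,
    training_documents: list,
    ground_truth_data: list,
    initial_schema: Optional[dict] = None,
    **options,
) -> dict:
    """Assemble a JSON body for POST /v1/optimize."""
    if len(training_documents) < 2:
        raise ValueError("At least 2 training documents are required")
    # Assumption: ground truth is supplied as one record per document.
    if len(ground_truth_data) != len(training_documents):
        raise ValueError("Expected one ground-truth record per document")

    payload = {
        "name": name,
        "training_documents": list(training_documents),
        "ground_truth_data": list(ground_truth_data),
    }
    if initial_schema is not None:
        # Optional: auto-generated from ground truth when omitted.
        payload["initial_schema"] = initial_schema
    # Pass-through for fields such as target_accuracy or validation_split.
    payload.update(options)
    return payload
```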

POST /v1/optimize/resume — Resume Optimization

Resume a failed job or start a new optimization run for an existing blueprint.

GET /v1/blueprints/jobs/{job_id} — Optimization Job Status

Auth: X-API-Key: YOUR_KEY

{
  "status": "running",
  "iteration": 2,
  "baseline_accuracy": 68.0,
  "current_accuracy": 88.0,
  "target_accuracy": 95.0,
  "total_cost": 1.82,
  "max_cost_usd": 10.0
}

Statuses: pending → initializing → running → completed, failed, or cancelled

GET /v1/blueprints/jobs/{job_id}/schema — Get Optimized Schema

Returns the optimized JSON schema after optimization completes.

Using a Blueprint

curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

Webhooks

Pass webhook_url when submitting a document to get notified on completion.

Payload sent to your URL:

{
  "event": "job.completed",
  "job_id": "550e8400-...",
  "status": "completed",
  "result": {"text": "...", "data": {}},
  "metadata": {},
  "preview_url": "https://preview.deepread.tech/..."
}

Important:

  • Webhooks are NOT authenticated — always fetch the canonical result via GET /v1/jobs/{job_id} with your API key
  • Must be HTTPS
  • Return 2xx to confirm delivery
  • Delivery is best-effort — use polling as fallback if webhook not received
  • Make your endpoint idempotent (may receive duplicates)
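
Idempotent handling of duplicate deliveries might look like the following sketch. `WebhookDeduper` is a hypothetical class, and the in-memory set is an assumption made for brevity; a real deployment would more likely use a database or cache so that duplicates are recognized across restarts.

```python
class WebhookDeduper:
    """Remember which (event, job_id) pairs have already been handled."""

    def __init__(self) -> None:
        self._seen = set()

    def should_process(self, payload: dict) -> bool:
        """Return True the first time a delivery is seen, False for repeats."""
        key = (payload.get("event"), payload.get("job_id"))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A receiver would call `should_process(payload)` first and, when it returns True, fetch the canonical result from GET /v1/jobs/{job_id}. It should still return a 2xx response for repeats so that delivery is confirmed.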

Rate Limits

Every response includes these headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Monthly pages in your plan |
| X-RateLimit-Remaining | Pages remaining this cycle |
| X-RateLimit-Used | Pages used this cycle |
| X-RateLimit-Reset | Unix timestamp when quota resets |
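
A minimal sketch of reading these headers into a typed value. The `Quota` dataclass and `parse_quota` function are hypothetical names; the header names are the ones listed above, and the values are assumed to be plain integers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class Quota:
    limit: int
    remaining: int
    used: int
    resets_at: datetime

    @property
    def fraction_used(self) -> float:
        """Share of the monthly page quota consumed so far."""
        return self.used / self.limit if self.limit else 0.0


def parse_quota(headers: Mapping[str, str]) -> Quota:
    """Parse the X-RateLimit-* headers of a response."""
    return Quota(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        used=int(headers["X-RateLimit-Used"]),
        # The reset header is a Unix timestamp; convert it to aware UTC time.
        resets_at=datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    )
```

With the requests library, `parse_quota(resp.headers)` works because its header mapping is case-insensitive; a plain dict needs the exact header casing shown here.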

Plans:

| Plan | Pages/month | Max file | Per-doc limit | Rate limit |
| --- | --- | --- | --- | --- |
| Free | 2,000 | 15 MB | 50 pages | 10 req/min |
| Pro ($99/mo) | 50,000 | 50 MB | Unlimited | 100 req/min |
| Scale | 1,000,000 | 50 MB | Unlimited | 500 req/min |

Error Handling

All errors return:

{"detail": "Human-readable error message"}

| Status | Meaning |
| --- | --- |
| 400 | Bad request: invalid schema, unsupported file, both schema + blueprint_id |
| 401 | Invalid or missing API key |
| 404 | Job not found |
| 413 | File too large for your plan |
| 429 | Rate limit or monthly quota exceeded |
| 500 | Server error |

Quota or page limit exceeded (429). Here detail is a structured object rather than a plain string:

{
  "detail": {
    "error": "page_count_exceeded",
    "message": "Document has 100 pages, exceeds 50-page limit for FREE plan. Upgrade to PRO.",
    "page_count": 100,
    "max_pages": 50,
    "plan": "free"
  }
}
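
Because `detail` can be either a plain string or a structured object, client code may want to normalize it. The helper below is a sketch; `error_message` is a hypothetical name, and the fallback text is an assumption for bodies that match neither shape.

```python
def error_message(body: dict) -> str:
    """Extract a human-readable message from an error response body."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Structured form, as in the 429 example: prefer message, then error.
        return detail.get("message") or detail.get("error") or "Unknown error"
    if isinstance(detail, str):
        return detail
    return "Unknown error"
```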

Common failure reasons in jobs:

  • Document issues: corrupted, unreadable, poor scan quality, processing timeout
  • Schema issues: invalid JSON Schema, required fields not found
  • Plan limits: file too large, too many pages, quota exceeded

Code Examples

Python

import requests
import time
import json

API_KEY = "sk_live_YOUR_KEY"
BASE = "https://api.deepread.tech"

# Submit document with structured extraction
schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string", "description": "Vendor or company name"},
        "total": {"type": "number", "description": "Total amount due"},
        "due_date": {"type": "string", "description": "Payment due date"}
    }
}

with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": API_KEY},
        files={"file": f},
        data={"schema": json.dumps(schema)}
    )
job_id = resp.json()["id"]

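# Poll with exponential backoff, giving up after the recommended 5 minutes
delay = 5
deadline = time.monotonic() + 300
while True:
    time.sleep(delay)
    result = requests.get(
        f"{BASE}/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()

    if result["status"] in ("completed", "failed"):
        break
    if time.monotonic() > deadline:
        raise TimeoutError("Job did not finish within 5 minutes")
    delay = min(delay * 1.5, 30)  # cap at 30s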

# Use results
if result["status"] == "completed":
    text = result["result"]["text"]
    data = result["result"].get("data", {})
    for field, info in data.items():
        if info["hil_flag"]:
            print(f"REVIEW: {field} = {info['value']} ({info.get('reason')})")
        else:
            print(f"OK: {field} = {info['value']}")

JavaScript / Node.js

import fs from "fs";

const API_KEY = "sk_live_YOUR_KEY";
const BASE = "https://api.deepread.tech";

// Submit document
const form = new FormData();
// The built-in FormData (Node 18+) needs a Blob, not a read stream
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
form.append("schema", JSON.stringify({
  type: "object",
  properties: {
    vendor: { type: "string", description: "Vendor or company name" },
    total: { type: "number", description: "Total amount due" }
  }
}));

const { id: jobId } = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": API_KEY },
  body: form
}).then(r => r.json());

// Poll with backoff
let delay = 5000;
let result;
do {
  await new Promise(r => setTimeout(r, delay));
  result = await fetch(`${BASE}/v1/jobs/${jobId}`, {
    headers: { "X-API-Key": API_KEY }
  }).then(r => r.json());
  delay = Math.min(delay * 1.5, 30000);
} while (!["completed", "failed"].includes(result.status));

console.log(result);

cURL

# Submit with schema
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string","description":"Vendor name"},"total":{"type":"number","description":"Total amount"}}}'

# Submit with blueprint
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

# Get results
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: YOUR_KEY"

# List blueprints
curl https://api.deepread.tech/v1/blueprints/ \
  -H "X-API-Key: YOUR_KEY"

Agent Device Flow (Python)

import requests
import time
import webbrowser

BASE = "https://api.deepread.tech"

# Step 1: Request a device code
resp = requests.post(f"{BASE}/v1/agent/device/code", json={"agent_name": "my-agent"})
data = resp.json()
device_code = data["device_code"]
uri_complete = data["verification_uri_complete"]
interval = data["interval"]

# Step 2: Open browser with code pre-filled
success = webbrowser.open(uri_complete)
if success:
    print(f"Opened browser: {uri_complete}")
else:
    print(f"Unable to open browser programmatically; please open this URL manually: {uri_complete}")
print("Log in and click Approve. I'll wait here.")

# Step 3: Poll until approved
api_key = None
while True:
    time.sleep(interval)
    resp = requests.post(f"{BASE}/v1/agent/device/token", json={"device_code": device_code})
    result = resp.json()

    if result.get("api_key"):
        api_key = result["api_key"]
        print(f"Got API key: {result['key_prefix']}...")
        break
    elif result.get("error") == "authorization_pending":
        continue
    elif result.get("error") == "access_denied":
        print("User denied the request.")
        break
    elif result.get("error") == "expired_token":
        print("Code expired. Please start over.")
        break

if api_key is None:
    raise SystemExit("Device flow did not complete successfully — no API key obtained.")

# Step 4: Use the key to process documents
with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": api_key},
        files={"file": f},
    )
print(resp.json())  # {"id": "...", "status": "queued"}

Agent Device Flow (JavaScript)

import fs from "fs"; // ES module import; top-level await below needs an ES module
const BASE = "https://api.deepread.tech";

// Step 1: Request a device code
const { device_code, verification_uri_complete, interval } = await fetch(
  `${BASE}/v1/agent/device/code`,
  { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ agent_name: "my-agent" }) }
).then(r => r.json());

// Step 2: Open browser with code pre-filled
console.log(`Please open: ${verification_uri_complete}`);
console.log("Log in and click Approve. I'll wait here.");

// Step 3: Poll until approved
let apiKey;
while (true) {
  await new Promise(r => setTimeout(r, interval * 1000));
  const result = await fetch(`${BASE}/v1/agent/device/token`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ device_code }),
  }).then(r => r.json());

  if (result.api_key) {
    apiKey = result.api_key;
    console.log(`Got API key: ${result.key_prefix}...`);
    break;
  } else if (result.error === "authorization_pending") {
    continue;
  } else {
    console.log(`Flow ended: ${result.error}`);
    break;
  }
}

if (!apiKey) {
  throw new Error("Device flow did not complete successfully — no API key obtained.");
}

// Step 4: Use the key
const form = new FormData();
// The built-in FormData (Node 18+) needs a Blob, not a read stream
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
const job = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": apiKey },
  body: form,
}).then(r => r.json());
console.log(job); // {id: "...", status: "queued"}

Agent Device Flow (cURL)

# Step 1: Request a device code — save the full response
response=$(curl -s -X POST https://api.deepread.tech/v1/agent/device/code \
  -H "Content-Type: application/json" \
  -d '{"agent_name": "my-agent"}')
device_code=$(echo "$response" | jq -r '.device_code')
verification_uri_complete=$(echo "$response" | jq -r '.verification_uri_complete')
interval=$(echo "$response" | jq -r '.interval')

# Step 2: Open the browser (use the saved URL — code is pre-filled, user clicks Approve)
open "$verification_uri_complete"  # macOS / xdg-open on Linux

# Step 3: Poll for the key (repeat every $interval seconds until api_key is returned)
curl -s -X POST https://api.deepread.tech/v1/agent/device/token \
  -H "Content-Type: application/json" \
  -d "{\"device_code\": \"$device_code\"}"
# → {"error": "authorization_pending"}  (keep polling)
# → {"api_key": "sk_live_...", "key_prefix": "sk_live_abc..."}  (done!)

# Step 4: Use the key
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: sk_live_..." \
  -F "file=@invoice.pdf"

Webhook Receiver (Python / Flask)

from flask import Flask, request
import requests

app = Flask(__name__)
API_KEY = "sk_live_YOUR_KEY"

@app.route("/webhook", methods=["POST"])
def handle_webhook():
    payload = request.json
    job_id = payload["job_id"]

    # IMPORTANT: Always fetch canonical result from API (webhooks are not authenticated)
    result = requests.get(
        f"https://api.deepread.tech/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()

    # Process result...
    return "", 200  # Return 2xx to confirm delivery

Help the Developer

  • No API key yet → use the device authorization flow (Agent Authentication section) — no copy/paste needed
  • Send a document → POST /v1/process, show code in their language
  • Structured data → help write a JSON Schema with descriptive field descriptions
  • Better accuracy → explain blueprints, help set up optimizer
  • Real-time updates → set up webhook_url, build receiver endpoint
  • Hitting errors → check API key, plan limits, file format, schema validity
  • Share results → use preview_url from response (no auth needed)
  • Large documents → use text_url instead of text field for docs > 1MB
  • Review workflow → filter fields by hil_flag, route flagged ones to human review
