Reducto
Access the Reducto document processing API with managed API key authentication. Parse documents, extract structured data, split documents into sections, and edit PDF/DOCX files.
Quick Start
# Parse a document
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'document_url': 'https://example.com/document.pdf'}).encode()
req = urllib.request.Request('https://api.maton.ai/reducto/parse', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Base URL
https://api.maton.ai/reducto/{native-api-path}
Maton proxies requests to platform.reducto.ai and automatically injects your Reducto API key.
Authentication
All requests require the Maton API key in the Authorization header:
Authorization: Bearer $MATON_API_KEY
Environment Variable: Set your API key as MATON_API_KEY:
export MATON_API_KEY="YOUR_API_KEY"
Getting Your API Key
- Sign in or create an account at maton.ai
- Go to maton.ai/settings
- Copy your API key
Connection Management
Manage your Reducto API key connections at https://api.maton.ai.
List Connections
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://api.maton.ai/connections?app=reducto&status=ACTIVE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Create Connection
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'app': 'reducto'}).encode()
req = urllib.request.Request('https://api.maton.ai/connections', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Get Connection
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://api.maton.ai/connections/{connection_id}')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Response:
{
"connection": {
"connection_id": "{connection_id}",
"status": "ACTIVE",
"creation_time": "2026-02-28T00:12:24.797884Z",
"last_updated_time": "2026-02-28T00:16:13.509841Z",
"url": "https://connect.maton.ai/?session_token=...",
"app": "reducto",
"metadata": {},
"method": "API_KEY"
}
}
Open the returned url in a browser to enter your Reducto API key.
Delete Connection
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://api.maton.ai/connections/{connection_id}', method='DELETE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Specifying Connection
If you have multiple Reducto connections, specify which one to use with the Maton-Connection header:
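python <<'EOF'
import urllib.request, os, json
# /reducto/parse expects a POST with a JSON body
data = json.dumps({'document_url': 'https://example.com/document.pdf'}).encode()
req = urllib.request.Request('https://api.maton.ai/reducto/parse', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
req.add_header('Maton-Connection', '{connection_id}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF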
If you have multiple connections, always include this header to ensure requests go to the intended account.
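To avoid repeating the header logic in every call, it can live in one small function. The sketch below is illustrative only: `maton_headers` and its parameters are hypothetical names, not part of any Maton SDK. Only the Authorization, Content-Type, and Maton-Connection header names come from this document.

```python
from typing import Optional


def maton_headers(api_key: str, connection_id: Optional[str] = None) -> dict:
    """Build request headers (hypothetical helper, not a Maton SDK function)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Only send Maton-Connection when a specific connection was chosen.
    if connection_id:
        headers["Maton-Connection"] = connection_id
    return headers


# Placeholder values; substitute your own key and connection ID.
headers = maton_headers("YOUR_API_KEY", connection_id="your-connection-id")
print(sorted(headers))
```

The returned dictionary can be passed as `headers=` to `requests`, or applied one entry at a time with `Request.add_header`.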
Security & Permissions
- Access is scoped to document parsing, extraction, and structured data output within the connected Reducto account.
- All write operations require explicit user approval. Before executing any create, update, or delete call, confirm the target resource and intended effect with the user.
API Reference
Parse Document
Parse a document and extract structured content (text, tables, figures).
Synchronous Parse
POST /reducto/parse
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf"
}
Response:
{
"job_id": "04b8aa38-7eb3-4151-98b0-dbaea71358d9",
"duration": 17.85,
"pdf_url": "https://...",
"studio_link": "https://studio.reducto.ai/job/...",
"usage": {
"num_pages": 15,
"credits": 15.0
},
"result": {
"chunks": [
{
"content": "Extracted text content...",
"blocks": [...]
}
]
}
}
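To turn a parse response into plain text, one option is to join the content of each chunk. This is a minimal sketch that assumes the response shape shown above (result.chunks[].content). The name `chunks_to_text` is illustrative and the sample dictionary contains invented values.

```python
def chunks_to_text(parse_response: dict, separator: str = "\n\n") -> str:
    """Join the `content` of every chunk in a parse response (illustrative helper)."""
    chunks = (parse_response.get("result") or {}).get("chunks", [])
    # Skip chunks that have no content rather than failing on them.
    return separator.join(c["content"] for c in chunks if c.get("content"))


# Sample data shaped like the response above (values are invented).
sample = {
    "job_id": "example",
    "result": {
        "chunks": [
            {"content": "First chunk.", "blocks": []},
            {"content": "Second chunk.", "blocks": []},
        ]
    },
}
text = chunks_to_text(sample)
print(text)
```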
Asynchronous Parse
For long documents, use the asynchronous endpoint to avoid timeouts:
POST /reducto/parse_async
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf"
}
Response:
{
"job_id": "e234ba95-410a-4dd0-8a14-743dbfc49470"
}
Poll the job status with GET /reducto/job/{job_id}.
Extract Data
Extract specific fields from documents using a JSON schema.
Synchronous Extract
POST /reducto/extract
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf",
"schema": {
"type": "object",
"properties": {
"title": {"type": "string", "description": "The document title"},
"authors": {"type": "array", "items": {"type": "string"}, "description": "List of author names"}
}
}
}
Response:
{
"job_id": "36f01a34-7ef6-40da-9e74-7c14902b6182",
"usage": {
"num_pages": 15,
"num_fields": 9,
"credits": 45.0
},
"studio_link": "https://studio.reducto.ai/job/...",
"result": [
{
"title": "Document Title",
"authors": ["Author One", "Author Two"]
}
]
}
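Because the schema is plain JSON, the request body can be assembled programmatically. The sketch below builds it from a mapping of field names to descriptions. `build_extract_body` is a hypothetical helper, and it covers only string fields; array or nested fields such as `authors` above would need to be written out by hand.

```python
import json


def build_extract_body(document_url: str, fields: dict) -> dict:
    """Build an /extract request body from {field_name: description} (string fields only)."""
    properties = {
        name: {"type": "string", "description": description}
        for name, description in fields.items()
    }
    return {
        "document_url": document_url,
        "schema": {"type": "object", "properties": properties},
    }


body = build_extract_body(
    "https://example.com/document.pdf",
    {"title": "The document title", "date": "Publication date"},
)
print(json.dumps(body, indent=2))
```

Note that in the sample response above, `result` is a list, so the extracted fields would be read from `result[0]`.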
Asynchronous Extract
POST /reducto/extract_async
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf",
"schema": {
"type": "object",
"properties": {
"title": {"type": "string"}
}
}
}
Response:
{
"job_id": "0cdb6a50-df92-438b-875b-8b5c72d5b089"
}
Split Document
Divide documents into logical sections based on content categories.
Synchronous Split
POST /reducto/split
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf",
"split_description": [
{"name": "abstract", "description": "The abstract section"},
{"name": "introduction", "description": "The introduction section"},
{"name": "conclusion", "description": "The conclusion section"}
]
}
Response:
{
"usage": {
"num_pages": 15,
"credits": 15.0
},
"result": {
"section_mapping": {
"abstract": [1],
"introduction": [1, 2],
"conclusion": [14, 15]
},
"splits": [
{
"name": "abstract",
"pages": [1],
"conf": "high"
}
]
}
}
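The section_mapping object maps each section name to its page numbers. When the question is instead "which sections does page N belong to", the mapping can be inverted. The sketch below assumes the response shape shown above; `pages_to_sections` is an illustrative name and the sample data mirrors the example response.

```python
from collections import defaultdict


def pages_to_sections(split_response: dict) -> dict:
    """Invert section_mapping into {page_number: [section names]} (illustrative helper)."""
    mapping = (split_response.get("result") or {}).get("section_mapping", {})
    by_page = defaultdict(list)
    for section, pages in mapping.items():
        for page in pages:
            by_page[page].append(section)
    # Return a plain dict ordered by page number.
    return dict(sorted(by_page.items()))


sample = {
    "result": {
        "section_mapping": {
            "abstract": [1],
            "introduction": [1, 2],
            "conclusion": [14, 15],
        }
    }
}
lookup = pages_to_sections(sample)
print(lookup)
```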
Asynchronous Split
POST /reducto/split_async
Content-Type: application/json
{
"document_url": "https://example.com/document.pdf",
"split_description": [
{"name": "abstract", "description": "The abstract section"}
]
}
Response:
{
"job_id": "381de5fe-e162-4039-9ef9-8522fb34056b"
}
Edit Document
Fill forms and modify PDF/DOCX documents with natural language instructions.
Synchronous Edit
POST /reducto/edit
Content-Type: application/json
{
"document_url": "https://example.com/form.pdf",
"edit_instructions": "Fill in the name field with 'John Doe' and check the consent box"
}
Response:
{
"document_url": "https://presigned-url.s3.amazonaws.com/...",
"form_schema": [...],
"usage": {
"num_pages": 2,
"credits": 2.0
}
}
Asynchronous Edit
POST /reducto/edit_async
Content-Type: application/json
{
"document_url": "https://example.com/form.pdf",
"edit_instructions": "Highlight all mentions of 'important' in red"
}
Response:
{
"job_id": "575189cb-8732-429a-ba8a-06de8ee03208"
}
Upload File
Upload a document to Reducto and get a presigned URL for processing.
POST /reducto/upload
Content-Type: application/json
{}
Response:
{
"file_id": "reducto://18d574c7-4144-4f50-b7af-b8aba83ada5d",
"presigned_url": "https://prod-storage.s3.amazonaws.com/...?AWSAccessKeyId=...&Signature=...&Expires=..."
}
Upload your file to the presigned_url using a PUT request, then use the file_id as document_url in parse/extract/split/edit requests.
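The two-step upload described above could be sketched as follows. The function names are hypothetical. It is also an assumption that the presigned URL accepts a plain PUT of the raw file bytes: this document does not say which headers the presigned URL expects, and urllib adds a default Content-Type to any request with a body, so adjust the PUT request if the upload is rejected.

```python
import json
import urllib.request

BASE_URL = "https://api.maton.ai/reducto"


def build_upload_request(api_key: str) -> urllib.request.Request:
    """Step 1: POST /reducto/upload with an empty JSON body."""
    req = urllib.request.Request(f"{BASE_URL}/upload", data=b"{}", method="POST")
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    return req


def build_put_request(presigned_url: str, file_bytes: bytes) -> urllib.request.Request:
    """Step 2: PUT the raw file bytes to the presigned URL (assumed: no Maton auth header)."""
    return urllib.request.Request(presigned_url, data=file_bytes, method="PUT")


def upload_file(api_key: str, path: str) -> str:
    """Run both steps and return the file_id to pass as document_url."""
    with urllib.request.urlopen(build_upload_request(api_key)) as resp:
        upload = json.load(resp)
    with open(path, "rb") as fh:
        put_req = build_put_request(upload["presigned_url"], fh.read())
    # Upload right away, before the presigned URL expires.
    with urllib.request.urlopen(put_req):
        pass
    return upload["file_id"]
```

A call such as `upload_file(os.environ["MATON_API_KEY"], "form.pdf")` would then return a `reducto://...` value to use as `document_url`.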
Pipeline
Execute pre-configured processing pipelines.
POST /reducto/pipeline
Content-Type: application/json
{
"input": "https://example.com/document.pdf",
"pipeline_id": "your-pipeline-id"
}
Note: pipeline_id must be a valid pipeline ID configured in your Reducto account through Reducto Studio.
Response:
{
"job_id": "...",
"usage": {
"num_pages": 15,
"credits": 15.0
},
"result": {
"parse": {...},
"extract": {...},
"split": {...},
"edit": {...}
}
}
Jobs
List Jobs
GET /reducto/jobs
Response:
{
"jobs": [
{
"job_id": "8c25561f-247a-4843-b561-1eb94c3792d1",
"status": "Completed",
"type": "Parse",
"created_at": "2026-02-27T23:11:39.787917",
"num_pages": 15,
"duration": 6.62
}
],
"next_cursor": null
}
Get Job Status
GET /reducto/job/{job_id}
Response (Pending):
{
"status": "Pending",
"result": null,
"progress": 0.5,
"reason": null
}
Response (Completed):
{
"status": "Completed",
"result": {
"job_id": "...",
"duration": 17.85,
"usage": {...},
"result": {...}
},
"progress": null,
"reason": null
}
Job status values: Pending, InProgress, Completed, Failed
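Of these four values, Completed and Failed are the ones a polling loop should stop on. The sketch below wraps that check in a helper with an overall timeout. `wait_for_job`, its parameters, and the default timeout are illustrative choices, not values recommended by Reducto. The status fetcher is passed in as a function so the example runs without network access; in real use it would wrap GET /reducto/job/{job_id}.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"Completed", "Failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until the job is Completed or Failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status['status']} after {timeout}s")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls.
responses = iter([
    {"status": "Pending"},
    {"status": "InProgress"},
    {"status": "Completed", "result": {"job_id": "example"}},
])
final = wait_for_job(lambda: next(responses), sleep=lambda _: None)
print(final["status"])
```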
Version
GET /reducto/version
Response:
"VERSION_GOES_HERE"
Document URL Formats
The document_url parameter accepts several formats:
- Public URL: https://example.com/document.pdf
- Presigned S3 URL: https://bucket.s3.amazonaws.com/key?...
- Reducto Upload: reducto://file-id (from the /upload endpoint)
- Previous Job: jobid://job-id (reuse parsed content from a previous job)
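A client can check which of these formats a value uses before sending it. This is a minimal sketch: `document_url_kind` and its return labels are made up for illustration, public and presigned URLs are treated as one kind because they cannot be told apart reliably by prefix, and accepting plain http:// is an assumption (the examples here use https://).

```python
def document_url_kind(document_url: str) -> str:
    """Classify a document_url by its prefix (illustrative helper)."""
    if document_url.startswith("reducto://"):
        return "reducto_upload"  # file_id returned by /upload
    if document_url.startswith("jobid://"):
        return "previous_job"  # reuse content parsed by an earlier job
    # Assumption: both http and https URLs are accepted.
    if document_url.startswith(("http://", "https://")):
        return "url"  # public or presigned URL
    raise ValueError(f"unrecognized document_url format: {document_url!r}")


for value in (
    "https://example.com/document.pdf",
    "reducto://file-id",
    "jobid://job-id",
):
    print(value, "->", document_url_kind(value))
```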
Code Examples
JavaScript
// Parse a document
const response = await fetch(
'https://api.maton.ai/reducto/parse',
{
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.MATON_API_KEY}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
document_url: 'https://example.com/document.pdf'
})
}
);
const data = await response.json();
console.log(data.result.chunks);
Python
import os
import requests
# Extract data from a document
response = requests.post(
'https://api.maton.ai/reducto/extract',
headers={
'Authorization': f'Bearer {os.environ["MATON_API_KEY"]}',
'Content-Type': 'application/json'
},
json={
'document_url': 'https://example.com/document.pdf',
'schema': {
'type': 'object',
'properties': {
'title': {'type': 'string'},
'date': {'type': 'string'}
}
}
}
)
result = response.json()
print(result['result'])
Async Job Polling
import os
import time
import requests
# Start async parse
response = requests.post(
'https://api.maton.ai/reducto/parse_async',
headers={
'Authorization': f'Bearer {os.environ["MATON_API_KEY"]}',
'Content-Type': 'application/json'
},
json={'document_url': 'https://example.com/large-document.pdf'}
)
job_id = response.json()['job_id']
# Poll for completion
while True:
    status = requests.get(
        f'https://api.maton.ai/reducto/job/{job_id}',
        headers={'Authorization': f'Bearer {os.environ["MATON_API_KEY"]}'}
    ).json()
    print(f"Status: {status['status']}, Progress: {status.get('progress')}")
    if status['status'] == 'Completed':
        print(status['result'])
        break
    elif status['status'] == 'Failed':
        print(f"Failed: {status['reason']}")
        break
    time.sleep(2)
Notes
- Synchronous endpoints may time out for large documents; use async endpoints instead
- Upload presigned URLs expire quickly; upload files immediately after calling /upload
- The reducto:// URLs returned by /upload can be used in subsequent parse/extract/split/edit calls
- Use the jobid:// prefix to reuse parsed content from a previous job (saves processing time)
- Connection uses API_KEY authentication method (not OAuth)
- Credits are consumed based on page count and operation type
Error Handling
| Status | Meaning |
|---|---|
| 400 | Invalid request (missing required fields, invalid format) |
| 401 | Invalid or missing Maton API key |
| 404 | Resource not found |
| 422 | Validation error (check response body for details) |
| 4xx/5xx | Passthrough error from Reducto API |
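With urllib, as used in the examples in this document, a non-2xx response raises `urllib.error.HTTPError`. The sketch below maps the status codes from the table to their meanings and includes the response body in the message. The function names are illustrative, and the error constructed at the end is a local stand-in with an invented body, not real API output.

```python
import io
import urllib.error

STATUS_MEANINGS = {
    400: "Invalid request (missing required fields, invalid format)",
    401: "Invalid or missing Maton API key",
    404: "Resource not found",
    422: "Validation error (check response body for details)",
}


def describe_status(code: int) -> str:
    """Return the meaning from the table above, falling back to the passthrough row."""
    return STATUS_MEANINGS.get(code, "Passthrough error from Reducto API")


def explain_http_error(err: urllib.error.HTTPError) -> str:
    """Format an HTTPError with its table meaning and response body."""
    body = err.read().decode("utf-8", errors="replace")
    return f"{err.code}: {describe_status(err.code)}; body: {body}"


# Stand-in error built locally so the example runs without a network call.
demo = urllib.error.HTTPError(
    "https://api.maton.ai/reducto/parse", 422, "Unprocessable Entity",
    None, io.BytesIO(b'{"detail": "example"}'),
)
message = explain_http_error(demo)
print(message)
```

In a real call, wrap `urllib.request.urlopen(req)` in `try`/`except urllib.error.HTTPError as err` and pass `err` to `explain_http_error`.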
Troubleshooting: API Key Issues
- Check that the MATON_API_KEY environment variable is set:
echo $MATON_API_KEY
- Verify the API key is valid by listing connections:
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://api.maton.ai/connections')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
Troubleshooting: Invalid App Name
Ensure your URL path starts with reducto. For example:
- Correct: https://api.maton.ai/reducto/parse
- Incorrect: https://api.maton.ai/parse
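The same check can be done in code before a request is sent. This is a small sketch using the standard library; `is_reducto_proxy_url` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def is_reducto_proxy_url(url: str) -> bool:
    """True when the URL targets api.maton.ai and its path starts with /reducto."""
    parsed = urlparse(url)
    on_maton = parsed.netloc == "api.maton.ai"
    under_reducto = parsed.path == "/reducto" or parsed.path.startswith("/reducto/")
    return on_maton and under_reducto


print(is_reducto_proxy_url("https://api.maton.ai/reducto/parse"))
print(is_reducto_proxy_url("https://api.maton.ai/parse"))
```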