Schema Cleaner — JSON Schema Normalization

Clean JSON Schemas for optimal LLM tool-calling compatibility across different providers.

Overview

Different LLM providers support different subsets of JSON Schema. This skill:

Provider-Specific Cleaning: Remove keywords unsupported by each provider
Reference Resolution: Inline $ref entries from $defs and definitions
Union Flattening: Convert anyOf/oneOf unions of literals into enum
Nullable Handling: Strip nullable variants from unions and type arrays
Const Conversion: Convert const to single-value enum
Circular Detection: Detect and safely handle circular references
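
As a rough illustration of the union, nullable, and const steps listed above, the sketch below applies them to one schema tree. `simplifyNode` and `isObject` are hypothetical helpers written for this example, not functions exported by schema-cleaner. The sketch only recurses into `properties` and `items`, and the choice to infer `type: 'string'` only when every literal is a string is an assumption; the library's actual behavior may differ.

```javascript
// Hypothetical helpers for illustration; not part of the schema-cleaner API.
const isObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

function simplifyNode(node) {
  if (!isObject(node)) return node;
  let out = { ...node }; // shallow copy so the input is never mutated

  // Const conversion: { const: x } becomes { enum: [x] }.
  if ('const' in out) {
    const { const: value, ...rest } = out;
    out = { ...rest, enum: [value] };
  }

  // Nullable handling: drop 'null' from a type array when other types remain.
  if (Array.isArray(out.type)) {
    const types = out.type.filter((t) => t !== 'null');
    if (types.length > 0) out.type = types.length === 1 ? types[0] : types;
  }

  // Union flattening: drop { type: 'null' } variants, then merge literal-only variants.
  for (const key of ['anyOf', 'oneOf']) {
    if (!Array.isArray(out[key])) continue;
    const variants = out[key]
      .map(simplifyNode)
      .filter((v) => !(isObject(v) && v.type === 'null'));
    const literalOnly = variants.length > 0 && variants.every(
      (v) => isObject(v) && Array.isArray(v.enum) &&
        Object.keys(v).every((k) => k === 'enum' || k === 'type')
    );
    if (literalOnly) {
      delete out[key];
      out.enum = variants.flatMap((v) => v.enum);
      // Assumption: only infer a type when every literal is a string.
      if (out.enum.every((v) => typeof v === 'string')) out.type = 'string';
    } else {
      out[key] = variants;
    }
  }

  // Recurse into nested object properties and single-schema items.
  if (isObject(out.properties)) {
    out.properties = Object.fromEntries(
      Object.entries(out.properties).map(([name, sub]) => [name, simplifyNode(sub)])
    );
  }
  if (isObject(out.items)) out.items = simplifyNode(out.items);
  return out;
}
```

Calling `simplifyNode({ anyOf: [{ const: 'a' }, { const: 'b' }] })` yields a node with `enum: ['a', 'b']` and `type: 'string'`, and a `type` of `['string', 'null']` collapses to `'string'`.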

Provider Compatibility Matrix

| Keyword | Gemini | Anthropic | OpenAI | Description |
| --- | --- | --- | --- | --- |
| $ref | ❌ | ✅ | ✅ | Reference resolution |
| $defs | ❌ | ✅ | ✅ | Schema definitions |
| additionalProperties | ❌ | ✅ | ✅ | Extra properties |
| pattern | ❌ | ✅ | ✅ | Regex validation |
| minLength | ❌ | ✅ | ✅ | Minimum string length |
| maxLength | ❌ | ✅ | ✅ | Maximum string length |
| format | ❌ | ✅ | ✅ | String format |
| minimum | ❌ | ✅ | ✅ | Minimum number |
| maximum | ❌ | ✅ | ✅ | Maximum number |
| examples | ❌ | ✅ | ✅ | Example values |
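
One way provider-specific cleaning could work is a recursive walk that drops the keywords the matrix marks as unsupported. The sketch below is an assumption about the approach, not the library's implementation: `stripKeywords` and `GEMINI_UNSUPPORTED` are hypothetical names, and the keyword list is transcribed from the Gemini column. It treats the keys under `properties` as property names rather than keywords, so a field that happens to be called `format` survives. References should be inlined before stripping, since removing a bare `$ref` would otherwise leave an empty schema.

```javascript
// Keywords the matrix marks as unsupported by Gemini (hypothetical constant).
const GEMINI_UNSUPPORTED = new Set([
  '$ref', '$defs', 'additionalProperties', 'pattern', 'minLength',
  'maxLength', 'format', 'minimum', 'maximum', 'examples'
]);

// Keywords whose value is a list of subschemas.
const SUBSCHEMA_LISTS = ['anyOf', 'oneOf', 'allOf'];

// Hypothetical helper: returns a copy of `node` without the unsupported keywords.
function stripKeywords(node, unsupported) {
  if (node === null || typeof node !== 'object' || Array.isArray(node)) return node;
  const strip = (sub) => stripKeywords(sub, unsupported);
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (unsupported.has(key)) continue;
    if (key === 'properties' && value !== null && typeof value === 'object') {
      // Keys here are property names, so keep every one and clean its schema.
      out[key] = Object.fromEntries(
        Object.entries(value).map(([name, sub]) => [name, strip(sub)])
      );
    } else if (key === 'items') {
      out[key] = Array.isArray(value) ? value.map(strip) : strip(value);
    } else if (SUBSCHEMA_LISTS.includes(key) && Array.isArray(value)) {
      out[key] = value.map(strip);
    } else {
      out[key] = value; // enum, required, description, and so on pass through
    }
  }
  return out;
}
```

Passing a different `Set` gives the same walk for another provider or for a custom removal list.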

API

Clean for Specific Provider

const { cleanSchema } = require('schema-cleaner');

// Clean for Gemini (most restrictive)
const geminiSchema = cleanSchema(dirtySchema, { provider: 'gemini' });

// Clean for Anthropic (moderate)
const anthropicSchema = cleanSchema(dirtySchema, { provider: 'anthropic' });

// Clean for OpenAI (most permissive)
const openaiSchema = cleanSchema(dirtySchema, { provider: 'openai' });

Validate Schema

const { validateSchema } = require('schema-cleaner');

const errors = validateSchema(mySchema);
if (errors.length > 0) {
  console.error('Invalid schema:', errors);
}

Resolve References

const { resolveRefs } = require('schema-cleaner');

const inlineSchema = resolveRefs(schemaWithRefs);
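
To make the idea of reference inlining concrete, here is a minimal sketch of how local `$ref` pointers could be replaced by their targets. `inlineRefs` is a hypothetical function written for this example and is not the library's `resolveRefs`. It assumes pointers of the form `#/$defs/Name` or `#/definitions/Name`, ignores JSON Pointer escaping, leaves unknown references untouched, and replaces a reference that would recurse into itself with an empty (permissive) schema; that last fallback is an assumption about what "safely handle" might mean.

```javascript
// Hypothetical sketch; not the schema-cleaner implementation.
function inlineRefs(root) {
  // Look up '#/$defs/Name' or '#/definitions/Name' in the root schema.
  const lookup = (ref) => {
    const match = /^#\/(\$defs|definitions)\/([^/]+)$/.exec(ref);
    if (!match) return undefined;
    const container = root[match[1]];
    return container ? container[match[2]] : undefined;
  };

  // `active` holds the refs currently being expanded on this branch.
  const walk = (node, active) => {
    if (Array.isArray(node)) return node.map((item) => walk(item, active));
    if (node === null || typeof node !== 'object') return node;
    if (typeof node.$ref === 'string') {
      const target = lookup(node.$ref);
      if (target === undefined) return { ...node }; // unknown ref: leave as-is
      if (active.has(node.$ref)) return {};         // cycle: assumed fallback
      return walk(target, new Set(active).add(node.$ref));
    }
    const out = {};
    for (const [key, value] of Object.entries(node)) out[key] = walk(value, active);
    return out;
  };

  // Drop the top-level definition containers once everything is inlined.
  const { $defs, definitions, ...rest } = root;
  return walk(rest, new Set());
}
```

Tracking the active set per branch, rather than globally, lets the same definition be inlined in several places while still stopping a definition that refers back to itself.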

Usage Examples

Before and After (Gemini)

Before:

{
  "type": "object",
  "properties": {
    "name": { "type": "string", "minLength": 1, "pattern": "^[a-z]+$" },
    "age": { "$ref": "#/$defs/Age" }
  },
  "$defs": {
    "Age": { "type": "integer", "minimum": 0, "maximum": 150 }
  }
}

After (Gemini):

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" }
  }
}

Complex Schema Cleaning

const { cleanSchema } = require('schema-cleaner');

const schema = {
  type: 'object',
  properties: {
    status: {
      anyOf: [
        { const: 'active' },
        { const: 'inactive' },
        { const: 'pending' }
      ]
    },
    metadata: { type: ['string', 'null'] }
  }
};

const cleaned = cleanSchema(schema, { provider: 'gemini' });
// Result:
// {
//   type: 'object',
//   properties: {
//     status: { type: 'string', enum: ['active', 'inactive', 'pending'] },
//     metadata: { type: 'string' }
//   }
// }

CLI Usage

# Clean a schema file for Gemini
schema-cleaner clean schema.json --provider gemini --output clean-schema.json

# Validate a schema
schema-cleaner validate schema.json

# Check provider compatibility
schema-cleaner check schema.json --all-providers
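
For a sense of what a compatibility check might report, the sketch below walks a schema and lists the paths of keywords a provider does not support. It is not the CLI's implementation, and the CLI's real output format is not shown in this document. `UNSUPPORTED`, `findUnsupported`, and `checkAllProviders` are hypothetical names; the Gemini list is transcribed from the compatibility matrix, and the Anthropic and OpenAI lists are empty because the matrix marks every listed keyword as supported there. The walk descends into every nested value, so literal data under keywords such as `default` could produce false positives.

```javascript
// Unsupported keywords per provider, transcribed from the matrix (hypothetical constant).
const UNSUPPORTED = {
  gemini: [
    '$ref', '$defs', 'additionalProperties', 'pattern', 'minLength',
    'maxLength', 'format', 'minimum', 'maximum', 'examples'
  ],
  anthropic: [],
  openai: []
};

// Hypothetical helper: returns the paths of unsupported keywords found under `node`.
function findUnsupported(node, keywords, path = '$') {
  if (node === null || typeof node !== 'object') return [];
  if (Array.isArray(node)) {
    return node.flatMap((item, i) => findUnsupported(item, keywords, `${path}[${i}]`));
  }
  const hits = [];
  for (const [key, value] of Object.entries(node)) {
    const here = `${path}.${key}`;
    if (key === 'properties' && value !== null && typeof value === 'object') {
      // Keys under `properties` are field names, never keywords.
      for (const [name, sub] of Object.entries(value)) {
        hits.push(...findUnsupported(sub, keywords, `${here}.${name}`));
      }
    } else if (keywords.includes(key)) {
      hits.push(here); // report the keyword and do not descend into it
    } else {
      hits.push(...findUnsupported(value, keywords, here));
    }
  }
  return hits;
}

// Hypothetical helper: maps each provider name to the offending paths.
function checkAllProviders(schema) {
  return Object.fromEntries(
    Object.entries(UNSUPPORTED).map(([name, keywords]) => [name, findUnsupported(schema, keywords)])
  );
}
```

An empty list for a provider means the schema uses none of the keywords flagged for it.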

Advanced Features

Custom Provider Strategy

const { cleanSchema } = require('schema-cleaner');

// Define custom keywords to remove
const customStrategy = {
  remove: ['minLength', 'maxLength', 'pattern', 'description'],
  preserve: ['title', 'default']
};

const cleaned = cleanSchema(schema, { strategy: customStrategy });

Batch Processing

const schemas = [tool1Schema, tool2Schema, tool3Schema];
const cleaned = schemas.map(s => cleanSchema(s, { provider: 'gemini' }));

Best Practices

Clean at Runtime: Clean schemas dynamically based on the current provider
Preserve Descriptions: Keep description fields for better LLM understanding
Test Per Provider: Validate cleaned schemas work with each target provider
Cache Results: Cache cleaned schemas to avoid repeated processing
Version Schemas: Track schema versions for debugging
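
The caching advice above could be implemented with a small memoizing wrapper. This is a sketch under stated assumptions: `memoizeCleaner` is a hypothetical helper, `cleanFn` stands in for `cleanSchema`, and the cache is keyed by schema object identity plus the `provider` option only. Other options, such as a custom strategy, are not part of the key, and mutating a schema after it has been cached would return stale results.

```javascript
// Hypothetical wrapper; `cleanFn` stands in for cleanSchema.
function memoizeCleaner(cleanFn) {
  // schema object -> Map(provider name -> cleaned schema).
  // A WeakMap lets cached entries be collected once the schema is unreachable.
  const cache = new WeakMap();

  return (schema, options = {}) => {
    const provider = options.provider ?? 'default';
    let perProvider = cache.get(schema);
    if (!perProvider) {
      perProvider = new Map();
      cache.set(schema, perProvider);
    }
    if (!perProvider.has(provider)) {
      perProvider.set(provider, cleanFn(schema, options));
    }
    return perProvider.get(provider);
  };
}
```

Wrapping once at startup, for example `const cachedClean = memoizeCleaner(cleanSchema)`, keeps runtime cleaning per provider while avoiding repeated work for the same schema object.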

Error Messages

The cleaner reports problems as structured errors, each with a type, the path to the offending node, and a message:

{
  "valid": false,
  "errors": [
    {
      "type": "circular_reference",
      "path": "$.properties.parent.properties.child.$ref",
      "message": "Circular reference detected: parent -> child -> parent"
    }
  ]
}

Integration with Tool Definition

const { defineTool } = require('thepopebot');
const { cleanSchema } = require('./schema-cleaner');

// Define tool with full JSON Schema
const tool = defineTool({
  name: 'file_write',
  description: 'Write content to a file',
  parameters: {
    type: 'object',
    properties: {
      path: { type: 'string', minLength: 1, description: 'File path' },
      content: { type: 'string', description: 'Content to write' }
    },
    required: ['path', 'content']
  }
});

// Clean for current provider before registering
const provider = process.env.LLM_PROVIDER || 'anthropic';
const cleanParams = cleanSchema(tool.parameters, { provider });

// Register with cleaned schema
registerTool({ ...tool, parameters: cleanParams });
