llm-challenge

Benchmark for @tailor-platform/sdk AI-friendliness. Located at llm-challenge/.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "llm-challenge" with this command: npx skills add tailor-platform/sdk/tailor-platform-sdk-llm-challenge

LLM Challenge

Read llm-challenge/README.md for commands, scoring, and verification details.

Core Rule

When AI fails a challenge, improve the SDK (JSDoc, error messages, types, CLAUDE.md). NEVER add hints to problem.md.

Prerequisite

ALWAYS build SDK before running: pnpm -C packages/sdk build

Problem Conventions

Structure: problems/<id>-<name>/ with meta.json, problem.md, scaffold/, solution/, tests/

meta.json rules:

  • id: 3-digit, zero-padded, sequential

  • scoring: category defaults are tailordb 20/20/60; resolver, executor, and workflow 15/15/70; config 30/20/50; fix-broken 15/15/70

  • Fix-broken problems: the same file appears in both implement and scaffold
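
The meta.json rules above can be sketched as a small lookup plus two checks. This is only an illustration: the helper names (isValidProblemId, weightsTotal) and the constant DEFAULT_SCORING are hypothetical and not part of the benchmark, and because this page does not say what each of the three numbers measures, the weights are kept as an unlabeled triple.

```typescript
// Category names and weight triples are copied from the rules above.
// The meaning of each position in the triple is not stated there,
// so it is deliberately left unlabeled.
type Category =
  | "tailordb"
  | "resolver"
  | "executor"
  | "workflow"
  | "config"
  | "fix-broken";
type Weights = readonly [number, number, number];

const DEFAULT_SCORING: Record<Category, Weights> = {
  tailordb: [20, 20, 60],
  resolver: [15, 15, 70],
  executor: [15, 15, 70],
  workflow: [15, 15, 70],
  config: [30, 20, 50],
  "fix-broken": [15, 15, 70],
};

// Hypothetical helper: true when the id is exactly three digits, e.g. "007".
function isValidProblemId(id: string): boolean {
  return /^\d{3}$/.test(id);
}

// Hypothetical helper: sums a weight triple. Every default above sums to 100.
function weightsTotal(weights: Weights): number {
  return weights[0] + weights[1] + weights[2];
}
```

A check like weightsTotal(DEFAULT_SCORING.config) === 100 is a cheap way to catch a mistyped override before running the benchmark.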

problem.md rules:

  • Sections: Goal → Domain Context/Instructions → What to Build → Requirements → Reference

  • NEVER include SDK code examples; the AI must discover the API from the SDK package itself

  • Always end with "Refer to the installed SDK package for ..."
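
The problem.md rules above lend themselves to a quick lint pass. The sketch below is hypothetical (lintProblemMd is not part of the repository) and makes three assumptions: sections are written as Markdown headings, "Domain Context" and "Instructions" are alternatives for the second section, and any fenced block counts as a code example, which is stricter than the rule itself.

```typescript
// Section names come from the rule above; the second slot accepts either name (assumption).
const SECTION_ORDER: readonly (readonly string[])[] = [
  ["Goal"],
  ["Domain Context", "Instructions"],
  ["What to Build"],
  ["Requirements"],
  ["Reference"],
];

// Built with repeat() so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

// Hypothetical linter: returns human-readable violations, empty when the file passes.
function lintProblemMd(text: string): string[] {
  const problems: string[] = [];

  // Assumption: each section is a Markdown heading of any level.
  const headings = text
    .split("\n")
    .map((line) => /^#{1,6}\s+(.*\S)\s*$/.exec(line))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => m[1] ?? "");

  // Each section must appear after the previous one.
  let cursor = -1;
  for (const names of SECTION_ORDER) {
    const index = headings.findIndex((h, i) => i > cursor && names.includes(h));
    if (index === -1) {
      problems.push(`missing or out-of-order section: ${names.join(" / ")}`);
    } else {
      cursor = index;
    }
  }

  // Conservative check: flag any fenced block at all.
  if (text.includes(FENCE)) {
    problems.push("fenced code block found");
  }

  // The last non-empty line must start with the closing sentence.
  const lastLine = text.trim().split("\n").pop() ?? "";
  if (!lastLine.trim().startsWith("Refer to the installed SDK package for")) {
    problems.push("last line must start with the closing reference sentence");
  }

  return problems;
}
```

Returning a list instead of throwing on the first violation lets an author fix everything in one pass.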

Writing Tests

  • Read existing tests in problems/*/tests/ for patterns

  • Helpers: shared/test-helpers.ts (createWorkDirContext, importPath, expectFieldType, etc.)

  • Mocks: shared/mocks.ts (setupTailordbMock, setupWorkflowMock)

  • ALWAYS use a describe.skipIf(!workDirReady) guard

Creating a New Problem

  • Use the next sequential ID (e.g., 013)

  • Write solution first, then tests

  • Verify: pnpm -C llm-challenge challenge --problem <id> --use-solution → must be 100/100
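
Picking the next sequential ID can be sketched as a pure function over the existing directory names under problems/. The helper nextProblemId is hypothetical, and starting at 001 when no problems exist is an assumption.

```typescript
// Hypothetical helper: derives the next ID from directory names shaped like <id>-<name>.
// Names that do not start with three digits and a hyphen are ignored.
function nextProblemId(existingDirs: readonly string[]): string {
  const ids = existingDirs
    .map((dir) => /^(\d{3})-/.exec(dir))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => Number(m[1]));

  // Assumption: numbering starts at 001 when there are no problems yet.
  const next = (ids.length === 0 ? 0 : Math.max(...ids)) + 1;
  if (next > 999) {
    throw new Error("3-digit ID space exhausted");
  }
  return String(next).padStart(3, "0");
}
```

With illustrative directory names such as "011-foo" and "012-bar", the function returns "013", matching the example ID in the list above.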

SDK Improvement Cycle

  • pnpm -C llm-challenge challenge:solve --retry 2 → analyze failures

  • Improve SDK source (NOT problem descriptions)

  • pnpm -C packages/sdk build → pnpm -C llm-challenge challenge:verify-solution

  • Re-run benchmark to measure improvement
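
The rebuild-then-verify step of the cycle is order-sensitive: verifying against a stale build is meaningless. A minimal sketch of that gate follows, with the command runner injected so the logic can be exercised without a checkout. The names Runner, REBUILD_AND_VERIFY, and runInOrder are hypothetical; the two command strings are copied from the list above.

```typescript
// A runner executes one shell command and returns its exit code.
type Runner = (command: string) => number;

// Commands copied verbatim from the improvement cycle, in the required order.
const REBUILD_AND_VERIFY: readonly string[] = [
  "pnpm -C packages/sdk build",
  "pnpm -C llm-challenge challenge:verify-solution",
];

// Runs commands in order and stops at the first non-zero exit code.
// Returns the commands that completed successfully.
function runInOrder(commands: readonly string[], run: Runner): string[] {
  const done: string[] = [];
  for (const command of commands) {
    if (run(command) !== 0) {
      break;
    }
    done.push(command);
  }
  return done;
}
```

In a real script the runner could wrap Node's child_process.spawnSync with shell: true and return its status field. If the returned list is shorter than the input, the build or the verification failed and the benchmark should not be re-run yet.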
