Recursive Self-Improvement Loop

A pattern for generating higher-quality output by iterating against explicit scoring criteria.

The Pattern

generate → evaluate → diagnose → improve → repeat (until passing)

Never ship first-draft output for important content. Run the loop.

How It Works

1. Generate

Create the initial output as you normally would.

2. Evaluate

Score the output against each criterion (1-10). Be brutally honest.

3. Diagnose

For any criterion scoring below threshold:

What specifically is weak?
Why does it fail?
What would "passing" look like?

4. Improve

Rewrite addressing each diagnosed weakness. Don't patch — rebuild the weak sections.

5. Repeat

Re-evaluate. Keep looping until every criterion meets the threshold (usually 8/10 minimum).
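The five steps above can be sketched as a small driver function. This is a minimal illustration, not a reference implementation: `run_loop`, `LoopResult`, and the `toy_*` stand-ins are hypothetical names, and the `max_rounds` cap is an added assumption so the loop cannot run forever (the pattern itself only says to repeat until passing). In practice `generate`, `evaluate`, and `improve` would be whatever you or your assistant actually do at each step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Scores = Dict[str, int]  # criterion name -> score from 1 to 10


@dataclass
class LoopResult:
    output: str                                          # last draft that was evaluated
    history: List[Scores] = field(default_factory=list)  # one score sheet per round
    passed: bool = False                                 # True if every criterion met the threshold


def run_loop(
    generate: Callable[[], str],
    evaluate: Callable[[str], Scores],
    improve: Callable[[str, Scores], str],
    threshold: int = 8,
    max_rounds: int = 5,  # assumption: a safety cap, not part of the original pattern
) -> LoopResult:
    draft = generate()                                   # 1. Generate
    history: List[Scores] = []
    for round_number in range(max_rounds):
        scores = evaluate(draft)                         # 2. Evaluate
        history.append(scores)
        # 3. Diagnose: collect every criterion below the threshold
        failing = {c: s for c, s in scores.items() if s < threshold}
        if not failing:
            return LoopResult(draft, history, True)
        if round_number < max_rounds - 1:
            draft = improve(draft, failing)              # 4. Improve, then 5. Repeat
    return LoopResult(draft, history, False)


# Toy stand-ins so the sketch runs end to end; real steps would be a model or a person.
def toy_generate() -> str:
    return "draft v1"


def toy_evaluate(draft: str) -> Scores:
    version = int(draft.rsplit("v", 1)[1])
    return {"Hook strength": min(10, 5 + version), "Clarity": min(10, 7 + version)}


def toy_improve(draft: str, failing: Scores) -> str:
    version = int(draft.rsplit("v", 1)[1])
    return f"draft v{version + 1}"


result = run_loop(toy_generate, toy_evaluate, toy_improve)
print(result.output, result.passed, len(result.history))
```

With these toy functions the draft passes on the third evaluation. If the cap is reached first, `passed` is `False` and `output` is the last draft that was actually scored.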

Adversarial Pressure (Optional but Powerful)

Once the output passes every criterion, attack it from a hostile perspective:

Skeptical customer: "Why should I believe this? What's the catch?"
Distracted scroller: "Would I stop for this? In 2 seconds?"
Competitor: "How would a rival tear this apart?"

If it survives, ship it. If not, iterate.
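One way to make the adversarial pass repeatable is to keep the personas as data and collect objections per persona. The sketch below assumes a hypothetical `critique` callable (you, a reviewer, or a model call) that returns a list of objections. `adversarial_pass` and `toy_critique` are illustrative names, and the keyword check is only a placeholder for real judgment.

```python
from typing import Callable, Dict, List

# Personas and their questions, taken from the list above.
PERSONAS: Dict[str, str] = {
    "Skeptical customer": "Why should I believe this? What's the catch?",
    "Distracted scroller": "Would I stop for this? In 2 seconds?",
    "Competitor": "How would a rival tear this apart?",
}


def adversarial_pass(
    output: str,
    critique: Callable[[str, str, str], List[str]],
) -> Dict[str, List[str]]:
    """Return objections per persona; an empty dict means the output survived."""
    objections: Dict[str, List[str]] = {}
    for persona, question in PERSONAS.items():
        found = critique(output, persona, question)
        if found:
            objections[persona] = found
    return objections


# Placeholder critic: flags one obvious weakness so the sketch is runnable.
def toy_critique(output: str, persona: str, question: str) -> List[str]:
    if persona == "Skeptical customer" and "guaranteed" in output.lower():
        return ["Unsupported guarantee"]
    return []


objections = adversarial_pass("Guaranteed results in 7 days!", toy_critique)
print(objections)  # {'Skeptical customer': ['Unsupported guarantee']}
```

Any non-empty result sends the output back into the loop; an empty result means it is ready to ship.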

Example Criteria by Use Case

Social Content

Criterion	What to evaluate
Hook strength	First line grabs attention? Pattern interrupt?
Curiosity gap	Creates urge to keep reading?
Clarity	One clear idea? No confusion?
Voice match	Sounds like the target voice/brand?
Engagement potential	People will reply/share/save?
Thumb-stop power	Scroller would pause?
Value density	Every line earns its place?
CTA clarity	Clear what reader should do next?

Adversarial test: Would a distracted, skeptical user at 11pm engage with this?

Landing Page / Web Copy

Criterion	What to evaluate
Headline clarity	Instantly clear what this business does?
Value prop strength	Why choose them over competitors?
Benefit focus	Features translated to customer benefits?
CTA effectiveness	Clear, compelling action? Low friction?
Trust signals	Credibility established? Social proof?
Readability	Scannable? Short paragraphs? Clear hierarchy?
Objection handling	Common concerns addressed?
Specificity	Concrete details vs vague claims?

Adversarial test: Would someone searching on their phone take action within 30 seconds?

Email Copy

Criterion	What to evaluate
Subject line	Would this get opened? Stands out in inbox?
Opening hook	First sentence earns the second?
Single focus	One clear ask per email?
Skimmability	Can get the gist in 5 seconds?
CTA prominence	Action is obvious and easy?
Voice consistency	Matches brand/sender personality?
Length appropriate	No fluff, nothing missing?
Mobile friendly	Works on small screens?

Adversarial test: Would a busy person with 200 unread emails act on this?

Ad Copy

Criterion	What to evaluate
Thumb-stop power	Pattern interrupt in first 2 seconds?
Curiosity gap	Creates need to know more?
Emotional trigger	Hits a real pain point or desire?
Credibility	Believable? Not too good to be true?
CTA strength	Clear next step with low friction?
Persona match	Speaks directly to target audience?
Differentiation	Stands out from competitor ads?
Platform native	Fits the platform's style/format?

Adversarial test: Would this stop YOUR scroll? Would you click?

When to Use

Always use for:

Headlines and hooks
CTAs and value props
Key landing page sections
Social posts (especially threads)
Ad copy
Important emails

Can skip for:

Internal notes
First-pass brainstorming
Technical documentation
Boilerplate content

Building Your Own Criteria

1. Pick one task you do repeatedly.
2. Write down how YOU evaluate that output: what makes "good" vs "mid"?
3. Turn each into a pass/fail threshold. Be specific ("9/10 minimum", not "make it good").
4. Add adversarial pressure: who would attack this? What would they say?
5. Save and reuse. Now you have a system, not just a prompt.
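Saving a set of criteria for reuse can be as simple as keeping it in a small data structure. The following is a sketch under stated assumptions: `Rubric` and its `failing` method are hypothetical names, and the example uses three of the Email Copy criteria from the tables above with made-up scores.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Rubric:
    name: str
    criteria: Dict[str, str]    # criterion -> what to evaluate
    threshold: int = 8          # pass mark on the 1-10 scale
    adversarial_test: str = ""  # the hostile question to ask after passing

    def failing(self, scores: Dict[str, int]) -> List[str]:
        """Return the criteria that score below the threshold."""
        missing = [c for c in self.criteria if c not in scores]
        if missing:
            raise ValueError(f"Unscored criteria: {missing}")
        for criterion in self.criteria:
            if not 1 <= scores[criterion] <= 10:
                raise ValueError(f"Score for {criterion!r} must be between 1 and 10")
        return [c for c in self.criteria if scores[c] < self.threshold]


EMAIL = Rubric(
    name="Email Copy",
    criteria={
        "Subject line": "Would this get opened? Stands out in inbox?",
        "Single focus": "One clear ask per email?",
        "CTA prominence": "Action is obvious and easy?",
    },
    adversarial_test="Would a busy person with 200 unread emails act on this?",
)

# Illustrative scores only, not real evaluation data.
weak = EMAIL.failing({"Subject line": 9, "Single focus": 6, "CTA prominence": 8})
print(weak)  # ['Single focus']
```

Requiring a score for every criterion is a deliberate choice here: it prevents a rubric from "passing" just because a weak area was never scored.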

Quick Loop Template

## Output v1
[Initial generation]

## Evaluation v1
- Hook strength: 6/10 — Opens weak, no pattern interrupt
- Clarity: 8/10 — Clear enough
- Voice match: 7/10 — Too formal
[... score all criteria]

## Diagnosis
1. Hook needs a surprising stat or contrarian take
2. Voice should be more casual, shorter sentences
3. [...]

## Output v2
[Revised version addressing weaknesses]

## Evaluation v2
[Re-score — continue until all pass]
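If you keep evaluations in the text format above, the scores can be pulled back out to check the threshold automatically. This is a rough sketch: `parse_evaluation` and `ScoreLine` are hypothetical names, and the regular expression assumes each scored line looks like `- Criterion: N/10` followed by any punctuation and a note. Lines that do not match, such as bracketed placeholders, are skipped.

```python
import re
from typing import List, NamedTuple


class ScoreLine(NamedTuple):
    criterion: str
    score: int
    note: str


# "- <criterion>: <score>/10", then any separator punctuation, then a free-text note.
LINE = re.compile(r"^-\s*(?P<criterion>[^:]+):\s*(?P<score>\d{1,2})/10\W*(?P<note>.*)$")


def parse_evaluation(text: str) -> List[ScoreLine]:
    rows: List[ScoreLine] = []
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            rows.append(
                ScoreLine(
                    match["criterion"].strip(),
                    int(match["score"]),
                    match["note"].strip(),
                )
            )
    return rows


evaluation_v1 = """\
- Hook strength: 6/10 - Opens weak, no pattern interrupt
- Clarity: 8/10 - Clear enough
- Voice match: 7/10 - Too formal
[... score all criteria]
"""

below = [row.criterion for row in parse_evaluation(evaluation_v1) if row.score < 8]
print(below)  # ['Hook strength', 'Voice match']
```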

The loop typically adds 2-3 iterations. Worth it for anything that matters.
