estimate-calibrator

Produces calibrated three-point estimates (best/likely/worst case) with explicit unknowns, confidence intervals, and assumption documentation. Breaks work into atomic units, identifies technical and scope uncertainties, calculates PERT ranges, and provides confidence rationale. Triggers on: "estimate this", "how long will this take", "effort estimate", "time estimate", "best case worst case", "confidence interval", "sizing", "estimate effort", "how big is this", "story points", "t-shirt sizing", "estimate the work", "PERT". NOT for task decomposition, implementation plans, or dependency mapping — use task-decomposer instead. Use this skill when a task or project needs an effort estimate with explicit uncertainty.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "estimate-calibrator" with this command: npx skills add mathews-tom/praxis-skills/mathews-tom-praxis-skills-estimate-calibrator

Estimate Calibrator

Replaces single-point guesses with structured three-point estimates: decomposes work into atomic units, estimates best/likely/worst case for each, identifies unknowns and assumptions, calculates aggregate ranges using PERT, and assigns confidence levels with explicit rationale.

Reference Files

| File | Contents | Load When |
|------|----------|-----------|
| references/estimation-methods.md | PERT formula, three-point estimation, Monte Carlo basics | Always |
| references/unknown-categories.md | Technical, scope, external, and organizational uncertainty types | Unknown identification |
| references/calibration-tips.md | Cognitive biases in estimation, historical calibration, buffer strategies | Always |
| references/sizing-heuristics.md | Common task size patterns, complexity indicators, reference class data | Quick sizing needed |

Prerequisites

  • Work item description (feature, task, project)
  • Decomposed tasks (or use task-decomposer skill first)
  • Context: team familiarity, tech stack, existing codebase

Workflow

Phase 1: Decompose Work

If the work item is not already decomposed into atomic units:

  1. Break into tasks — Each task should be estimable independently.
  2. Right granularity — Each task should take 1 hour to 3 days. Larger tasks carry higher uncertainty; break them down further.
  3. Identify dependencies — Tasks on the critical path determine the minimum duration.

Phase 2: Three-Point Estimate

For each task, estimate three scenarios:

| Scenario | Definition | Mindset |
|----------|------------|---------|
| Best case | Everything goes right. No surprises. | "If I've done this exact thing before" |
| Likely case | Normal friction. Some minor obstacles. | "Realistic expectation with typical setbacks" |
| Worst case | Significant problems. Not catastrophic. | "Murphy's law but not a disaster" |

Key rule: Worst case is NOT "everything goes wrong." It's the realistic bad scenario (90th percentile), not the apocalyptic one (99th percentile).

Phase 3: Identify Unknowns

Categorize unknowns that affect estimates:

| Category | Example | Impact |
|----------|---------|--------|
| Technical | "Never used this library before" | Likely case inflated, worst case much higher |
| Scope | "Requirements may change" | All estimates may shift |
| External | "Depends on API access from partner" | Blocking risk — could delay entirely |
| Integration | "Haven't tested with production data" | Hidden complexity at integration |
| Organizational | "Need design approval" | Calendar time, not effort time |

Phase 4: Calculate Ranges

For individual tasks, use the PERT formula:

Expected = (Best + 4 × Likely + Worst) / 6
Std Dev = (Worst - Best) / 6

For aggregate (project) estimates:

  • Sum of expected values for total expected duration
  • Root sum of squares of std devs for aggregate uncertainty
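The two formulas above, together with the aggregation rules, can be sketched as a short helper. This is a minimal illustration; the function and variable names are my own, and estimates are in whatever time unit you choose (hours here):

```python
import math

def pert(best, likely, worst):
    """Return (expected, std_dev) for one task's three-point estimate."""
    expected = (best + 4 * likely + worst) / 6
    std_dev = (worst - best) / 6
    return expected, std_dev

def aggregate(tasks):
    """Aggregate task-level estimates: sum the expected values,
    combine standard deviations by root sum of squares."""
    results = [pert(b, l, w) for b, l, w in tasks]
    total_expected = sum(e for e, _ in results)
    total_std = math.sqrt(sum(s ** 2 for _, s in results))
    return total_expected, total_std

# Three tasks, estimates in hours: (best, likely, worst)
tasks = [(2, 4, 8), (1, 2, 6), (4, 8, 16)]
expected, std = aggregate(tasks)
print(f"Expected: {expected:.1f}h, std dev: {std:.2f}h")
```

A rough 90% range for the project is then `expected ± 1.645 × std` — wider than any single task's range, but narrower than naively summing every worst case.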

Phase 5: Assign Confidence

| Confidence | Meaning | When |
|------------|---------|------|
| High | Likely case within ±20% | Well-understood task, team has done it before |
| Medium | Likely case within ±50% | Some unknowns, moderate familiarity |
| Low | Likely case within ±100% or more | Significant unknowns, new technology |

Output Format

## Estimate: {Work Item}

### Summary
| Scenario | Duration |
|----------|----------|
| Best case | {time} |
| Likely case | {time} |
| Worst case | {time} |
| **PERT expected** | **{time}** |
| **Confidence** | **{High/Medium/Low}** |

### Task-Level Estimates

| # | Task | Best | Likely | Worst | PERT | Unknowns |
|---|------|------|--------|-------|------|----------|
| 1 | {task} | {time} | {time} | {time} | {time} | {key unknown or "None"} |
| 2 | {task} | {time} | {time} | {time} | {time} | {key unknown} |
| | **Total** | **{sum}** | **{sum}** | **{sum}** | **{pert}** | |

### Key Unknowns

| # | Unknown | Category | Impact on Estimate | Mitigation |
|---|---------|----------|-------------------|------------|
| 1 | {unknown} | {Technical/Scope/External} | +{time} if realized | {spike, prototype, early test} |

### Assumptions
- {Assumption 1 — what must be true for this estimate to hold}
- {Assumption 2}

### Risk Factors
- {Risk}: If realized, adds {time}. Likelihood: {High/Medium/Low}.

### Confidence Rationale
**{High/Medium/Low}** because:
- {Specific reason — e.g., "Team has built 3 similar features"}
- {Specific reason — e.g., "External API is a new integration"}

### Recommendation
{Commit to PERT expected with {X}% buffer, or spike the top unknown first.}

Calibration Rules

  1. Three points, not one. Single-point estimates are always wrong. Three points communicate uncertainty — the most important part of any estimate.
  2. Worst case is the 90th percentile, not the 99th. "Asteroid hits the office" is not a useful worst case. "The API documentation is wrong and we need to reverse-engineer the protocol" is a realistic worst case.
  3. Unknowns inflate estimates more than known difficulty. A hard but well-understood task is more predictable than an easy but novel one.
  4. Estimates are not commitments. Communicate ranges, not deadlines. If stakeholders need a single number, give the PERT expected plus a buffer for confidence level.
  5. Spike unknowns early. If a single unknown dominates the estimate range, invest 1-2 days spiking it before estimating the rest.

Error Handling

| Problem | Resolution |
|---------|------------|
| Work item not decomposed | Decompose into 3-8 tasks first (or suggest task-decomposer skill). |
| No historical reference | Estimate relative to a known task: "This is about 2x the auth feature." |
| Stakeholder wants a single number | Provide PERT expected with buffer matching confidence level (High: +20%, Medium: +50%, Low: +100%). |
| Estimate seems too large | Check for scope creep in task list. Remove non-essential tasks. Identify what can be deferred. |
| Team has never done this type of work | Mark confidence as Low. Recommend a spike before committing to an estimate. |
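The "stakeholder wants a single number" rule above can be sketched as a tiny helper. The names are illustrative, not part of the skill's interface:

```python
# Buffer multipliers from the resolution above: High +20%, Medium +50%, Low +100%.
BUFFERS = {"High": 0.20, "Medium": 0.50, "Low": 1.00}

def single_number(pert_expected, confidence):
    """Turn a PERT expected value into one buffered figure for
    stakeholders who insist on a single number."""
    return pert_expected * (1 + BUFFERS[confidence])

# 15.5h expected at Medium confidence -> quote the buffered figure
print(f"{single_number(15.5, 'Medium'):.2f}h")
```

The buffer is a communication device, not extra precision: it converts a range into one number whose padding matches how little you know.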

When NOT to Estimate

Push back if:

  • The work is exploratory (research, spikes) — timebox instead of estimating
  • Requirements are completely undefined — define scope first
  • The user wants precision (hours) for a large project — provide ranges, not false precision
  • The estimate will be used as a commitment without acknowledging uncertainty

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals. No summaries were provided by the upstream source; all entries are in the General category and marked "Needs Review."

  • manuscript-review
  • html-presentation
  • concept-to-image
  • md-to-pdf