ab-test-analysis

Evaluate A/B test results with statistical rigor and translate findings into clear product decisions.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the command below and send it to your AI assistant to install this skill.

Install skill "ab-test-analysis" with this command: npx skills add phuryn/pm-skills/phuryn-pm-skills-ab-test-analysis

A/B Test Analysis

Evaluate A/B test results with statistical rigor and translate findings into clear product decisions.

Context

You are analyzing A/B test results for $ARGUMENTS.

If the user provides data files (CSV, Excel, or analytics exports), read and analyze them directly. Generate Python scripts for statistical calculations when needed.

Instructions

Understand the experiment:

  • What was the hypothesis?

  • What was changed (the variant)?

  • What is the primary metric? Any guardrail metrics?

  • How long did the test run?

  • What is the traffic split?

Validate the test setup:

  • Sample size: Is the sample large enough for the expected effect size?

  • Use the formula (per variant): n = (Z₁₋α/₂ + Z₁₋β)² × 2 × p × (1-p) / MDE², where p is the baseline conversion rate, Z₁₋β comes from the target power, and MDE is the minimum detectable effect in absolute terms

  • Flag if the test is underpowered (<80% power)

  • Duration: Did the test run for at least 1-2 full business cycles?

  • Randomization: Any evidence of sample ratio mismatch (SRM)?

  • Novelty/primacy effects: Was there enough time to wash out initial behavior changes?
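The sample-size and SRM checks above can be sketched in Python. This is a minimal sketch using scipy; the function names and the 0.001 SRM threshold are illustrative choices, not part of the skill itself:

```python
from math import ceil
from scipy.stats import norm, chisquare

def required_sample_size(p, mde, alpha=0.05, power=0.80):
    """Per-variant sample size to detect an absolute lift of `mde`
    over baseline conversion rate `p` with a two-tailed z-test."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return ceil((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / mde ** 2)

def srm_check(n_control, n_variant, expected_split=0.5, threshold=0.001):
    """Chi-squared test for sample ratio mismatch (SRM). A p-value below
    the deliberately strict threshold suggests broken randomization."""
    total = n_control + n_variant
    expected = [total * expected_split, total * (1 - expected_split)]
    _, p_value = chisquare([n_control, n_variant], f_exp=expected)
    return p_value, p_value < threshold
```

For example, detecting a 2-point absolute lift over a 10% baseline needs roughly 3,500 users per variant, which is why underpowered tests get flagged before any significance math is run.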

Calculate statistical significance:

  • Conversion rate for control and variant

  • Relative lift: (variant - control) / control × 100

  • p-value: Using a two-tailed z-test or chi-squared test

  • Confidence interval: 95% CI for the difference

  • Statistical significance: Is p < 0.05?

  • Practical significance: Is the lift meaningful for the business?

If the user provides raw data, generate and run a Python script to calculate these.
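For raw conversion counts, the generated script might look like the following sketch: a two-proportion z-test with scipy, returning the quantities listed above. The function name and return shape are illustrative assumptions:

```python
from math import sqrt
from scipy.stats import norm

def analyze_ab(conv_c, n_c, conv_v, n_v, alpha=0.05):
    """Two-tailed two-proportion z-test with a 95% CI for the difference."""
    p_c, p_v = conv_c / n_c, conv_v / n_v
    lift = (p_v - p_c) / p_c * 100  # relative lift in %
    # Pooled standard error for the hypothesis test
    p_pool = (conv_c + conv_v) / (n_c + n_v)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_v))
    z = (p_v - p_c) / se_pool
    p_value = 2 * (1 - norm.cdf(abs(z)))
    # Unpooled standard error for the confidence interval
    se = sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)
    z_crit = norm.ppf(1 - alpha / 2)
    ci = ((p_v - p_c) - z_crit * se, (p_v - p_c) + z_crit * se)
    return {"control_rate": p_c, "variant_rate": p_v, "lift_pct": lift,
            "p_value": p_value, "ci_95": ci, "significant": p_value < alpha}
```

Note that statistical and practical significance are separate questions: a tiny but significant lift may still not clear the business bar, which the CI width helps communicate.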

Check guardrail metrics:

  • Did any guardrail metrics (revenue, engagement, page load time) degrade?

  • A winning primary metric with degraded guardrails may not be a true win

Interpret results:

| Outcome | Recommendation |
| --- | --- |
| Significant positive lift, no guardrail issues | Ship it — roll out to 100% |
| Significant positive lift, guardrail concerns | Investigate — understand trade-offs before shipping |
| Not significant, positive trend | Extend the test — need more data or larger effect |
| Not significant, flat | Stop the test — no meaningful difference detected |
| Significant negative lift | Don't ship — revert to control, analyze why |
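The decision table above maps cleanly to a small helper, assuming the statistical results have already been computed (the function name and flags are illustrative):

```python
def recommend(significant: bool, lift_pct: float, guardrails_ok: bool,
              trending_positive: bool = False) -> str:
    """Map an A/B test outcome to a recommendation per the decision table."""
    if significant and lift_pct > 0:
        return "Ship it" if guardrails_ok else "Investigate"
    if significant and lift_pct < 0:
        return "Don't ship"
    return "Extend the test" if trending_positive else "Stop the test"
```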

Provide the analysis summary:

A/B Test Results: [Test Name]

Hypothesis: [What we expected]
Duration: [X days] | Sample: [N control / M variant]

| Metric | Control | Variant | Lift | p-value | Significant? |
| --- | --- | --- | --- | --- | --- |
| [Primary] | X% | Y% | +Z% | 0.0X | Yes/No |
| [Guardrail] | ... | ... | ... | ... | ... |

Recommendation: [Ship / Extend / Stop / Investigate]
Reasoning: [Why]
Next steps: [What to do]

Think step by step. Save as markdown. Generate Python scripts for calculations if raw data is provided.

Further Reading

  • A/B Testing 101 + Examples

  • Testing Product Ideas: The Ultimate Validation Experiments Library

  • Are You Tracking the Right Metrics?

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
