🧪 A/B Test Analyzer

Verified by Community

Guides you through proper A/B test methodology: hypothesis formation, sample size calculation, test duration, statistical significance, and how to avoid common pitfalls like peeking.

ab-testing, statistics, experiments, optimization, conversion

Design and analyze A/B tests with proper statistical methodology.

Usage

  1. Form a clear hypothesis: "Changing X will improve Y by Z%"
  2. Calculate the required sample size from the baseline rate, minimum detectable effect, significance level, and statistical power (see the sketch after these steps)
  3. Determine test duration and traffic allocation
  4. Monitor for technical issues without peeking at results
  5. Analyze results with proper statistical tests and make a decision
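
A minimal sketch of the sample-size step (steps 2–3), using the standard two-proportion normal approximation; the function name and defaults are illustrative:

```python
from math import sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect a relative lift in a
    conversion rate (two-sided two-proportion test, normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p2 - p1) ** 2

n = sample_size_per_variant(0.05, 0.10)  # 5% baseline, 10% relative MDE
print(round(n))                          # ~31,200 per variant
print(round(2 * n / 1000))               # ~62 days at 1,000 visitors/day, 50/50 split
```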

Examples

  • Sample size calculation: Baseline conversion rate: 5%. Minimum detectable effect: 10% relative (0.5 percentage points). Significance level: 95% (alpha=0.05). Power: 80%. Required sample: ~31,000 per variant. At 1,000 visitors/day with a 50/50 split, the test needs ~62 days. If that's too long, either increase traffic or accept a larger minimum detectable effect
  • Analyzing results: Control: 5,000 visitors, 250 conversions (5.0%). Variant: 5,000 visitors, 280 conversions (5.6%). Relative lift: +12%. P-value: 0.18. NOT statistically significant (p > 0.05). Decision: do not ship the variant. The apparent improvement could be due to chance; either collect more data or accept that the effect may not be real (see the z-test sketch after these examples)
  • Segmented analysis: Overall result: no significant difference. But segmenting by device, mobile shows +15% (significant) while desktop shows -5% (not significant). This suggests a mobile-specific improvement. Validate with a follow-up mobile-only test before concluding, because segment analysis inflates false positives (a multiple-comparisons sketch follows below)
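
A minimal sketch of the significance test behind the second example, a two-sided two-proportion z-test (the function name is illustrative); it reproduces the p ≈ 0.18 above:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)       # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_z_test(250, 5000, 280, 5000)
print(f"z = {z:.2f}, p = {p:.2f}")                 # z = 1.34, p = 0.18
```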

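For the segmented analysis, a simple Bonferroni correction is one way to guard against the inflated false positives it warns about. This sketch reuses two_proportion_z_test from above; the segment counts are made up for illustration:

```python
# Reuses two_proportion_z_test from the previous sketch.
# Segment counts below are hypothetical, for illustration only.
segments = {
    "mobile":  (120, 2400, 141, 2450),   # conv_a, n_a, conv_b, n_b
    "desktop": (130, 2600, 124, 2550),
}
adjusted_alpha = 0.05 / len(segments)    # Bonferroni: split alpha across k segments
for name, counts in segments.items():
    _, p = two_proportion_z_test(*counts)
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"{name}: p = {p:.3f} -> {verdict} at alpha = {adjusted_alpha:.3f}")
```
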
Guidelines

  • Never peek at results before reaching your pre-calculated sample size; repeated peeking can inflate the false positive rate from the nominal 5% to 30%+ (see the simulation after this list)
  • If you must check early, use sequential testing methods (always-valid p-values) instead of fixed-horizon tests
  • Test one change at a time. If you change headline AND button color, you can't attribute the effect to either
  • Run tests for full weeks (7, 14, 21 days) to account for day-of-week effects
  • A "non-significant" result is still a result: it tells you the change produced no effect large enough to detect at your chosen sensitivity, and so isn't worth further investment
  • Document every test: hypothesis, variants, sample size, duration, result, decision. Build an institutional testing knowledge base
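
The peeking guideline can be sanity-checked with a quick simulation: run A/A tests (no real difference between arms), apply the z-test after each day's traffic, and stop at the first p < 0.05. All parameters here are illustrative:

```python
import random
from math import sqrt

def peeking_false_positive_rate(n_sims=1000, daily=100, days=20, p=0.05, seed=1):
    """Simulate A/A tests where both arms share the true rate p, run a
    z-test after every day, and stop the moment it looks significant.
    Returns the fraction of simulations that falsely declare a winner."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = n = 0
        for _ in range(days):
            conv_a += sum(rng.random() < p for _ in range(daily))
            conv_b += sum(rng.random() < p for _ in range(daily))
            n += daily
            pool = (conv_a + conv_b) / (2 * n)
            se = sqrt(pool * (1 - pool) * 2 / n)
            if se > 0 and abs(conv_b - conv_a) / n / se > 1.96:
                hits += 1                # peeked, "won", stopped early
                break
    return hits / n_sims

print(peeking_false_positive_rate())     # typically well above the nominal 5%
```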