A/B Test Statistical Significance Calculator
Test whether your A/B test results are statistically significant. Enter visitors, conversions for control and variant to get the Z-score and p-value.
Calculate sample size and duration for multivariate tests. Enter number of combinations and daily traffic to plan MVT experiments effectively.
Multivariate testing (MVT) tests multiple combinations of changes simultaneously. While an A/B test compares two versions, an MVT can test 4, 8, 16, or more combinations at once. This allows you to find the best combination of headline, image, CTA button, and layout in a single experiment.
The trade-off: MVT requires dramatically more traffic. The number of cells is the product of the variant counts of all factors, so every added factor multiplies the total and the cell count grows exponentially. This calculator computes the total number of combinations, the minimum sample per cell, total traffic needed, and estimated duration based on your daily visitors.
MVT is most valuable when you suspect interaction effects between elements (e.g., headline A works better with image B but not image C). For independent changes with no interactions, running sequential A/B tests is often faster and equally informative.
This calculator prevents the most common MVT mistake: underestimating the traffic required. A 3×3×2 MVT creates 18 cells, each needing thousands of visitors. Without this planning, you'd run the test for months without reaching significance.
Total Combinations = Variants per Factor₁ × Factor₂ × ... × Factorₙ
Min Sample per Cell = Same as A/B test sample size
Total Traffic = Sample per Cell × Total Combinations
Duration = Total Traffic / Daily Traffic
Result: 8 combinations, ~307,328 total visitors, ~16 days
Three factors with 2 variants each produce 2 × 2 × 2 = 8 combinations. Each cell needs ~38,416 visitors (same as the A/B sample size for 3% baseline and 10% MDE). Total = 307,328 visitors. At 20,000/day, this test takes about 16 days.
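The formulas and worked example above can be reproduced with a short Python sketch. The function name mvt_plan and its parameters are illustrative and not part of the calculator. The per-cell sample of 38,416 is taken as given from the example rather than derived, and the duration is rounded up to whole days.

```python
import math

def mvt_plan(variants_per_factor, sample_per_cell, daily_traffic):
    """Return (total_combinations, total_traffic, duration_days) for a full factorial MVT.

    variants_per_factor: list with the number of variants for each factor, e.g. [2, 2, 2]
    sample_per_cell:     minimum visitors per cell (same as the A/B test sample size)
    daily_traffic:       visitors per day entering the experiment
    """
    # Total Combinations = product of the variant counts of all factors
    total_combinations = math.prod(variants_per_factor)
    # Total Traffic = Sample per Cell x Total Combinations
    total_traffic = sample_per_cell * total_combinations
    # Duration = Total Traffic / Daily Traffic, rounded up to whole days
    duration_days = math.ceil(total_traffic / daily_traffic)
    return total_combinations, total_traffic, duration_days

# Worked example from the text: three factors with 2 variants each
cells, traffic, days = mvt_plan([2, 2, 2], 38_416, 20_000)
print(cells, traffic, days)  # 8 307328 16
```

Passing [3, 3, 2] as the first argument gives the 18 cells of a 3×3×2 test.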
MVT tests all combinations simultaneously, capturing interaction effects. Sequential A/B testing changes one thing at a time. MVT is more comprehensive but requires exponentially more traffic. For most e-commerce teams, sequential A/B tests are more practical for rapid iteration.
Start with 2 factors and 2 variants each (4 cells). This requires only 2× the traffic of a standard A/B test. Gradually increase complexity as you build confidence and have access to higher-traffic pages.
Look for both main effects (which headline performs best overall?) and interaction effects (which headline performs best WITH a specific image?). The winning combination may include individual elements that don't win in isolation but excel together.
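To make the difference between main effects and interaction effects concrete, here is a minimal Python sketch for a 2×2 test (two headlines, two images). The conversion rates are hypothetical numbers invented for illustration, and effects_2x2 is an illustrative name. The sketch compares observed rates only and does not test whether the differences are statistically significant.

```python
def effects_2x2(rates):
    """Main and interaction effects for a 2x2 test.

    rates[(h, i)] is the conversion rate for headline h (0 or 1) with image i (0 or 1).
    """
    # Lift from switching headline 0 -> 1, measured separately under each image
    headline_lift_img0 = rates[(1, 0)] - rates[(0, 0)]
    headline_lift_img1 = rates[(1, 1)] - rates[(0, 1)]
    # Lift from switching image 0 -> 1, measured separately under each headline
    image_lift_head0 = rates[(0, 1)] - rates[(0, 0)]
    image_lift_head1 = rates[(1, 1)] - rates[(1, 0)]

    # Main effect: the average lift of a factor across the levels of the other factor
    main_headline = (headline_lift_img0 + headline_lift_img1) / 2
    main_image = (image_lift_head0 + image_lift_head1) / 2
    # Interaction: how much the headline lift changes when the image changes
    interaction = headline_lift_img1 - headline_lift_img0
    return main_headline, main_image, interaction

# Hypothetical conversion rates, for illustration only
rates = {(0, 0): 0.030, (1, 0): 0.031, (0, 1): 0.029, (1, 1): 0.036}
main_headline, main_image, interaction = effects_2x2(rates)
best_cell = max(rates, key=rates.get)  # (headline, image) with the highest rate
print(main_headline, main_image, interaction, best_cell)
```

In this made-up data, image 1 loses to image 0 when paired with headline 0, yet the combination of headline 1 and image 1 is the best cell. That is the pattern described above: an element that does not win in isolation but excels in combination.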
Use MVT when you suspect interaction effects between elements (headline and image perform differently depending on the combination) and you have enough traffic. For low-traffic sites or independent changes, sequential A/B tests are better.
Instead of testing all combinations (full factorial), a fractional factorial tests a strategic subset. This dramatically reduces the required sample while still detecting main effects and key interactions. The trade-off is that it cannot detect every interaction effect.
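As a rough illustration of a strategic subset, the Python sketch below builds a standard half-fraction of a design in which every factor has 2 variants. It codes the variants as -1 and +1 and keeps only the combinations whose coded levels multiply to +1. The function name half_fraction is illustrative, and this is one textbook construction, not necessarily the one any particular testing tool uses.

```python
import math
from itertools import product

def half_fraction(n_factors):
    """Return half of the 2^n full factorial, with each variant coded as -1 or +1.

    A combination is kept when the product of its coded levels is +1.
    """
    full_factorial = product((-1, 1), repeat=n_factors)
    return [run for run in full_factorial if math.prod(run) == 1]

runs = half_fraction(3)
for run in runs:
    print(run)
# (-1, -1, 1)
# (-1, 1, -1)
# (1, -1, -1)
# (1, 1, 1)
```

Three factors with 2 variants each need 4 cells here instead of 8, and every variant of every factor still appears in exactly half of the cells. In a fraction this small, each main effect is mixed with the interaction of the other two factors, so main effects can only be read cleanly if those interactions are assumed to be negligible.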
For full factorial MVT, 3–4 factors with 2 variants each (8–16 cells) is practical for most e-commerce sites. Beyond that, traffic requirements become prohibitive unless you use fractional factorial designs.
Full factorial MVT detects all interaction effects because every combination is tested. This is its key advantage over sequential A/B tests. An interaction effect occurs when the impact of change A depends on whether change B is also present.
MVT is generally not suitable for low-traffic pages, because it requires traffic proportional to the number of cells. Low-traffic pages should use A/B testing (2 cells) or at most A/B/C testing (3 cells). Reserve MVT for pages with 5,000+ daily visitors.
Run the test until every cell reaches the calculated minimum sample size. This typically takes 2–4 weeks for high-traffic sites. Always run for at least 2 full weeks to capture day-of-week effects and business cycles.
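The stopping rule above combines two conditions: every cell reaches its minimum sample, and the test runs for at least 2 weeks. A minimal Python sketch, assuming traffic is split evenly across cells (run_length_days and MIN_DAYS are illustrative names, not part of the calculator):

```python
import math

MIN_DAYS = 14  # "at least 2 full weeks" from the guidance above

def run_length_days(sample_per_cell, cells, daily_traffic, min_days=MIN_DAYS):
    """Days until every cell reaches its minimum sample, never fewer than min_days.

    Assumes daily traffic is divided evenly across all cells.
    """
    days_for_sample = math.ceil(sample_per_cell * cells / daily_traffic)
    return max(min_days, days_for_sample)

print(run_length_days(38_416, 8, 20_000))  # 16: sample size is the binding constraint
print(run_length_days(1_000, 4, 5_000))    # 14: the 2-week minimum is the binding constraint
```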