A/B Ad Test Sample Size Calculator

Calculate the required sample size for statistically significant A/B ad tests. Determine how many impressions or clicks you need for reliable test results.

Sample per Variant: 13,919
Total Sample Needed: 27,838 (both variants combined)
Est. Test Duration: 28 days at current traffic
Detecting Change: 3% → 3.6% (20% relative lift)
Planning notes, formulas, and examples

About the A/B Ad Test Sample Size Calculator

Running A/B tests on ad creative, landing pages, or bidding strategies without adequate sample sizes leads to false conclusions. A test declared a "winner" with too few impressions may just be random noise. This calculator estimates how many impressions, clicks, or conversions you need for statistically valid results.

The required sample size depends on three key parameters: your baseline conversion rate, the minimum detectable effect (MDE) you want to identify, and the confidence level and statistical power you require. Smaller effects need larger samples. Higher confidence or power needs larger samples. Lower baseline rates need larger samples.

Properly sized A/B tests prevent two costly errors: (1) switching to a "better" ad that's actually no different (false positive), and (2) keeping a weaker ad because the test was too small to detect the improvement (false negative).

Sizing every test this way before launch lets marketing teams trust the results and reallocate budget to the best-performing ads before opportunities are lost.

When This Page Helps

Underpowered A/B tests waste budget and lead to wrong decisions. This calculator ensures your ad tests have enough data for valid conclusions, preventing both false wins and missed improvements.

How to Use the Inputs

  1. Enter your baseline conversion rate (current performance).
  2. Enter the minimum detectable effect (smallest improvement worth detecting).
  3. Set your desired confidence level (typically 95%).
  4. Set statistical power (typically 80%).
  5. View the required sample size per variant.
  6. Estimate test duration based on your daily traffic.
Formula used
n = (Z_α/2 + Z_β)² × 2 × p̄(1 − p̄) ÷ (p₁ − p₂)²

Where:
n = sample size per variant
Z_α/2 = Z-score for confidence (1.96 for 95%)
Z_β = Z-score for power (0.84 for 80%)
p̄ = average of baseline and variant rates
p₁, p₂ = baseline and expected variant rates
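
As a rough illustration, the formula above can be sketched in Python. The function name `sample_size_per_variant` and its parameter names are assumptions made for this example, not the calculator's actual code. It takes exact z-scores from the standard library rather than the rounded 1.96 and 0.84, so the result can differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, confidence=0.95, power=0.80):
    """Sample size per variant for a two-sided test of two proportions.

    Hypothetical helper that mirrors the formula on this page.
    """
    p1 = baseline                                   # baseline rate
    p2 = baseline * (1 + relative_mde)              # expected variant rate
    p_bar = (p1 + p2) / 2                           # average of the two rates
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)            # about 0.84 at 80% power
    n = (z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar) / (p1 - p2) ** 2
    return math.ceil(n)                             # round up to whole samples


if __name__ == "__main__":
    # 3% baseline, 20% relative lift (3% -> 3.6%)
    print(sample_size_per_variant(0.03, 0.20))
```

For a 3% baseline and a 20% relative MDE this returns a value of roughly 13,900 per variant.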

Example Calculation

Result: ~13,900 per variant (about 27,800 total)

With a 3% baseline conversion rate, detecting a 20% relative improvement (3% → 3.6%) at 95% confidence and 80% power requires approximately 13,900 samples per variant, or about 27,800 total: (1.96 + 0.84)² × 2 × 0.033 × 0.967 ÷ 0.006² ≈ 13,900. At 1,000 clicks/day split across both variants, the test would take about 28 days.
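
The duration step is a simple division of the total sample by daily traffic. A minimal sketch, assuming traffic is split evenly across variants; `estimate_test_days` is a hypothetical name, and the 13,919 input is the per-variant figure from the results panel on this page.

```python
import math


def estimate_test_days(sample_per_variant, daily_traffic, variants=2):
    """Days needed when daily traffic is split evenly across all variants."""
    total_sample = sample_per_variant * variants
    return math.ceil(total_sample / daily_traffic)  # partial days count as full days


if __name__ == "__main__":
    # 13,919 per variant, two variants, 1,000 clicks per day in total
    print(estimate_test_days(13_919, 1_000))  # 28
```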

Tips & Best Practices

  • Never end a test early just because one variant looks better; run it until the full sample size is reached.
  • A 20% relative MDE (e.g. 3% → 3.6%) is a practical minimum for most ad tests.
  • Detecting a 5% relative difference takes roughly 16x as much data as detecting a 20% difference, because the required sample grows with 1/MDE².
  • Run tests for at least 1–2 full business cycles (weeks) to account for day-of-week effects.
  • Use conversion events (not clicks) as the success metric for landing page A/B tests.
  • Avoid running more than 2–3 variants simultaneously as each additional variant requires more traffic.
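
Two of the tips above (full weekly cycles, limited variants) can be combined in a small planning sketch. It assumes the per-variant sample size stays the same as variants are added, which ignores any multiple-comparison correction and so gives a lower bound. The function name and the 13,919 input (taken from the results panel on this page) are illustrative.

```python
import math


def full_week_duration(sample_per_variant, variants, daily_traffic):
    """Test length in days, rounded up to whole weeks to cover day-of-week effects."""
    total_sample = sample_per_variant * variants
    days = math.ceil(total_sample / daily_traffic)
    return math.ceil(days / 7) * 7


if __name__ == "__main__":
    for variants in (2, 3, 4):
        days = full_week_duration(13_919, variants, daily_traffic=1_000)
        print(f"{variants} variants: {days} days")
    # 2 variants: 28 days, 3 variants: 42 days, 4 variants: 56 days
```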

Why Sample Size Matters for Ad Testing

Premature test conclusions are one of the most expensive mistakes in paid advertising. Switching to a "winning" ad variant based on insufficient data can actually decrease performance. Properly calculating sample size before running a test ensures valid, actionable results.

The Three Levers of Sample Size

Baseline rate: lower rates need more data (at the same relative MDE, testing a 1% conversion rate needs about 4x the data of a 4% rate). MDE: detecting smaller improvements needs disproportionately more data, since halving the MDE roughly quadruples the sample. Confidence/Power: stricter statistical requirements need more data. Adjust these three to balance precision with practical test duration.
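
To see how strongly each lever moves the result, the sketch below varies one input at a time around a reference test (3% baseline, 20% relative MDE, 95% confidence, 80% power). The helper `sample_size` is a hypothetical implementation of the formula on this page, and the comparison points are chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size(baseline, relative_mde, confidence=0.95, power=0.80):
    """Per-variant sample size from the page's formula (hypothetical helper)."""
    p1, p2 = baseline, baseline * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z_sum = NormalDist().inv_cdf(1 - (1 - confidence) / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z_sum ** 2 * 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2)


reference = sample_size(0.03, 0.20)

# Lever 1: baseline rate, 1% versus 4%, both at a 20% relative MDE.
baseline_ratio = sample_size(0.01, 0.20) / sample_size(0.04, 0.20)
# Lever 2: MDE, halved from 20% to 10% at a 3% baseline.
mde_ratio = sample_size(0.03, 0.10) / reference
# Lever 3: confidence, raised from 95% to 99%.
confidence_ratio = sample_size(0.03, 0.20, confidence=0.99) / reference

print(f"1% vs 4% baseline:      {baseline_ratio:.1f}x")
print(f"10% vs 20% MDE:         {mde_ratio:.1f}x")
print(f"99% vs 95% confidence:  {confidence_ratio:.1f}x")
```

Under these assumptions the baseline and MDE ratios both come out near 4x, while the confidence change costs well under 2x.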

Common Ad Testing Mistakes

The most common mistakes are: ending tests early ("it's already significant at day 3" is not a reason to stop, because early peeking inflates false positives); running too many variants (which splits traffic and extends duration); using clicks instead of conversions (a noisy metric); and not accounting for seasonality (weekend traffic differs from weekday traffic). These mistakes make test results unreliable.
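
The effect of early peeking can be illustrated with a small simulation of A/A tests, where both variants share the same true rate and every "significant" result is a false positive. This is a sketch under assumed settings (3% rate, 200 visitors per variant per day, 14 daily looks, a pooled two-proportion z-test); none of these values come from the calculator, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def is_significant(conv_a, conv_b, n, alpha=0.05):
    """Two-sided pooled z-test for two proportions with n visitors per variant."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False  # no conversions yet, nothing to compare
    z = (conv_a / n - conv_b / n) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def simulate_aa_tests(rate=0.03, daily_visitors=200, days=14, runs=300, seed=1):
    """Return (false-positive rate with daily peeking, rate with one final look)."""
    rng = random.Random(seed)
    peek_hits = final_hits = 0
    for _run in range(runs):
        conv_a = conv_b = n = 0
        ever_significant = False
        significant_now = False
        for _day in range(days):
            conv_a += sum(rng.random() < rate for _visit in range(daily_visitors))
            conv_b += sum(rng.random() < rate for _visit in range(daily_visitors))
            n += daily_visitors
            significant_now = is_significant(conv_a, conv_b, n)
            ever_significant = ever_significant or significant_now
        peek_hits += ever_significant   # would have stopped at some daily look
        final_hits += significant_now   # significant at the planned end only
    return peek_hits / runs, final_hits / runs


if __name__ == "__main__":
    peeking, single_look = simulate_aa_tests()
    print(f"false positives with daily peeking: {peeking:.1%}")
    print(f"false positives with one final look: {single_look:.1%}")
```

Because the final look is also one of the daily looks, the peeking rate can never be lower than the single-look rate; in runs like this it is typically several times higher.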

Practical Test Design

For most ad A/B tests: use 95% confidence, 80% power, and 15–20% relative MDE. This balances statistical rigor with realistic test durations. If you need to test faster, increase MDE (only test bold creative differences) rather than reducing confidence.
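
The advice to raise the MDE rather than lower confidence can be checked numerically. The sketch below compares the two shortcuts against a reference plan at a 3% baseline; `sample_size` is a hypothetical helper implementing the formula on this page, and the 30% MDE and 90% confidence alternatives are illustrative choices.

```python
import math
from statistics import NormalDist


def sample_size(baseline, relative_mde, confidence=0.95, power=0.80):
    """Per-variant sample size from the page's formula (hypothetical helper)."""
    p1, p2 = baseline, baseline * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z_sum = NormalDist().inv_cdf(1 - (1 - confidence) / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z_sum ** 2 * 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2)


reference_plan = sample_size(0.03, 0.20, confidence=0.95)    # recommended defaults
bolder_mde = sample_size(0.03, 0.30, confidence=0.95)        # test bolder differences
lower_confidence = sample_size(0.03, 0.20, confidence=0.90)  # relax confidence instead

print(f"reference (20% MDE, 95%): {reference_plan:,}")
print(f"30% MDE, 95% confidence:  {bolder_mde:,}")
print(f"20% MDE, 90% confidence:  {lower_confidence:,}")
```

With these inputs the larger MDE cuts the required sample by more than the relaxed confidence level does, and it keeps the false-positive rate at 5%.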

Frequently Asked Questions

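  • MDE (minimum detectable effect) is the smallest improvement you want to be able to detect reliably. A 20% relative MDE means that if the true improvement is 20% or more, your test will detect it with at least the chosen power (for example, 80% of the time at 80% power). A smaller MDE requires far more data: halving the MDE roughly quadruples the required sample.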