What is the Bonferroni correction?

It's a method that adjusts the significance threshold when performing multiple statistical tests. You divide your alpha level by the number of tests (α/m), ensuring the probability of at least one false positive stays at or below α across all tests.

Why is Bonferroni considered conservative?

Because it relies on the Bonferroni inequality (the union bound), which holds under any dependence between tests but is rarely tight. When tests are positively correlated, as they often are in practice, the true FWER falls well below the nominal α. The price is reduced power: you may miss real effects.
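The bound behind Bonferroni fits on one line. This is a sketch in notation introduced here rather than taken from the page: A_i denotes the event that test i produces a false positive when each test is run at level α/m.

```latex
\mathrm{FWER}
  = P\!\left(\bigcup_{i=1}^{m} A_i\right)
  \le \sum_{i=1}^{m} P(A_i)
  \le m \cdot \frac{\alpha}{m}
  = \alpha
```

The first inequality is an equality only when the events A_i never occur together. Any overlap between them, which is what positive correlation produces, pushes the true FWER below α.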

What is the difference between Bonferroni and Šidák corrections?

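Both control the FWER. Bonferroni uses α/m, while Šidák uses 1−(1−α)^(1/m). Šidák is slightly less conservative because it solves for the exact threshold under independent tests rather than relying on the Bonferroni inequality. The gap is small in practice: at α = 0.05 the Šidák threshold is never more than about 2.6% larger than the Bonferroni threshold, and the two converge as α shrinks.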
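To see how close the two thresholds are, a minimal Python sketch can compute both side by side. The function names are illustrative and are not part of the calculator.

```python
def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold alpha / m."""
    return alpha / m


def sidak_threshold(alpha: float, m: int) -> float:
    """Per-test threshold 1 - (1 - alpha)^(1/m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


# Compare the two thresholds and their ratio for a few family sizes.
for m in (2, 6, 100):
    b = bonferroni_threshold(0.05, m)
    s = sidak_threshold(0.05, m)
    print(f"m={m:>3}  Bonferroni={b:.6f}  Sidak={s:.6f}  ratio={s / b:.4f}")
```

For m = 6 and α = 0.05 this gives the 0.008333 and 0.008512 thresholds reported by the calculator.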

What is the Holm step-down method?

Holm's method sorts p-values from smallest to largest, then tests each against a progressively less strict threshold: α/m, α/(m−1), α/(m−2), etc. It stops at the first non-significant p-value. It's always at least as powerful as Bonferroni.
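The procedure can be sketched in a few lines of Python. This is an illustrative implementation, not the calculator's own code; it uses the strict comparison p < threshold, matching the "reject if p < this value" rule stated on the page.

```python
def holm_reject(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return Holm reject decisions in the original order of p_values."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank is 0 for the smallest p-value
        threshold = alpha / (m - rank)  # alpha/m, alpha/(m-1), ...
        if p_values[idx] < threshold:
            reject[idx] = True
        else:
            break  # first non-rejection: every larger p-value is retained too
    return reject


# The six p-values used in the calculator example on this page.
p_values = [0.01, 0.04, 0.03, 0.002, 0.06, 0.025]
print(holm_reject(p_values))
```

With the example p-values, only the fourth test (p = 0.002) is rejected. The second-smallest p-value, 0.01, equals its threshold of 0.05/5 and so does not pass the strict comparison.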

When should I NOT use Bonferroni correction?

When you have hundreds or thousands of tests (e.g., genomics), Bonferroni becomes extremely conservative. In those cases, FDR-controlling methods like Benjamini-Hochberg are preferred. Also, pre-planned contrasts in ANOVA don't always require correction.

What is the difference between FWER and FDR?

FWER (Familywise Error Rate) is the probability of making at least one Type I error across all tests. FDR (False Discovery Rate) is the expected proportion of false positives among all rejected hypotheses. Controlling the FDR is a less strict requirement than controlling the FWER, which makes it more appropriate for large-scale testing.
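The practical difference shows up when both kinds of control are applied to the same p-values. The sketch below implements the Benjamini-Hochberg step-up procedure, which the page names but the calculator does not compute. The function name and the choice of q = 0.05 are assumptions made for illustration.

```python
def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Step-up BH procedure; returns reject decisions in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# The six p-values used in the calculator example on this page.
p_values = [0.01, 0.04, 0.03, 0.002, 0.06, 0.025]
bonferroni = [p < 0.05 / len(p_values) for p in p_values]
bh = benjamini_hochberg(p_values, q=0.05)
print(sum(bonferroni), "rejections under Bonferroni (FWER control)")
print(sum(bh), "rejections under Benjamini-Hochberg (FDR control)")
```

On these six p-values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects five. The standard Benjamini-Hochberg guarantee assumes independent or positively dependent tests.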

Bonferroni Correction Calculator

Calculate adjusted significance thresholds for multiple comparisons using Bonferroni, Šidák, and Holm corrections. Compare methods side-by-side with FWER tables.

Method

Number of Comparisons (m)

Significance Level (α)

Individual p-values (comma-separated)

Enter observed p-values for per-test evaluation

Bonferroni Threshold

0.008333

α/m = 0.05/6 — reject if p < this value

Šidák Threshold

0.008512

1 − (1−α)^(1/m) — slightly less conservative than Bonferroni

FWER (Uncorrected)

26.49%

Probability of ≥1 false positive without correction across 6 tests

Tests Significant (Bonferroni)

1 / 6

Number of supplied p-values passing Bonferroni threshold

Tests Significant (Šidák)

1 / 6

Number of supplied p-values passing Šidák threshold

Tests Significant (Uncorrected)

5 / 6

Without any correction — likely inflated

FWER Growth Without Correction

Comparisons	FWER (α=0.05)	Bonferroni α*	Šidák α*
1	5.00%	0.050000	0.050000
2	9.75%	0.025000	0.025321
3	14.26%	0.016667	0.016952
5	22.62%	0.010000	0.010206
10	40.13%	0.005000	0.005116
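The rows of this table follow directly from the formulas, so they are easy to regenerate for other values of m. The sketch below assumes independent tests, as the FWER column does; the helper names are illustrative.

```python
def uncorrected_fwer(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def fwer_rows(alpha: float, comparisons: list[int]) -> list[tuple[int, float, float, float]]:
    """One (m, FWER, Bonferroni threshold, Sidak threshold) tuple per value of m."""
    rows = []
    for m in comparisons:
        fwer = uncorrected_fwer(alpha, m)
        bonferroni = alpha / m
        sidak = 1.0 - (1.0 - alpha) ** (1.0 / m)
        rows.append((m, fwer, bonferroni, sidak))
    return rows


# Same family sizes as the table above, printed tab-separated.
for m, fwer, bonf, sidak in fwer_rows(0.05, [1, 2, 3, 5, 10]):
    print(f"{m}\t{fwer:.2%}\t{bonf:.6f}\t{sidak:.6f}")
```

Passing a longer list, for example `[20, 50, 100]`, shows how quickly the uncorrected FWER approaches 100%.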

Per-Test Results

#	p-value	Uncorrected	Bonferroni	Šidák
1	0.0100	✓ Sig	✗ NS	✗ NS
2	0.0400	✓ Sig	✗ NS	✗ NS
3	0.0300	✓ Sig	✗ NS	✗ NS
4	0.0020	✓ Sig	✓ Sig	✓ Sig
5	0.0600	✗ NS	✗ NS	✗ NS
6	0.0250	✓ Sig	✗ NS	✗ NS

Holm Step-Down Procedure

Rank	Original #	p-value	Threshold	Decision
1	4	0.0020	0.008333	✓ Reject
2	1	0.0100	0.010000	✗ Fail to Reject
3	6	0.0250	0.012500	✗ Fail to Reject
4	3	0.0300	0.016667	✗ Fail to Reject
5	2	0.0400	0.025000	✗ Fail to Reject
6	5	0.0600	0.050000	✗ Fail to Reject

Visual: p-Values vs Thresholds

[Chart: the six supplied p-values (0.0100, 0.0400, 0.0300, 0.0020, 0.0600, 0.0250) plotted against the correction thresholds.]

Dashed red line = Bonferroni threshold (0.008333)

Planning notes, formulas, and examples

About the Bonferroni Correction Calculator

When you perform multiple statistical tests simultaneously, the probability of committing at least one Type I error (false positive) increases dramatically. The Bonferroni correction is the simplest and most widely used remedy: divide your significance level by the number of tests to maintain the desired familywise error rate (FWER).

This calculator computes adjusted significance thresholds using Bonferroni, Šidák, and Holm step-down methods. Enter your original alpha, the number of comparisons, and optionally your individual p-values to see which tests remain significant after correction. A side-by-side comparison table shows how each method performs.

Multiple comparison corrections are essential in genomics (thousands of gene tests), ANOVA post-hoc analyses, clinical trials with multiple endpoints, neuroimaging voxel-wise tests, and any study where many hypotheses are tested simultaneously. Without correction, at least one false positive becomes likely with a few dozen tests and close to certain with thousands. Use the example to see how the same alpha is distributed across tests and why Holm usually retains more power than plain Bonferroni.

When This Page Helps

Performing 20 independent tests at α = 0.05 gives a 64% chance of at least one false positive, even when no real effect exists. Bonferroni correction reduces each test's threshold to α/m, keeping the familywise error rate at or below α. This calculator also shows the less conservative Šidák and Holm alternatives, helping you pick the right balance between controlling false positives and retaining statistical power.

How to Use the Inputs

Enter the number of simultaneous comparisons (m) you're performing.
Set your original significance level alpha (typically 0.05).
Optionally enter comma-separated p-values from your individual tests.
Choose a correction method or select "Compare All Methods" for a side-by-side view.
Review the corrected significance thresholds for Bonferroni and Šidák.
Check the per-test results table to see which p-values survive each correction.
Examine the Holm step-down procedure for a more powerful sequential method.

Formula used

Bonferroni Correction:
  α* = α / m

Šidák Correction:
  α* = 1 − (1 − α)^(1/m)

Familywise Error Rate (uncorrected):
  FWER = 1 − (1 − α)^m

Holm Step-Down:
  Order p-values: p₍₁₎ ≤ p₍₂₎ ≤ … ≤ p₍ₘ₎
  Reject p₍ᵢ₎ if p₍ᵢ₎ < α / (m − i + 1)
  Stop at first non-rejection

Where: m = number of comparisons, α = original significance level
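The Šidák threshold is the FWER formula solved in reverse. As a short derivation: require that m independent tests, each run at level α*, have a familywise error rate of exactly α.

```latex
1 - (1 - \alpha^{*})^{m} = \alpha
\;\Longrightarrow\;
(1 - \alpha^{*})^{m} = 1 - \alpha
\;\Longrightarrow\;
\alpha^{*} = 1 - (1 - \alpha)^{1/m}
```

Bernoulli's inequality gives (1 − α/m)^m ≥ 1 − α, which rearranges to α/m ≤ 1 − (1 − α)^(1/m). The Bonferroni threshold is therefore never larger than the Šidák threshold.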

Example Calculation

Result: Bonferroni threshold: 0.008333; 1 of 6 tests significant

With 6 comparisons and α = 0.05, the Bonferroni-adjusted threshold is 0.05/6 = 0.008333. Only the p-value of 0.002 falls below this threshold, so only one test remains significant after correction. The uncorrected FWER would have been 26.5%.

Tips & Best Practices

Bonferroni is the most conservative of the three methods shown here. If power matters, consider Šidák or Holm (also called Holm-Bonferroni), which are slightly less strict.
The Holm step-down method is uniformly more powerful than Bonferroni and controls the same FWER — it's generally preferred.
If you're more concerned about the false discovery rate (FDR) than FWER, consider the Benjamini-Hochberg procedure instead.
Bonferroni and Šidák give nearly identical results when α/m is small, which covers most practical scenarios.
Bonferroni and Holm control the FWER under any dependence between tests, while the Šidák formula is exact only for independent tests. When tests are strongly positively correlated, all three can be overly conservative.
Always report both uncorrected and corrected results in publications so readers can assess the impact of the correction.

The Multiple Testing Problem

Every time you test a hypothesis at α = 0.05, there's a 5% chance of a false positive. Run 20 independent tests and the probability of at least one false positive is 1 − 0.95²⁰ ≈ 64%. This is the multiple testing problem, and it's pervasive in modern research where data analysis often involves many simultaneous comparisons.
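The 64% figure can also be checked by simulation. The sketch below relies on the fact that a p-value is uniformly distributed on [0, 1) when the null hypothesis is true and the test statistic is continuous. The trial count and seed are arbitrary choices, and only the standard library is used.

```python
import random


def simulate_fwer(m: int, alpha: float, trials: int, seed: int = 0,
                  corrected: bool = False) -> float:
    """Estimate the chance of at least one false positive among m true nulls."""
    rng = random.Random(seed)
    # Use the Bonferroni threshold alpha / m when corrected, plain alpha otherwise.
    threshold = alpha / m if corrected else alpha
    hits = 0
    for _ in range(trials):
        # Each null p-value is an independent uniform draw.
        if any(rng.random() < threshold for _ in range(m)):
            hits += 1
    return hits / trials


print("uncorrected:", simulate_fwer(20, 0.05, trials=20000))
print("Bonferroni: ", simulate_fwer(20, 0.05, trials=20000, corrected=True))
```

The uncorrected estimate should land near 0.64 and the Bonferroni estimate just under 0.05, up to simulation noise.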

Choosing a Correction Method

Bonferroni is the simplest option but sacrifices power. Šidák provides a slight improvement when tests are independent. Holm's step-down procedure is uniformly more powerful than Bonferroni and needs no extra assumptions, so it should generally be preferred when you need FWER control. For large-scale screening (genomics, proteomics, neuroimaging), switch to FDR methods like Benjamini-Hochberg, which allow a controlled proportion of false discoveries rather than trying to eliminate them entirely.

Practical Considerations in Research

Many journals now require multiple comparison corrections for any study reporting more than one primary outcome. Pre-registering your planned analyses helps distinguish exploratory from confirmatory tests. Some researchers advocate adjusting only for the number of primary hypotheses, not secondary or exploratory analyses. The key is transparency: always report how many tests were conducted and what correction was applied.

Sources & Methodology

Last updated: March 8, 2026

