Bonferroni Correction Calculator

Calculate adjusted significance thresholds for multiple comparisons using Bonferroni, Šidák, and Holm corrections. Compare methods side-by-side with FWER tables.

Enter observed p-values for per-test evaluation
Bonferroni Threshold
0.008333
α/m = 0.05/6 — reject if p < this value
Šidák Threshold
0.008512
1 − (1−α)^(1/m) — slightly less conservative than Bonferroni
FWER (Uncorrected)
26.49%
Probability of ≥1 false positive without correction across 6 tests
Tests Significant (Bonferroni)
1 / 6
Number of supplied p-values passing Bonferroni threshold
Tests Significant (Šidák)
1 / 6
Number of supplied p-values passing Šidák threshold
Tests Significant (Uncorrected)
5 / 6
Without any correction — likely inflated

FWER Growth Without Correction

Comparisons   FWER (α=0.05)   Bonferroni α*   Šidák α*
1             5.00%           0.050000        0.050000
2             9.75%           0.025000        0.025321
3             14.26%          0.016667        0.016952
5             22.62%          0.010000        0.010206
10            40.13%          0.005000        0.005116

Per-Test Results

#   p-value   Uncorrected   Bonferroni   Šidák
1   0.0100    ✓ Sig         ✗ NS         ✗ NS
2   0.0400    ✓ Sig         ✗ NS         ✗ NS
3   0.0300    ✓ Sig         ✗ NS         ✗ NS
4   0.0020    ✓ Sig         ✓ Sig        ✓ Sig
5   0.0600    ✗ NS          ✗ NS         ✗ NS
6   0.0250    ✓ Sig         ✗ NS         ✗ NS

Holm Step-Down Procedure

Rank   Original #   p-value   Threshold   Decision
1      4            0.0020    0.008333    ✓ Reject
2      1            0.0100    0.010000    ✗ Fail to Reject
3      6            0.0250    0.012500    ✗ Fail to Reject
4      3            0.0300    0.016667    ✗ Fail to Reject
5      2            0.0400    0.025000    ✗ Fail to Reject
6      5            0.0600    0.050000    ✗ Fail to Reject

Visual: p-Values vs Thresholds

[Chart: p-values for tests #1 to #6 plotted against the Bonferroni threshold, shown as a dashed red line at 0.008333]

About the Bonferroni Correction Calculator

When you perform multiple statistical tests simultaneously, the probability of committing at least one Type I error (false positive) increases dramatically. The Bonferroni correction is the simplest and most widely used remedy: divide your significance level by the number of tests to maintain the desired familywise error rate (FWER).

This calculator computes adjusted significance thresholds using Bonferroni, Šidák, and Holm step-down methods. Enter your original alpha, the number of comparisons, and optionally your individual p-values to see which tests remain significant after correction. A side-by-side comparison table shows how each method performs.

Multiple comparison corrections are essential in genomics (thousands of gene tests), ANOVA post-hoc analyses, clinical trials with multiple endpoints, neuroimaging voxel-wise tests, and any study where many hypotheses are tested simultaneously. Without correction, at least one false positive becomes increasingly likely as the number of tests grows. Use the example to see how the same alpha is split across tests and why Holm retains at least as much power as plain Bonferroni.

When This Page Helps

Performing 20 independent tests at α = 0.05 gives a 64% chance of at least one false positive, even when no real effect exists. The Bonferroni correction lowers each test's threshold to α/m, which keeps the familywise error rate at or below α. This calculator also shows the less conservative Šidák and Holm alternatives, helping you balance control of false positives against statistical power.

How to Use the Inputs

  1. Enter the number of simultaneous comparisons (m) you're performing.
  2. Set your original significance level alpha (typically 0.05).
  3. Optionally enter comma-separated p-values from your individual tests.
  4. Choose a correction method or select "Compare All Methods" for a side-by-side view.
  5. Review the corrected significance thresholds for Bonferroni and Šidák.
  6. Check the per-test results table to see which p-values survive each correction.
  7. Examine the Holm step-down procedure for a more powerful sequential method.
Formula used
Bonferroni Correction: α* = α / m
Šidák Correction: α* = 1 − (1 − α)^(1/m)
Familywise Error Rate (uncorrected): FWER = 1 − (1 − α)^m
Holm Step-Down:
  Order p-values: p₍₁₎ ≤ p₍₂₎ ≤ … ≤ p₍ₘ₎
  Reject p₍ᵢ₎ if p₍ᵢ₎ < α / (m − i + 1)
  Stop at first non-rejection
Where: m = number of comparisons, α = original significance level
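
As a rough illustration, the threshold and FWER formulas can be written as three small Python functions. The function names are hypothetical and are not taken from the calculator itself, and the FWER formula assumes the tests are independent.

```python
def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold alpha / m."""
    return alpha / m


def sidak_threshold(alpha: float, m: int) -> float:
    """Per-test threshold 1 - (1 - alpha)^(1/m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def uncorrected_fwer(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # One row per number of comparisons: m, FWER, Bonferroni, Šidák.
    for m in (1, 2, 3, 5, 10):
        print(
            m,
            f"{uncorrected_fwer(0.05, m):.2%}",
            f"{bonferroni_threshold(0.05, m):.6f}",
            f"{sidak_threshold(0.05, m):.6f}",
        )
```

Run as a script, this should print one row per value of m in the same column order as the FWER growth table.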

Example Calculation

Result: Bonferroni threshold: 0.008333; 1 of 6 tests significant

With 6 comparisons and α = 0.05, the Bonferroni-adjusted threshold is 0.05/6 = 0.008333. Only the p-value of 0.002 falls below this threshold, so only one test remains significant after correction. The uncorrected FWER would have been 26.5%.
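
The example can be checked with a short script. This is a minimal sketch: the six p-values are the ones shown in the per-test results on this page, and the helper name `count_significant` is made up for illustration.

```python
ALPHA = 0.05
# p-values for tests #1 to #6, as listed in the per-test results.
P_VALUES = [0.010, 0.040, 0.030, 0.002, 0.060, 0.025]


def count_significant(p_values, threshold):
    """Count p-values strictly below the threshold."""
    return sum(p < threshold for p in p_values)


m = len(P_VALUES)
bonferroni = ALPHA / m                   # 0.05 / 6
sidak = 1 - (1 - ALPHA) ** (1 / m)       # 1 - 0.95^(1/6)

n_uncorrected = count_significant(P_VALUES, ALPHA)
n_bonferroni = count_significant(P_VALUES, bonferroni)
n_sidak = count_significant(P_VALUES, sidak)

print(f"Bonferroni threshold: {bonferroni:.6f}, significant: {n_bonferroni} / {m}")
print(f"Šidák threshold:      {sidak:.6f}, significant: {n_sidak} / {m}")
print(f"Uncorrected (α = {ALPHA}): significant: {n_uncorrected} / {m}")
```

Only the p-value of 0.002 is below either corrected threshold, while five of the six are below the uncorrected 0.05.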

Tips & Best Practices

  • Bonferroni is the most conservative correction — if power matters, consider Šidák or Holm-Bonferroni, which are slightly less strict.
  • The Holm step-down method is uniformly more powerful than Bonferroni and controls the same FWER — it's generally preferred.
  • If you're more concerned about the false discovery rate (FDR) than FWER, consider the Benjamini-Hochberg procedure instead.
  • Bonferroni and Šidák give nearly identical results when α/m is small, which is the case in most practical scenarios.
  • Bonferroni and Holm control the FWER under any dependence between tests; Šidák's formula assumes independent tests. When tests are strongly positively correlated, Bonferroni can be overly conservative.
  • Always report both uncorrected and corrected results in publications so readers can assess the impact of the correction.
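
The Holm step-down rule mentioned in these tips can be sketched in Python as follows. The function name `holm_step_down` is hypothetical, and the sketch uses the strict inequality p < α / (m − i + 1) from this page's formula; other presentations often use ≤ instead.

```python
def holm_step_down(p_values, alpha=0.05):
    """Return a list of booleans (True = reject), in the original test order."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        threshold = alpha / (m - rank + 1)
        if p_values[idx] < threshold:
            reject[idx] = True
        else:
            # First non-rejection: this and all larger p-values are retained.
            break
    return reject


if __name__ == "__main__":
    p = [0.010, 0.040, 0.030, 0.002, 0.060, 0.025]
    for number, (p_value, rejected) in enumerate(zip(p, holm_step_down(p)), start=1):
        print(number, p_value, "Reject" if rejected else "Fail to Reject")
```

With the six p-values from this page, only test #4 is rejected. Test #1 (p = 0.0100) sits exactly on the rank-2 threshold of 0.05/5, so the strict inequality retains it; a comparison at an exact tie like this can be sensitive to floating-point rounding.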

The Multiple Testing Problem

Every time you test a hypothesis at α = 0.05, there's a 5% chance of a false positive. Run 20 independent tests and the probability of at least one false positive is 1 − 0.95²⁰ ≈ 64%. This is the multiple testing problem, and it's pervasive in modern research where data analysis often involves many simultaneous comparisons.
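
The 64% figure can also be approximated by simulation. The sketch below is illustrative only: it assumes every null hypothesis is true and the tests are independent, so each p-value is drawn uniformly from [0, 1). The function name, trial count, and seed are arbitrary choices.

```python
import random


def simulate_fwer(m=20, alpha=0.05, trials=20_000, seed=0):
    """Estimate the chance of at least one p < alpha among m true-null tests."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Under a true null hypothesis, a p-value is uniform on [0, 1).
        if any(rng.random() < alpha for _ in range(m)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("uncorrected:", simulate_fwer(m=20, alpha=0.05))
    print("Bonferroni: ", simulate_fwer(m=20, alpha=0.05 / 20))
    print("theory:     ", 1 - 0.95 ** 20)
```

The uncorrected estimate should land close to 1 − 0.95²⁰, while using the Bonferroni threshold α/m per test should bring the estimate down to just under 0.05.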

Choosing a Correction Method

Bonferroni is the simplest option but sacrifices power. Šidák provides a slight improvement when tests are independent. Holm's step-down procedure is uniformly more powerful than Bonferroni and should generally be preferred when you need FWER control. For large-scale screening (genomics, proteomics, neuroimaging), switch to FDR methods like Benjamini-Hochberg, which control the expected proportion of false discoveries among the rejected hypotheses rather than the probability of any false positive at all.
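
For comparison, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is not one of the methods this calculator computes; the function name and the parameter `q` (the target false discovery rate) are assumptions made for illustration, and the procedure is usually stated for independent or positively dependent tests.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans (True = reject), in the original test order."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values, even if some earlier rank missed its cutoff.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    p = [0.010, 0.040, 0.030, 0.002, 0.060, 0.025]
    print(benjamini_hochberg(p, q=0.05))
```

On the six p-values from this page, the sketch rejects every test except #5 (p = 0.06), compared with a single rejection under Bonferroni or Holm. That gap reflects the trade-off described above: FDR control tolerates some false discoveries in exchange for more power.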

Practical Considerations in Research

Many journals now require multiple comparison corrections for any study reporting more than one primary outcome. Pre-registering your planned analyses helps distinguish exploratory from confirmatory tests. Some researchers advocate adjusting only for the number of primary hypotheses, not secondary or exploratory analyses. The key is transparency: always report how many tests were conducted and what correction was applied.

Frequently Asked Questions

  • The Bonferroni correction is a method that adjusts the significance threshold when performing multiple statistical tests. You divide your alpha level by the number of tests (α/m), ensuring the probability of at least one false positive stays at or below α across all tests.