Wilcoxon Rank-Sum Test Calculator

Perform the Wilcoxon rank-sum (Mann-Whitney U) test for non-parametric comparison of two groups. Get U statistic, z-score, p-value, effect size, and rank table.

U Statistic: 10 (U₁ = 90, U₂ = 10; min = 10)
Z Approximation: 3.0237 (normal approximation: (U₁ − μ_U) / σ_U)
p-Value: 0.0025 (significant at α = 0.05)
Decision: Reject H₀ (distributions differ significantly)
Effect Size (r): 0.6761 (large)
Hodges-Lehmann: 2.0000 (median pairwise difference; robust location shift estimate)

Group Summary

Group     n    Median   Rank Sum   Mean Rank
Group 1   10   7.00     145.0      14.50
Group 2   10   5.00     65.0       6.50

Rank Details

Value   Group     Rank
3       Group 2   1.5
3       Group 2   1.5
4       Group 2   3.5
4       Group 2   3.5
5       Group 1   6.5
5       Group 2   6.5
5       Group 2   6.5
5       Group 2   6.5
6       Group 1   10.5
6       Group 1   10.5
6       Group 2   10.5
6       Group 2   10.5
7       Group 1   14.5
7       Group 1   14.5
7       Group 1   14.5
7       Group 2   14.5
8       Group 1   18
8       Group 1   18
8       Group 1   18
9       Group 1   20

Visual: Rank Distribution

Mean rank: Group 1 = 14.5, expected = 10.5, Group 2 = 6.5
Planning notes, formulas, and examples

About the Wilcoxon Rank-Sum Test Calculator

The Wilcoxon rank-sum test (also called the Mann-Whitney U test) is the non-parametric alternative to the independent two-sample t-test. It compares two groups without assuming normality, instead working with the ranks of the combined data to test whether one group tends to produce larger values than the other.

This calculator takes raw data from two groups, computes the Mann-Whitney U statistic, performs a z-approximation for the p-value, and provides effect sizes and the Hodges-Lehmann median difference estimator. A complete rank table shows every observation's assigned rank.

The Wilcoxon rank-sum test is ideal when data is ordinal (Likert scales, rankings), heavily skewed, contains outliers, or comes from small samples where normality cannot be verified. It's widely used in biomedical research, psychology, ecology, and quality control.

When This Page Helps

The t-test can be sensitive to skew and outliers, while the Wilcoxon rank-sum test works on ranks and is better suited to ordinal data or distributions that are difficult to treat as normal. This calculator automates the ranking, tie handling, and p-value step so the result is easier to inspect.

How to Use the Inputs

  1. Enter comma-separated data values for Group 1 and Group 2.
  2. Or click a preset to load example data.
  3. Select the tail direction: two-tailed, right-tailed, or left-tailed.
  4. Set your significance level alpha.
  5. Review the U statistic, z-approximation, and p-value.
  6. Check the effect size (r) and Hodges-Lehmann median difference.
  7. Examine the rank table to see how observations were ranked.
Formula used

Mann-Whitney U statistic (R₁ and R₂ are the rank sums of Group 1 and Group 2):
  U₁ = R₁ − n₁(n₁+1)/2
  U₂ = R₂ − n₂(n₂+1)/2
  U = min(U₁, U₂)

Z approximation (for large samples):
  z = (U₁ − μ_U) / σ_U
  μ_U = n₁n₂/2
  σ_U = √(n₁n₂(N+1)/12), where N = n₁ + n₂

Effect size:
  r = |z| / √N

Hodges-Lehmann estimator:
  Median of all n₁×n₂ pairwise differences
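
The formulas above can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual source: the function names `average_ranks` and `rank_sum_test` are hypothetical, the two data lists are read back from the rank table on this page, and σ_U is computed without a tie correction, exactly as the formula is written.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last element of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(group1, group2):
    """Compute U1, U2, U, z, two-tailed p and r (no tie correction)."""
    n1, n2 = len(group1), len(group2)
    N = n1 + n2
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])           # rank sum of Group 1
    r2 = sum(ranks[n1:])           # rank sum of Group 2
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (N + 1) / 12)
    z = (u1 - mu) / sigma
    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return {
        "u1": u1, "u2": u2, "u": min(u1, u2),
        "z": z, "p": p_two_tailed, "r": abs(z) / math.sqrt(N),
    }

# Values taken from the rank table on this page.
group1 = [5, 6, 6, 7, 7, 7, 8, 8, 8, 9]
group2 = [3, 3, 4, 4, 5, 5, 5, 6, 6, 7]

result = rank_sum_test(group1, group2)
print(f"U1={result['u1']}, U2={result['u2']}, U={result['u']}")
print(f"z={result['z']:.4f}, p={result['p']:.4f}, r={result['r']:.4f}")
```

For this data the sketch reproduces the figures in the results panel: U₁ = 90, U₂ = 10, z ≈ 3.0237, p ≈ 0.0025 and r ≈ 0.6761.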

Example Calculation

Result: U = 10, z = 3.02, p = 0.0025

For the sample data shown in the results above, Group 1 (median = 7) has significantly higher ranks than Group 2 (median = 5). U₁ = 90 and U₂ = 10 give U = 10, and z = 3.02 gives p = 0.0025 (two-tailed), indicating a statistically significant difference at α = 0.05. The effect size r = 0.68 suggests a large effect. The Hodges-Lehmann estimate of the location shift is 2.0.

Tips & Best Practices

  • Ties are handled by assigning average ranks. Many ties reduce the test's discriminating power.
  • The z-approximation is reliable for sample sizes n₁, n₂ ≥ 10. For very small samples, use exact critical values.
  • Unlike the t-test, the Wilcoxon test is not testing means — it tests whether one distribution is stochastically greater than the other.
  • The Hodges-Lehmann estimator provides a robust estimate of the median shift between groups.
  • Effect size r follows Cohen's benchmarks: 0.1 = small, 0.3 = medium, 0.5 = large.
  • This test is sometimes incorrectly called a "test of medians" — it actually tests the full distributions, not just their centers.

Rank-Based Testing Philosophy

Rank-based tests replace raw observations with their ranks in the combined sample, making the analysis robust to outliers and to departures from distributional assumptions. If the largest value in a sample changes from 100 to 10,000, its rank does not change at all, and however extreme a value becomes, its rank can never exceed N. This robustness comes at a small cost in power: when the data truly are normal, the asymptotic relative efficiency of the Wilcoxon test compared with the t-test is 3/π ≈ 95.5%.
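
A small illustration of that robustness, using made-up numbers (the sample values and the helper name `simple_ranks` are assumptions for this sketch, not part of the calculator): inflating the largest observation moves the mean a great deal but leaves every rank where it was.

```python
def simple_ranks(values):
    """1-based ranks for distinct values; the smallest value gets rank 1."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

# Hypothetical sample; the second list replaces the largest value with an outlier.
sample = [12, 35, 47, 58, 100]
with_outlier = [12, 35, 47, 58, 10_000]

ranks_before = simple_ranks(sample)
ranks_after = simple_ranks(with_outlier)
mean_before = sum(sample) / len(sample)
mean_after = sum(with_outlier) / len(with_outlier)

print("ranks before:", ranks_before)
print("ranks after: ", ranks_after)
print("mean before:", mean_before, "mean after:", mean_after)
```

Because the test statistic depends only on the ranks, the two samples would contribute identically to a rank-sum test even though their means differ by a factor of about 40.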

Handling Ties in Rank Data

Tied observations receive the average rank. For example, if observations at positions 3, 4, and 5 all share the same value, each receives rank 4. When there are many ties, a correction factor adjusts the variance of the U statistic. With discrete data (like Likert scales), ties are common and the correction becomes important.
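
As a sketch of what such a correction looks like, the commonly used textbook form replaces (N+1) in the variance with (N+1) − Σ(t³ − t) / (N(N−1)), where t is the size of each group of tied values. The function name `tie_corrected_sigma` is hypothetical, and this page does not state whether the calculator applies the correction; the z of 3.0237 in the results panel matches the uncorrected σ_U.

```python
import math
from collections import Counter

def tie_corrected_sigma(group1, group2):
    """Standard deviation of U under H0 with the usual tie correction."""
    n1, n2 = len(group1), len(group2)
    N = n1 + n2
    # Size of each group of tied values in the pooled sample (1 if untied).
    tie_sizes = Counter(list(group1) + list(group2)).values()
    tie_term = sum(t**3 - t for t in tie_sizes)  # untied values contribute 0
    variance = n1 * n2 / 12 * ((N + 1) - tie_term / (N * (N - 1)))
    return math.sqrt(variance)

# Values taken from the rank table on this page.
group1 = [5, 6, 6, 7, 7, 7, 8, 8, 8, 9]
group2 = [3, 3, 4, 4, 5, 5, 5, 6, 6, 7]

sigma_uncorrected = math.sqrt(10 * 10 * 21 / 12)
sigma_corrected = tie_corrected_sigma(group1, group2)
z_corrected = (90 - 50) / sigma_corrected  # U1 = 90, mu_U = 50 for this data

print(f"sigma without correction: {sigma_uncorrected:.4f}")
print(f"sigma with correction:    {sigma_corrected:.4f}")
print(f"z with correction:        {z_corrected:.4f}")
```

For the sample data the correction shrinks σ_U from about 13.23 to about 13.05, which raises z from about 3.02 to about 3.07. With no ties at all the tie term is zero and the two formulas agree.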

Interpreting the Hodges-Lehmann Estimator

The Hodges-Lehmann estimator is the median of all pairwise differences d = x₁ᵢ − x₂ⱼ. It estimates the shift in location between the two distributions. Unlike the mean difference, it's resistant to outliers. A confidence interval for this estimator can be constructed using the distribution of U, providing a non-parametric analog to the confidence interval from a t-test.
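
The point estimate itself is short enough to sketch directly. This is a brute-force illustration with a hypothetical function name, `hodges_lehmann`; it forms all n₁×n₂ differences, which is fine for small samples but grows quadratically with sample size. The confidence interval is not shown.

```python
from statistics import median

def hodges_lehmann(group1, group2):
    """Median of all pairwise differences x1_i - x2_j."""
    differences = [x - y for x in group1 for y in group2]
    return median(differences)

# Values taken from the rank table on this page.
group1 = [5, 6, 6, 7, 7, 7, 8, 8, 8, 9]
group2 = [3, 3, 4, 4, 5, 5, 5, 6, 6, 7]

shift = hodges_lehmann(group1, group2)
print("Hodges-Lehmann estimate:", shift)
```

For the sample data this gives 2.0, matching the results panel. Note that it equals the difference of the two group medians (7 − 5) here, but the two quantities do not coincide in general.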

Frequently Asked Questions

  • The Wilcoxon rank-sum test and the Mann-Whitney U test are the same test with different formulations. Wilcoxon uses rank sums directly; Mann-Whitney uses U statistics. They always give the same p-value, and the names are used interchangeably.