Pearson Correlation Calculator

Calculate Pearson's r with step-by-step computation, Fisher z confidence intervals, t-test significance, covariance, and full deviation table.

About the Pearson Correlation Calculator

Pearson's r measures the strength and direction of a linear relationship between two variables. This calculator shows the full computation path instead of only the final coefficient, so you can inspect the deviations, cross-products, and sums of squares that build the result.

Alongside r, the page also reports R², a t-test for significance, covariance, and Fisher z confidence intervals. That makes it useful when you need both the coefficient and the uncertainty around it.

Preset datasets cover classic positive and negative relationships so the calculation steps are easy to compare against a known pattern.

Why Use This Pearson Correlation Calculator?

Pearson correlation is often the first pass when you want to know whether a relationship is strong enough to matter and straight enough to model. Showing the arithmetic step by step makes it easier to audit the answer and explain where it came from.

The Fisher z interval and significance test add the uncertainty that a single r value cannot show on its own.

How to Use This Calculator

  1. Enter paired X and Y values (comma-separated, same count).
  2. Or click a preset to load example relationships.
  3. Select your significance level α (0.01, 0.05, or 0.10).
  4. Review Pearson r, R², and confidence interval.
  5. Check the t-statistic and significance result.
  6. Study the computation steps table to verify the math.
  7. Use the interpretation guide for your specific r value.
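
The input rule in step 1 (comma-separated values, same count in both lists) can be sketched as a small parser. This is only an illustration: `parse_pairs` is a hypothetical helper name, and the minimum of four pairs is an assumption chosen so that the Fisher z standard error 1/√(n−3) is defined. It is not necessarily how the calculator validates input.

```python
def parse_pairs(x_text, y_text):
    """Parse two comma-separated strings into equal-length lists of floats.

    Hypothetical helper for illustration; the page's real validation may differ.
    """
    xs = [float(tok) for tok in x_text.split(",") if tok.strip()]
    ys = [float(tok) for tok in y_text.split(",") if tok.strip()]
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    if len(xs) < 4:
        # Assumption: n > 3 is needed for SE_z = 1/sqrt(n - 3) to be defined.
        raise ValueError("need at least 4 pairs")
    return xs, ys


xs, ys = parse_pairs("1, 2, 3, 4", "2,4,6,8")
print(xs, ys)
```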

Formula

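r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √[Σ(xᵢ−x̄)²·Σ(yᵢ−ȳ)²]

t = r√(n−2) / √(1−r²), with df = n−2

Fisher z = ½·ln((1+r)/(1−r)), with SE_z = 1/√(n−3)

Confidence interval for ρ: tanh(z ± z_crit·SE_z), where z_crit is the standard normal critical value for the chosen α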
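
The formulas above can be followed in a few lines of standard-library Python. This is a minimal sketch, not the calculator's actual code: the function name `pearson_summary` and the small dataset are made up for illustration, and the interval is obtained by applying tanh to the z-scale limits, which is the usual back-transform for Fisher z. It assumes |r| < 1, since t and z are undefined at exactly ±1.

```python
import math
from statistics import NormalDist


def pearson_summary(x, y, alpha=0.05):
    """Pearson r, covariance, t statistic, and Fisher z interval (assumes |r| < 1, n > 3)."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length lists with at least 4 pairs")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # sum of cross-products
    sxx = sum((a - mx) ** 2 for a in x)                    # sum of squares, X
    syy = sum((b - my) ** 2 for b in y)                    # sum of squares, Y
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)        # df = n - 2
    z = math.atanh(r)                                      # same as 0.5 * ln((1+r)/(1-r))
    se_z = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (math.tanh(z - z_crit * se_z), math.tanh(z + z_crit * se_z))
    return {"r": r, "r2": r * r, "cov": sxy / (n - 1), "t": t, "df": n - 2, "ci": ci}


# Illustrative data only (not one of the page's presets).
result = pearson_summary([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(result)
```

The p-value for t needs the Student t distribution, which the standard library does not provide, so it is left out of this sketch.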

Example Calculation

Result: r = 0.9972, R² = 0.9945, t = 35.58 (p < 0.001), 95% CI [0.9872, 0.9994]

Height and weight show a very strong positive linear correlation: 99.45% of the variation in weight is linearly associated with height. The Fisher z interval indicates, with 95% confidence, that the population correlation ρ lies between 0.987 and 0.999.

Tips & Best Practices

Deriving Pearson's r

Pearson's r is the ratio of the covariance to the product of the standard deviations: r = Cov(X,Y)/(SD_X · SD_Y). With the sample covariance Cov(X,Y) = Σ(xᵢ−x̄)(yᵢ−ȳ)/(n−1) and sample standard deviations that also divide by n−1, the (n−1) factors cancel and we get the familiar formula. The denominator normalizes the covariance to the [−1, +1] range regardless of variable scales.

If all points fall exactly on a line with positive slope, each yᵢ−ȳ is a positive multiple of xᵢ−x̄, so every cross-product (xᵢ−x̄)(yᵢ−ȳ) is non-negative, the numerator equals the denominator, and r = +1. If the line has negative slope, the cross-products are all non-positive, giving r = −1. Scattered points produce a mix of positive and negative terms that partially cancel, yielding |r| < 1.
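
As a quick check of the derivation, the sketch below computes r as covariance divided by the product of standard deviations and confirms that exact lines give ±1 and that rescaling a variable leaves r unchanged. The function name `r_from_covariance` and the data are illustrative assumptions.

```python
import math
from statistics import mean, stdev


def r_from_covariance(x, y):
    """r = Cov(X, Y) / (SD_X * SD_Y), using n - 1 in both covariance and SDs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


x = [1, 2, 3, 4, 5]
line_up = [2 * v + 1 for v in x]      # exact line, positive slope
line_down = [-3 * v + 10 for v in x]  # exact line, negative slope
noisy = [2, 1, 4, 3, 5]               # scattered points

print(r_from_covariance(x, line_up))    # +1 up to floating-point rounding
print(r_from_covariance(x, line_down))  # -1 up to floating-point rounding
print(r_from_covariance(x, noisy))      # strictly between -1 and +1

# Changing the units of X (here, multiplying by 100) does not change r.
x_scaled = [100 * v for v in x]
print(math.isclose(r_from_covariance(x, noisy), r_from_covariance(x_scaled, noisy)))
```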

Hypothesis Testing for ρ

The null hypothesis H₀: ρ = 0 is tested using t = r√(n−2)/√(1−r²) with n−2 degrees of freedom. Reject H₀ when |t| > t_critical. For testing H₀: ρ = ρ₀ (some non-zero value), convert to z-scores: z = (z_r − z_ρ₀) / SE_z, where z_r = Fisher transform of r and SE_z = 1/√(n−3).
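
The non-zero null case can be sketched as follows. The function name `fisher_z_test` and the example numbers (r = 0.8, n = 28, ρ₀ = 0.5) are hypothetical, and the two-sided p-value comes from the standard normal approximation that the Fisher transform relies on.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n, rho0):
    """Two-sided z-test of H0: rho = rho0 using the Fisher transform."""
    z_r = math.atanh(r)        # Fisher transform of the sample r
    z_rho0 = math.atanh(rho0)  # Fisher transform of the hypothesized rho
    se_z = 1 / math.sqrt(n - 3)
    z = (z_r - z_rho0) / se_z
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical example: is an observed r = 0.8 from n = 28 pairs compatible with rho = 0.5?
z, p = fisher_z_test(0.8, 28, 0.5)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With these inputs SE_z is 0.2, so z is the difference of the two transformed values divided by 0.2.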

Effect Size Interpretation

In psychology and social sciences, Cohen's guidelines classify r = 0.10 as small, 0.30 as medium, 0.50 as large. In medical research, r = 0.30 might be clinically meaningful. In physics, r < 0.99 might indicate measurement error. Always interpret r in context, not by universal cutoffs.

Frequently Asked Questions

What is Pearson's r and what is its range?

Pearson's r measures the strength and direction of linear association. It ranges from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). Only linear relationships are captured: a parabola sampled symmetrically around its vertex gives r ≈ 0 even though y is completely determined by x.
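
The parabola case is easy to reproduce. The sketch below uses made-up x values; it also includes a one-sided sample of the same curve for contrast, because r is near zero only when the x values are spread evenly on both sides of the vertex.

```python
import math


def pearson(x, y):
    """Plain Pearson r from sums of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


symmetric_x = [-3, -2, -1, 0, 1, 2, 3]  # centered on the vertex of y = x^2
one_sided_x = [0, 1, 2, 3, 4, 5, 6]     # only the rising branch

r_symmetric = pearson(symmetric_x, [v * v for v in symmetric_x])
r_one_sided = pearson(one_sided_x, [v * v for v in one_sided_x])

print(r_symmetric)  # 0: positive and negative cross-products cancel
print(r_one_sided)  # close to 1: the rising branch looks nearly linear
```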

What's the Fisher z-transform for?

Raw r has a skewed sampling distribution, especially near ±1. Fisher's z-transform maps r to a value z whose sampling distribution is approximately normal with standard error 1/√(n−3), enabling accurate confidence intervals and hypothesis tests for the population correlation ρ.

What assumptions does Pearson's r require?

Ideally: (1) both variables are continuous, (2) the relationship is linear, (3) there are no extreme outliers, and (4) the data are bivariate normal if you rely on the significance tests. As a rule of thumb, the tests are reasonably robust to mild violations of normality when n ≥ 30.

How does Pearson differ from Spearman?

Pearson measures linear association using raw values. Spearman measures monotonic association using ranks. For linear relationships, both give similar results. For nonlinear monotonic relationships (log, exponential), Spearman's coefficient is typically larger in magnitude because ranks ignore the curvature.
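
To see the difference on a monotonic but nonlinear relationship, the sketch below computes Spearman's coefficient as Pearson's r applied to ranks. The `ranks` helper assumes there are no tied values (a full implementation would average tied ranks), and the exponential data is purely illustrative.

```python
import math


def pearson(x, y):
    """Plain Pearson r from sums of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def spearman(x, y):
    """Spearman's coefficient as Pearson's r on the ranks."""
    return pearson(ranks(x), ranks(y))


x = list(range(1, 11))
y = [math.exp(v) for v in x]  # strictly increasing, strongly curved

print(f"Pearson  = {pearson(x, y):.3f}")   # noticeably below 1
print(f"Spearman = {spearman(x, y):.3f}")  # 1.000, the ranks agree exactly
```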

Why does my confidence interval seem so wide?

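CI width depends on n and |r|. With n = 10 and r = 0.50, the 95% Fisher z interval is roughly [−0.19, 0.86], which includes zero. At n = 40 the same r gives roughly [0.22, 0.70], so plan on about n = 40 for useful CIs with moderate correlations. Larger samples give narrower CIs.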
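
The effect of sample size on interval width can be explored directly. In this sketch the function name `fisher_ci` and the list of sample sizes are arbitrary choices for illustration; the interval uses the Fisher z method with a standard normal critical value.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Fisher z confidence interval for rho (requires n > 3 and |r| < 1)."""
    z = math.atanh(r)
    se_z = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.tanh(z - z_crit * se_z), math.tanh(z + z_crit * se_z)


# Same sample correlation, increasing sample size.
for n in (10, 20, 40, 100, 400):
    lo, hi = fisher_ci(0.50, n)
    print(f"n = {n:>3}: [{lo:+.2f}, {hi:+.2f}]  width = {hi - lo:.2f}")
```

Because SE_z shrinks with 1/√(n−3), halving the width on the z scale takes roughly four times as many pairs.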

Can one outlier change r dramatically?

Yes. A single extreme point can inflate r from 0.1 to 0.8 or deflate it from 0.9 to 0.3. Always check for outliers before interpreting Pearson's r, or use Spearman's rank correlation instead.
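
The sensitivity to a single point can be demonstrated with a small made-up dataset: ten weakly related pairs, then the same pairs plus one extreme point. The numbers are hypothetical and chosen only to make the effect visible.

```python
import math


def pearson(x, y):
    """Plain Pearson r from sums of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: ten points with almost no linear trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5, 3, 6, 4, 5, 6, 3, 5, 4, 6]
r_before = pearson(x, y)

# Add one point far from the rest in both X and Y.
r_after = pearson(x + [30], y + [30])

print(f"without outlier: r = {r_before:.2f}")  # roughly 0.14
print(f"with outlier:    r = {r_after:.2f}")   # roughly 0.93
```

The extreme point dominates both the cross-product sum and the sums of squares, so the coefficient mostly reflects the line from the cluster to that one point.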
