Linear Regression Calculator

Calculate the best-fit line (OLS), slope, intercept, R², correlation, residuals, and prediction confidence intervals from X/Y data points.

X values: e.g., 1, 2, 3, 4, 5
Y values: e.g., 2.1, 4.0, 5.8, 8.1, 9.9
Regression Equation: Y = 2.0000·X + 0.0000 (best-fit line, OLS)
Slope (b₁): 2.000000 (each unit increase in X → 2.0000 change in Y)
Intercept (b₀): 0.000000 (predicted Y when X = 0)
R² (Coefficient of Determination): 0.999049 (99.90% of variance in Y explained by X)
Correlation (r): 0.999524 (Strong)
Standard Error: 0.163299 (average prediction error, residual std dev)
Prediction at X = 10: 20.0000 (95% CI: [19.57, 20.43])
Data Points: 8 (Mean X: 4.50, Mean Y: 9.00)

Residuals Table

X      Y (actual)   Y (predicted)   Residual
1.00    2.1000       2.0000          0.1000
2.00    4.0000       4.0000          0.0000
3.00    5.8000       6.0000         -0.2000
4.00    8.1000       8.0000          0.1000
5.00    9.9000      10.0000         -0.1000
6.00   12.2000      12.0000          0.2000
7.00   13.8000      14.0000         -0.2000
8.00   16.1000      16.0000          0.1000

R² Interpretation Guide

R² Range      Strength      Interpretation                        Your Data
0.90 – 1.00   Very Strong   Excellent fit, highly predictive      ✓ R² = 0.9990
0.70 – 0.89   Strong        Good fit, reliable predictions
0.50 – 0.69   Moderate      Useful but with notable error
0.30 – 0.49   Weak          Some relation, poor predictions
0.00 – 0.29   Very Weak     Essentially no linear relationship
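
The guide above amounts to a simple threshold lookup. A minimal sketch follows; `r2_strength` is a hypothetical helper, and it assumes each band's lower bound is inclusive, since the guide does not say where a value such as 0.895 belongs.

```python
def r2_strength(r2):
    """Map an R² value to the strength label used in the guide."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("R² must be between 0 and 1")
    # Assumption: lower bounds are inclusive, so 0.895 is labeled "Strong".
    bands = [
        (0.90, "Very Strong"),
        (0.70, "Strong"),
        (0.50, "Moderate"),
        (0.30, "Weak"),
    ]
    for lower, label in bands:
        if r2 >= lower:
            return label
    return "Very Weak"


print(r2_strength(0.999049))
```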
Planning notes, formulas, and examples

About the Linear Regression Calculator

Linear regression is the foundation of predictive analytics — and our calculator makes it accessible to everyone. Enter your X and Y data points, and quickly get the best-fit regression line, slope, intercept, R² (coefficient of determination), Pearson correlation, standard error, and full residual analysis.

The calculator uses ordinary least squares (OLS), which chooses the straight line that minimizes the sum of squared residuals. Beyond the equation, it provides a prediction tool: enter any X value to get the predicted Y with a confidence interval at the 90%, 95%, or 99% level.
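
The page does not state how the interval around a prediction is computed. The sketch below is one possibility, not the calculator's documented method: it assumes a prediction interval for a single new observation with a normal (z) critical value. `prediction_interval` is a hypothetical name. With the eight points from the residuals table and X = 10, this assumption gives about [19.57, 20.43], which agrees with the 95% CI shown in the sample output.

```python
from math import sqrt
from statistics import NormalDist


def prediction_interval(xs, ys, x0, level=0.95):
    """Return (y_hat, lower, upper) for a new observation at x0."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean

    # Residual standard error with n - 2 degrees of freedom.
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se = sqrt(ss_res / (n - 2))

    # Assumption: normal critical value (about 1.96 at the 95% level).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * se * sqrt(1 + 1 / n + (x0 - x_mean) ** 2 / sxx)

    y_hat = b0 + b1 * x0
    return y_hat, y_hat - half_width, y_hat + half_width


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 4.0, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1]
y_hat, lower, upper = prediction_interval(xs, ys, 10)
print(f"{y_hat:.4f}  95% interval: [{lower:.2f}, {upper:.2f}]")
```

For small samples, a Student t critical value with n − 2 degrees of freedom is the more conservative choice and yields a wider interval than the z value used here.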

Preset datasets let you explore real-world relationships — study hours vs grades, age vs income, ad spend vs revenue. The residuals table shows how far each data point deviates from the line, and the R² interpretation guide helps you assess the practical strength of your model. It is a quick way to move from raw paired data to a readable model summary without leaving basic statistical diagnostics behind.

When This Page Helps

Understanding relationships between variables is central to data-driven decision-making. Does increased ad spend actually boost revenue? Do more study hours improve grades? Linear regression quantifies these relationships: the slope estimates how much Y changes per unit of X, and R² measures how much of Y's variation the line accounts for.

This calculator eliminates the need for Excel or statistical software for basic regression tasks. The complete output — equation, coefficients, diagnostics, predictions — provides everything needed for reports, homework, and quick analyses.

How to Use the Inputs

  1. Enter X values (comma-separated) — the independent/predictor variable.
  2. Enter Y values (comma-separated) — the dependent/response variable.
  3. Or click a preset dataset to load sample data.
  4. Review the regression equation, slope, intercept, and R².
  5. Enter an X value to predict Y with a confidence interval.
  6. Examine the residuals table for model fit diagnostics.
  7. Use the R² guide to interpret your model strength.
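
Formula Used

  • Slope: b₁ = (n·ΣXY − ΣX·ΣY) / (n·ΣX² − (ΣX)²)
  • Intercept: b₀ = Ȳ − b₁·X̄
  • Coefficient of determination: R² = 1 − SS_res/SS_tot, where SS_res = Σ(yᵢ − ŷᵢ)² and SS_tot = Σ(yᵢ − Ȳ)²
  • Correlation: r = ±√R², taking the sign of the slope b₁
  • Standard error: √(SS_res/(n − 2))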
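
As a rough illustration, the formulas above translate directly into a few lines of Python. `fit_line` and its dictionary keys are hypothetical names chosen for this sketch, and the input is the eight-point dataset from the residuals table.

```python
from math import sqrt


def fit_line(xs, ys):
    """Simple OLS fit using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired points")

    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)

    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all X values are identical")

    b1 = (n * sxy - sx * sy) / denom          # slope
    b0 = sy / n - b1 * sx / n                 # intercept

    y_mean = sy / n
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all Y values are identical, so R² is undefined")

    r2 = 1 - ss_res / ss_tot
    # r carries the sign of the slope; max() guards against tiny negatives.
    r = sqrt(max(r2, 0.0)) * (1 if b1 >= 0 else -1)
    se = sqrt(ss_res / (n - 2))               # residual standard error

    return {"slope": b1, "intercept": b0, "r2": r2, "r": r, "se": se}


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 4.0, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1]
for name, value in fit_line(xs, ys).items():
    print(f"{name}: {value:.6f}")
```

For this dataset the values agree with the sample output at the top of the page (slope 2, intercept 0, R² ≈ 0.999049, standard error ≈ 0.163299).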

Example Calculation

Result: Y = 1.96X + 0.10, R² = 0.9993, r = 0.9997

Near-perfect linear relationship. Each unit increase in X predicts a 1.96 increase in Y. R² of 0.9993 means 99.93% of Y's variance is explained by X.

Tips & Best Practices

  • Always check residuals before trusting R²: a high R² with patterned residuals suggests that a straight line is the wrong functional form.
  • Extrapolation (predicting beyond your data range) is unreliable — the linear trend may not continue.
  • Adding more data points improves reliability but won't fix a fundamentally nonlinear relationship.
  • Swap X and Y to reverse the prediction direction — the slope changes but R² stays the same.
  • Use the confidence interval, not just the point prediction, for realistic forecasting.
  • An R² near 0 doesn't mean "no relationship" — it means "no LINEAR relationship." The data may have a strong curved pattern.
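
Two of the tips above can be checked numerically. The sketch below is illustrative only: `slope_and_r2` is a hypothetical helper, the first dataset is the eight points from the residuals table, and the second is a made-up symmetric parabola.

```python
def slope_and_r2(xs, ys):
    """Return (slope, R²) for a simple OLS fit of ys on xs."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # With one predictor, R² equals the squared Pearson correlation.
    return sxy / sxx, sxy ** 2 / (sxx * syy)


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 4.0, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1]

slope_yx, r2_yx = slope_and_r2(xs, ys)  # regress Y on X
slope_xy, r2_xy = slope_and_r2(ys, xs)  # regress X on Y (swapped)
print(f"Y on X: slope={slope_yx:.4f}, R2={r2_yx:.6f}")
print(f"X on Y: slope={slope_xy:.4f}, R2={r2_xy:.6f}")

# Made-up curved data: Y depends on X exactly, yet not linearly.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x * x for x in curve_x]
_, r2_curve = slope_and_r2(curve_x, curve_y)
print(f"Parabola: R2={r2_curve:.6f}")
```

Swapping the variables leaves R² unchanged, but the new slope is R²/b₁ rather than 1/b₁, so the two lines coincide only when the fit is perfect. The parabola has an R² of 0 even though Y is fully determined by X.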

The Mathematics of Ordinary Least Squares

OLS regression finds slope b₁ and intercept b₀ that minimize Σ(yᵢ − ŷᵢ)², the sum of squared residuals. The closed-form solution yields b₁ = (n·ΣXY − ΣX·ΣY)/(n·ΣX² − (ΣX)²) and b₀ = Ȳ − b₁·X̄. It requires only a handful of sums over the data, and it is the unique global minimum whenever the X values are not all identical.
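
The closed form comes from setting both partial derivatives of the squared-error sum to zero. A brief derivation, writing S for the sum being minimized and using lowercase xᵢ, yᵢ for the data points:

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2 \\[4pt]
\frac{\partial S}{\partial b_0} = 0
  &\;\Rightarrow\; \sum y_i = n\,b_0 + b_1 \sum x_i \\[4pt]
\frac{\partial S}{\partial b_1} = 0
  &\;\Rightarrow\; \sum x_i y_i = b_0 \sum x_i + b_1 \sum x_i^2 \\[4pt]
b_1 &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
\end{aligned}
```

The first condition, divided by n, gives the intercept formula directly. Substituting it into the second and solving for b₁ gives the slope formula.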

Minimizing squared rather than absolute residuals gives OLS desirable statistical properties. Under the Gauss-Markov assumptions (errors with zero mean, constant variance, and no correlation with one another), the estimates are unbiased and have minimum variance among linear unbiased estimators. When the errors are also normally distributed, OLS coincides with maximum likelihood estimation.

Assumptions and Diagnostics

Linear regression assumes: (1) a linear relationship between X and Y, (2) independent observations, (3) homoscedasticity (constant residual variance), and (4) normally distributed residuals. Violating (2), (3), or (4) does not stop OLS from producing a line, but it makes confidence intervals and hypothesis tests unreliable; violating (1) means a straight line is the wrong model. The residuals table helps diagnose issues: look for patterns, increasing spread, or extreme outliers.
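
The checks described above can be roughed out in code. The sketch below uses crude heuristics, not formal statistical tests; `residual_summary` and its output keys are hypothetical names, and the coefficients are taken from the sample output (b₀ = 0, b₁ = 2).

```python
from math import sqrt


def residual_summary(xs, ys, b0, b1):
    """Crude residual diagnostics for a fitted line y = b0 + b1*x."""
    pairs = sorted(zip(xs, ys))               # order residuals by X
    res = [y - (b0 + b1 * x) for x, y in pairs]
    n = len(res)
    se = sqrt(sum(e * e for e in res) / (n - 2))

    # Spread in the lower vs upper half of X: a large gap hints at
    # non-constant variance (heteroscedasticity).
    half = n // 2
    spread_low = sqrt(sum(e * e for e in res[:half]) / half)
    spread_high = sqrt(sum(e * e for e in res[n - half:]) / half)

    # Very few sign changes suggest a curved pattern the line misses.
    sign_changes = sum(1 for a, b in zip(res, res[1:]) if a * b < 0)

    # Largest residual in units of the standard error: a rough outlier flag.
    largest = max(abs(e) / se for e in res)

    return {
        "residual_sum": sum(res),
        "spread_low": spread_low,
        "spread_high": spread_high,
        "sign_changes": sign_changes,
        "largest_standardized": largest,
    }


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 4.0, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1]
for name, value in residual_summary(xs, ys, b0=0.0, b1=2.0).items():
    print(f"{name}: {value}")
```

For the sample data, the residuals sum to roughly zero, change sign often, and none exceeds about 1.2 standard errors, which is consistent with a well-behaved linear fit.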

From Simple to Multiple Regression

This calculator handles simple (one X) linear regression. Real-world problems often involve multiple predictors (multiple regression): Y = b₀ + b₁X₁ + b₂X₂ + ... + bₖXₖ. The concept is identical — OLS minimizes squared residuals — but the math uses matrix algebra. Tools like R, Python, and Excel handle multiple regression; our calculator provides the foundational understanding.
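
To make the matrix step concrete, here is a small pure-Python sketch that solves the normal equations (XᵀX)b = Xᵀy by Gauss-Jordan elimination. It is an illustration rather than how any particular tool implements multiple regression; `fit_multiple` is a hypothetical name, and the two-predictor data is synthetic, generated from Y = 1 + 2·X₁ + 3·X₂ with no noise.

```python
def fit_multiple(rows, ys):
    """Return [b0, b1, ..., bk] for OLS with k predictors per row."""
    X = [[1.0] + list(r) for r in rows]       # prepend intercept column
    k = len(X[0])

    # Augmented matrix [XᵀX | Xᵀy].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(X, ys))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data: Y = 1 + 2*X1 + 3*X2 exactly.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
print([round(b, 6) for b in fit_multiple(rows, ys)])
```

With a single predictor per row, the same function reduces to the simple regression this calculator performs. In practice, statistical libraries use more numerically stable decompositions than the normal equations.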

Frequently Asked Questions

  • What is a good R² value? R² measures how much of Y's variation is explained by the model. Values of 0.90 or higher are very strong (common in physical sciences). In social sciences, 0.30 – 0.50 is typical and often useful. There is no universal "good" value; it depends on your field.