Residual Analysis Calculator

Compute residuals, standardized residuals, leverage, Cook's distance, Durbin-Watson, skewness, kurtosis, and outlier detection for regression diagnostics.

  • Equation: Y = 1.9982X + 0.0400 (fitted OLS regression line)
  • R²: 0.999512 (99.95% of variance explained)
  • RMSE (Root Mean Squared Error): 0.1417
  • MAE (Mean Absolute Error): 0.1091
  • Durbin-Watson: 3.2954 (negative autocorrelation suspected)
  • Runs Test: 10 runs (expected: 6.0); residuals may not be random
  • Skewness: -0.3267 (approximately symmetric)
  • Excess Kurtosis: -0.8276 (near normal, mesokurtic)

Residual Table

X       Y         Ŷ         Residual   Std. Res.   Leverage   Cook's D
1.00    2.1000    2.0382     0.0618     0.539      0.3455     0.0767
2.00    3.9000    4.0364    -0.1364    -1.110      0.2485     0.2036
3.00    6.2000    6.0345     0.1655     1.286      0.1758     0.1763
4.00    8.0000    8.0327    -0.0327    -0.247      0.1273     0.0045
5.00    10.1000   10.0309    0.0691     0.515      0.1030     0.0152
6.00    11.8000   12.0291   -0.2291    -1.707      0.1030     0.1673
7.00    14.1000   14.0273    0.0727     0.549      0.1273     0.0220
8.00    15.9000   16.0255   -0.1255    -0.975      0.1758     0.1013
9.00    18.2000   18.0236    0.1764     1.435      0.2485     0.3406
10.00   20.0000   20.0218   -0.0218    -0.190      0.3455     0.0096
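
Several of the summary values reported on this page can be recomputed from the Residual column alone. The sketch below is illustrative, not the calculator's own code. It assumes RMSE uses an n − p denominator with p = 2, and that the runs test counts sign changes and uses the standard expected-runs formula 2·n₊·n₋/n + 1. Because the inputs are the rounded table values, results may differ from the page in the last digit.

```python
import math

# Residual column from the table above (rounded to 4 decimals).
resid = [0.0618, -0.1364, 0.1655, -0.0327, 0.0691,
         -0.2291, 0.0727, -0.1255, 0.1764, -0.0218]
n, p = len(resid), 2  # p = fitted parameters (slope and intercept), an assumption

# RMSE with an (n - p) denominator; MAE is the plain mean of |residual|.
rmse = math.sqrt(sum(e ** 2 for e in resid) / (n - p))
mae = sum(abs(e) for e in resid) / n

# Runs test: a "run" is a maximal streak of residuals with the same sign.
signs = [e > 0 for e in resid]
runs = 1 + sum(signs[i] != signs[i - 1] for i in range(1, n))
n_pos = sum(signs)
n_neg = n - n_pos
expected_runs = 2 * n_pos * n_neg / n + 1

print(f"RMSE = {rmse:.4f}, MAE = {mae:.4f}")
print(f"{runs} runs (expected {expected_runs:.1f})")
```

For these residuals the sign flips at every step, giving 10 runs against an expected 6.0. Too many runs is the same alternating pattern that pushes Durbin-Watson above 2.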

Diagnostic Reference

Diagnostic        Good       Concerning            Indicates
Durbin-Watson     1.5–2.5    <1.5 or >2.5          Autocorrelation in residuals
|Std. Residual|   <2         >2 (outlier at >2)    Unusual observations
Leverage          <0.400     >0.400 (2p/n)         Influential X position
Cook's D          <0.5       >1.0                  Overall influence on regression
Skewness          |s|<0.5    |s|>1.0               Non-normality of residuals
Kurtosis          |k|<1.0    |k|>2.0               Heavy/light tail problems
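
The reference table can be read as a small set of threshold rules. The sketch below applies them to the results shown on this page. The function names and the "borderline" label for values between the Good and Concerning cut-offs are hypothetical, introduced here only for illustration.

```python
def classify(value, good_below, concerning_above):
    """Label |value| using the Good / Concerning cut-offs from the table."""
    magnitude = abs(value)
    if magnitude < good_below:
        return "good"
    if magnitude > concerning_above:
        return "concerning"
    return "borderline"  # hypothetical label for the gap between cut-offs


def classify_durbin_watson(d, low=1.5, high=2.5):
    """Durbin-Watson is two-sided: values far below or above 2 are concerning."""
    return "good" if low <= d <= high else "concerning"


n, p = 10, 2  # observations and fitted parameters for the data on this page
report = {
    "Durbin-Watson": classify_durbin_watson(3.2954),
    "Max |std. residual|": classify(-1.707, 2.0, 2.0),
    "Max leverage": classify(0.3455, 2 * p / n, 2 * p / n),
    "Max Cook's D": classify(0.3406, 0.5, 1.0),
    "Skewness": classify(-0.3267, 0.5, 1.0),
    "Excess kurtosis": classify(-0.8276, 1.0, 2.0),
}
for name, label in report.items():
    print(f"{name}: {label}")
```

Only Durbin-Watson lands in the concerning range for this data set, which agrees with the "negative autocorrelation suspected" note in the results.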
Planning notes, formulas, and examples

About the Residual Analysis Calculator

Fitting a regression line is only the first step. Residual analysis checks whether the model is actually behaving like a usable regression rather than simply producing a high R².

This calculator reports raw residuals, standardized residuals, leverage, Cook's distance, and summary diagnostics such as Durbin-Watson, skewness, and kurtosis. Together those outputs help you look for common failure modes: curvature, changing variance, autocorrelation, and influential points.

The goal is not just to identify a line, but to see whether the assumptions behind that line are holding up once you inspect the errors directly.

When This Page Helps

Residual diagnostics matter because a visually poor model can still produce an impressive summary statistic. Looking at residual shape, influence, and correlation is often what tells you whether to transform variables, add curvature, or question a few points before you trust the fit.

How to Use the Inputs

  1. Enter X values and corresponding Y values (comma-separated).
  2. Alternatively, click a preset to load a diagnostic scenario.
  3. Set the outlier threshold for standardized residuals (default 2).
  4. Review the diagnostic output cards (RMSE, Durbin-Watson, etc.).
  5. Examine the residual table for outliers and influential points.
  6. Check Cook's distance — values > 1.0 indicate highly influential observations.
  7. Use the diagnostic reference table to interpret each metric.
Formula used
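Residual: eᵢ = yᵢ − ŷᵢ
Standardized residual: eᵢ* = eᵢ / (s√(1−hᵢᵢ))
Leverage: hᵢᵢ = 1/n + (xᵢ−x̄)²/Sxx
Cook's D: Dᵢ = eᵢ*²·hᵢᵢ / (p(1−hᵢᵢ))
Durbin-Watson: d = Σ(eᵢ−eᵢ₋₁)² / Σeᵢ²
Here n is the number of observations, p is the number of fitted parameters (2 for a straight line), s is the residual standard error, and Sxx = Σ(xᵢ−x̄)². The Durbin-Watson numerator runs over i = 2…n and the denominator over i = 1…n.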
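
The formulas above can be sketched in plain Python. This is an illustration under stated assumptions, not the calculator's actual implementation. The function name `residual_diagnostics` is hypothetical, p = 2 is assumed, and s is taken as √(SSE/(n − p)), which is the definition that reproduces the RMSE reported on this page.

```python
import math


def residual_diagnostics(x, y):
    """Simple-linear-regression diagnostics for paired lists x and y."""
    n, p = len(x), 2  # p = fitted parameters (slope, intercept)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    s = math.sqrt(sse / (n - p))  # residual standard error (assumed definition)

    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]
    cooks_d = [r ** 2 * h / (p * (1 - h)) for r, h in zip(std_resid, leverage)]
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / sse

    return {
        "slope": slope,
        "intercept": intercept,
        "residuals": resid,
        "leverage": leverage,
        "std_resid": std_resid,
        "cooks_d": cooks_d,
        "durbin_watson": dw,
    }


# Data from the residual table on this page.
x = list(range(1, 11))
y = [2.1, 3.9, 6.2, 8.0, 10.1, 11.8, 14.1, 15.9, 18.2, 20.0]
d = residual_diagnostics(x, y)
print(f"Y = {d['slope']:.4f}X + {d['intercept']:.4f}")
print(f"Durbin-Watson = {d['durbin_watson']:.4f}")
for row in zip(x, d["residuals"], d["std_resid"], d["leverage"], d["cooks_d"]):
    print("{:5.2f} {:8.4f} {:7.3f} {:7.4f} {:7.4f}".format(*row))
```

Run on the table's X and Y values, the printed columns should agree with the Residual, Std. Res., Leverage, and Cook's D columns to the displayed precision.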

Example Calculation

Result: R² = 0.9997, RMSE = 0.117, Durbin-Watson = 2.14, all |std. residuals| < 2.0, max Cook's D = 0.32

Residuals show no pattern, Durbin-Watson is near 2.0 (no sign of autocorrelation), and there are no outliers or influential points. None of the diagnostics flags a violated assumption, so this looks like a healthy regression.

Tips & Best Practices

  • Always examine residual plots before trusting any regression model.
  • A curved pattern in residuals suggests the straight-line form is inadequate; consider polynomial or transformed terms.
  • Fan-shaped residuals indicate heteroscedasticity; consider weighted regression or a transformation of Y.
  • Cook's D > 1 means removing that single point substantially changes the entire regression.
  • If Durbin-Watson is far from 2, consider adding time-lagged predictors or using autocorrelation-corrected regression.
  • Try the "Outlier Present" preset to see how one extreme point affects diagnostics.

Regression Assumptions and Residuals

OLS regression assumes: (1) Linearity: the true relationship is linear. (2) Independence: the errors are uncorrelated. (3) Homoscedasticity: the error variance is constant. (4) Normality: the errors are normally distributed. The errors themselves are not observed, so each assumption is checked through a specific diagnostic computed from the residuals.

  • Linearity: Plot residuals against predicted values. Random scatter is good; a curve suggests adding polynomial terms.
  • Independence: The Durbin-Watson statistic tests for first-order serial correlation.
  • Homoscedasticity: Look for fan shapes in the residual plot.
  • Normality: Check the skewness and kurtosis of the residuals.
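
A small synthetic example shows why these checks matter even when R² is high. The sketch below fits a straight line to deliberately quadratic data (y = x², chosen here purely for illustration) and inspects the residual signs and the Durbin-Watson statistic.

```python
x = list(range(1, 11))
y = [xi ** 2 for xi in x]  # synthetic data: deliberately quadratic, not linear

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in resid)
r_squared = 1 - sse / sum((yi - y_bar) ** 2 for yi in y)
durbin_watson = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / sse
signs = "".join("+" if e > 0 else "-" for e in resid)

print(f"R^2 = {r_squared:.4f}")
print(f"Durbin-Watson = {durbin_watson:.4f}")
print(f"Residual signs: {signs}")
```

The straight line explains about 95% of the variance, yet the residuals are positive at both ends and negative in the middle (++------++) and Durbin-Watson is about 0.45. Both point to curvature that R² alone hides.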

Influential Points vs. Outliers

An outlier has a large residual — the model predicts poorly for that point. A high-leverage point has an extreme X value. An influential point changes the regression substantially when removed. A point can be high-leverage without being influential (if it falls on the trend), or an outlier without being influential (if leverage is low). Cook's distance captures the combined effect.
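
The distinction can be made concrete by adding one extra point to the data from the residual table and computing that point's diagnostics. The three added points and the helper name `last_point_diagnostics` are hypothetical, chosen only to illustrate each case; p = 2 and s = √(SSE/(n − p)) are assumed as elsewhere on this page.

```python
import math


def last_point_diagnostics(x, y):
    """Return (leverage, standardized residual, Cook's D) for the last point."""
    n, p = len(x), 2
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - p))
    h = 1 / n + (x[-1] - x_bar) ** 2 / sxx
    r = resid[-1] / (s * math.sqrt(1 - h))
    return h, r, r ** 2 * h / (p * (1 - h))


base_x = list(range(1, 11))
base_y = [2.1, 3.9, 6.2, 8.0, 10.1, 11.8, 14.1, 15.9, 18.2, 20.0]

# Hypothetical extra observations, one per scenario.
scenarios = {
    "far out in X, on the trend": (25.0, 50.0),
    "far out in X, off the trend": (25.0, 35.0),
    "central X, off the trend": (5.5, 12.0),
}
results = {}
for label, (new_x, new_y) in scenarios.items():
    results[label] = last_point_diagnostics(base_x + [new_x], base_y + [new_y])
    h, r, d = results[label]
    print(f"{label}: leverage={h:.3f}, std. residual={r:.2f}, Cook's D={d:.3f}")
```

With 11 points the leverage cut-off 2p/n is about 0.36. The first extra point should exceed it while keeping a Cook's D near zero. The second should push Cook's D well above 1.0. The third should have a standardized residual above 2 with low leverage and a Cook's D below 0.5.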

What To Do When Diagnostics Fail

  • Non-linearity: Add polynomial terms or transform variables.
  • Heteroscedasticity: Use weighted least squares or robust standard errors.
  • Autocorrelation: Use generalized least squares or add lag terms.
  • Non-normality: Transform Y (log, sqrt) or use robust regression.
  • Outliers: Investigate data quality, use robust methods (LAD, Huber), or report results with and without the suspect points.
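
As one example of a remedy, the sketch below applies a log transform to Y. The data are synthetic (exponential growth with a small alternating multiplicative disturbance, invented for this illustration), and the helper names are hypothetical. It compares the number of sign runs in the residuals before and after the transform.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return slope, y_bar - slope * x_bar


def sign_runs(x, y):
    """Fit a line and count runs of same-signed residuals."""
    slope, intercept = fit_line(x, y)
    signs = [yi - (intercept + slope * xi) > 0 for xi, yi in zip(x, y)]
    return 1 + sum(signs[i] != signs[i - 1] for i in range(1, len(signs)))


x = list(range(1, 11))
# Synthetic data: y = 3 * exp(0.3 x) with a +/-2% alternating disturbance.
y = [3 * math.exp(0.3 * xi) * (1 + 0.02 * (-1) ** xi) for xi in x]
log_y = [math.log(yi) for yi in y]

runs_raw = sign_runs(x, y)       # line fitted to Y
runs_log = sign_runs(x, log_y)   # line fitted to log(Y)
slope_log, intercept_log = fit_line(x, log_y)

print(f"Runs on raw Y: {runs_raw}, runs on log Y: {runs_log}")
print(f"Slope on log scale: {slope_log:.3f}")
```

On the raw scale the residuals fall into a few long same-signed stretches, the signature of curvature. After the log transform that pattern disappears, and the fitted slope recovers the growth rate of roughly 0.3 used to generate the data.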

Frequently Asked Questions

  • Raw residuals (eᵢ = yᵢ − ŷᵢ) retain the units of Y. Standardized residuals divide each raw residual by its estimated standard deviation, s√(1−hᵢᵢ), which accounts for leverage. The result is a unit-free scale on which values beyond ±2 flag potential outliers.