Compute residuals, standardized residuals, leverage, Cook's distance, Durbin-Watson, skewness, kurtosis, and outlier detection for regression diagnostics.
Fitting a regression line is only the first step. Residual analysis checks whether the model is actually behaving like a usable regression rather than simply producing a high R².
This calculator reports raw residuals, standardized residuals, leverage, Cook's distance, and summary diagnostics such as Durbin-Watson, skewness, and kurtosis. Together those outputs help you look for common failure modes: curvature, changing variance, autocorrelation, and influential points.
The goal is not just to identify a line, but to see whether the assumptions behind that line are holding up once you inspect the errors directly.
Residual diagnostics matter because a visually poor model can still produce an impressive summary statistic. Looking at residual shape, influence, and correlation is often what tells you whether to transform variables, add curvature, or question a few points before you trust the fit.
Residual: eᵢ = yᵢ − ŷᵢ. Standardized: eᵢ* = eᵢ / (s√(1−hᵢᵢ)), where s is the residual standard error. Leverage (simple regression): hᵢᵢ = 1/n + (xᵢ−x̄)²/Sxx. Cook's D: Dᵢ = eᵢ*²·hᵢᵢ / (p(1−hᵢᵢ)), where p is the number of model parameters. Durbin-Watson: d = Σᵢ₌₂(eᵢ−eᵢ₋₁)² / Σeᵢ².
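As a sketch, the formulas above can be computed directly with NumPy for the simple-linear-regression case (function and variable names are illustrative, not part of the calculator):

```python
import numpy as np

def regression_diagnostics(x, y):
    """Diagnostics for a simple linear regression y = b0 + b1*x,
    following the formulas above. p = 2 parameters (slope + intercept)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, p = len(x), 2
    xbar = x.mean()
    Sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * xbar
    e = y - (b0 + b1 * x)                        # raw residuals e_i = y_i - yhat_i
    s = np.sqrt(np.sum(e ** 2) / (n - p))        # residual standard error
    h = 1.0 / n + (x - xbar) ** 2 / Sxx          # leverage h_ii
    e_std = e / (s * np.sqrt(1 - h))             # standardized residuals
    cooks = e_std ** 2 * h / (p * (1 - h))       # Cook's distance
    dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)  # Durbin-Watson
    return e, e_std, h, cooks, dw
```

With an intercept in the model, the raw residuals sum to zero and each leverage lies between 1/n and 1, which makes a quick sanity check on the output.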
Result: R² = 0.9997, RMSE = 0.117, Durbin-Watson = 2.14, all |std. residuals| < 2.0, max Cook's D = 0.32
Residuals show no pattern, Durbin-Watson near 2.0 (no autocorrelation), no outliers or influential points. This is a healthy regression with all assumptions met.
OLS regression assumes: (1) Linearity — the true relationship is linear. (2) Independence — residuals are uncorrelated. (3) Homoscedasticity — residual variance is constant. (4) Normality — residuals are normally distributed. Each assumption maps to specific diagnostic tests.
Linearity: Plot residuals vs. predicted values. Random scatter = good. Curves = consider polynomial terms. Independence: Durbin-Watson tests first-order serial correlation. Homoscedasticity: Look for fan shapes in residual plots. Normality: Check skewness and kurtosis.
An outlier has a large residual — the model predicts poorly for that point. A high-leverage point has an extreme X value. An influential point changes the regression substantially when removed. A point can be high-leverage without being influential (if it falls on the trend), or an outlier without being influential (if leverage is low). Cook's distance captures the combined effect.
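A hedged sketch of how the three categories can be flagged. The cutoffs are rule-of-thumb conventions, not part of the definitions: |e*| > 2 for outliers, h > 2p/n (a common leverage heuristic, assumed here) for high leverage, and D > 4/n for influence:

```python
import numpy as np

def classify_points(e_std, h, cooks, n, p=2):
    """Flag outliers, high-leverage points, and influential points
    using common rule-of-thumb thresholds (conventions, not laws)."""
    e_std, h, cooks = map(np.asarray, (e_std, h, cooks))
    return {
        "outliers": np.where(np.abs(e_std) > 2.0)[0],        # large residual
        "high_leverage": np.where(h > 2.0 * p / n)[0],       # extreme X value
        "influential": np.where(cooks > 4.0 / n)[0],         # changes the fit
    }
```

Because the three flags are independent, the same point can appear in one set but not another, which mirrors the distinctions drawn above.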
Non-linearity: Add polynomial terms or transform variables. Heteroscedasticity: Use weighted least squares or robust standard errors. Autocorrelation: Use generalized least squares or add lag terms. Non-normality: Transform Y (log, sqrt) or use robust regression. Outliers: Investigate data quality, use robust methods (LAD, Huber), or report with and without.
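As one hedged illustration of the transformation remedy: when residual spread grows with X (multiplicative noise), a log transform of Y often stabilizes the variance. The simulated data and constants below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)
y = np.exp(0.3 * x + rng.normal(0, 0.1, x.size))  # multiplicative noise, y > 0

# Fit on the raw scale: residual spread fans out as x grows.
b1, b0 = np.polyfit(x, y, 1)
res_raw = y - (b0 + b1 * x)

# Fit on the log scale: the noise becomes additive and roughly constant.
c1, c0 = np.polyfit(x, np.log(y), 1)
res_log = np.log(y) - (c0 + c1 * x)

# Compare residual spread in the first vs. last quarter of the data.
q = x.size // 4
ratio_raw = res_raw[-q:].std() / res_raw[:q].std()
ratio_log = res_log[-q:].std() / res_log[:q].std()
```

A ratio near 1 indicates stable variance; the raw-scale ratio should be noticeably larger than the log-scale one for data like this.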
Raw residuals (eᵢ = yᵢ − ŷᵢ) retain the units of Y. Standardized residuals divide each residual by its estimated standard deviation, s√(1−hᵢᵢ), which accounts for leverage; this yields a unit-free scale on which values beyond ±2 flag potential outliers.
DW tests for first-order autocorrelation in residuals; the statistic ranges from 0 to 4. DW ≈ 2 means no autocorrelation. DW well below 2 suggests positive autocorrelation (consecutive residuals are similar); DW well above 2 suggests negative autocorrelation (consecutive residuals alternate in sign).
The traditional rule of thumb flags Cook's D > 1 as influential; a stricter, sample-size-aware rule uses D > 4/n. Investigate high-Cook's-D points before removing them — they may be data errors, outliers, or genuinely different observations that shouldn't be modeled together.
Leverage measures how far xᵢ is from x̄. Extreme X values have high leverage: they have outsized potential to pull the regression line. High leverage isn't always bad — compare Cook's D to see if the point actually affects the regression.
Non-normal residuals don't bias the coefficient estimates, but they do distort confidence intervals and p-values. Check skewness (should be near 0) and excess kurtosis (also near 0; raw kurtosis near 3). With n > 30, the Central Limit Theorem provides some protection for inference about the coefficients.
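The moment-based shape checks can be sketched as follows (a minimal version using the standard moment estimators; function name is illustrative):

```python
import numpy as np

def residual_shape(e):
    """Sample skewness and excess kurtosis of residuals.
    Both are near 0 for approximately normal residuals."""
    e = np.asarray(e, float)
    z = (e - e.mean()) / e.std()      # standardize to mean 0, sd 1
    skew = np.mean(z ** 3)            # third standardized moment
    excess_kurt = np.mean(z ** 4) - 3.0  # normal distribution has kurtosis 3
    return skew, excess_kurt
```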
Look for a fan or funnel shape in the residual visual — residuals getting larger (or smaller) as X increases. Our visual bars show this pattern clearly. Formal tests include Breusch-Pagan and White's test.
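A hedged sketch of the Breusch-Pagan idea for the one-predictor case: regress the squared residuals on X and compute the LM statistic n·R² from that auxiliary fit, to be compared against a chi-square with 1 degree of freedom (≈ 3.84 at the 5% level). This is a simplified single-regressor version, not a full implementation:

```python
import numpy as np

def breusch_pagan_lm(x, e):
    """LM statistic: n times the R^2 of regressing squared residuals on x.
    Large values suggest residual variance depends on x (heteroscedasticity)."""
    x, e = np.asarray(x, float), np.asarray(e, float)
    u = e ** 2                                  # squared residuals
    b1, b0 = np.polyfit(x, u, 1)                # auxiliary regression u ~ x
    fitted = b0 + b1 * x
    ss_res = np.sum((u - fitted) ** 2)
    ss_tot = np.sum((u - u.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return len(x) * r2
```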