Scatter Plot & Regression Calculator

Plot data points, compute Pearson correlation, linear regression, R², residuals, standard error, and outlier detection interactively.

  • Correlation (r): 0.975456 (very strong positive linear relationship)
  • R² (coefficient of determination): 0.951515 (95.15% of variance in Y explained by X)
  • Regression equation: y = 0.9515x + 1.2667 (least-squares best-fit line)
  • Slope: 0.951515 (for each unit increase in x, y changes by 0.9515)
  • Standard error: 0.6898 (typical distance of points from the regression line)
  • Outliers (>2 SE): None (0 points more than 2 standard errors from the line)

Scatter Plot

[Figure: scatter plot of y against x, showing the data points and the fitted regression line.]

Data & Residuals

x       y       Predicted ŷ   Residual   Residual²
1.00    2.00    2.218         -0.218     0.048
2.00    4.00    3.170         0.830      0.689
3.00    5.00    4.121         0.879      0.772
4.00    4.00    5.073         -1.073     1.151
5.00    5.00    6.024         -1.024     1.049
6.00    7.00    6.976         0.024      0.001
7.00    8.00    7.927         0.073      0.005
8.00    9.00    8.879         0.121      0.015
9.00    10.00   9.830         0.170      0.029
10.00   11.00   10.782        0.218      0.048
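
The residuals table above can be reproduced with a short script. The Python sketch below assumes the calculator uses ordinary least squares, a standard error with n − 2 degrees of freedom, and flags a point when its absolute residual exceeds 2 × SE. The variable names are illustrative and not taken from the calculator itself.

```python
import math

# Data from the residuals table: x = 1..10 with the listed y values.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 5, 4, 5, 7, 8, 9, 10, 11]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y.
predicted = [slope * x + intercept for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

# Standard error of the regression, with n - 2 degrees of freedom.
sse = sum(e ** 2 for e in residuals)
se = math.sqrt(sse / (n - 2))

# Assumed outlier rule: |residual| greater than 2 standard errors.
outliers = [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > 2 * se]

for x, y, p, e in zip(xs, ys, predicted, residuals):
    print(f"{x:5.2f} {y:6.2f} {p:7.3f} {e:7.3f} {e * e:6.3f}")
print(f"SE = {se:.4f}, outliers: {outliers}")
```

With this data the largest absolute residual is about 1.07, below the 2 × SE threshold of about 1.38, so no point is flagged.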

Summary Statistics

Statistic   X         Y
Mean        5.5000    6.5000
Min         1.0000    2.0000
Max         10.0000   11.0000
Range       9.0000    9.0000
n           10
Planning notes, formulas, and examples

About the Scatter Plot & Regression Calculator

A scatter plot is the starting point of virtually every bivariate data analysis. It reveals the relationship between two variables at a glance — positive or negative trend, linear or curved, tight or dispersed, with or without outliers. Pair it with a linear regression line and correlation statistics and you have a powerful analysis toolkit.

This calculator lets you enter data as x,y pairs, visualizes the scatter plot, computes the least-squares regression line (y = mx + b), Pearson correlation coefficient (r), coefficient of determination (R²), standard error, and flags outliers more than 2 standard errors from the line. A full residuals table shows each point's predicted value and deviation from the line with a visual bar.

Whether you are analyzing lab results, economic data, survey responses, or homework problems, the page brings the plot, regression line, and residual structure into one view. Use the presets to explore classic data patterns — strong positive, negative, no correlation, quadratic, and outlier scenarios — before entering your own data.

When This Page Helps

Data visualization and regression analysis are core skills in every quantitative field — from science and engineering to business and social sciences. This calculator combines the scatter plot, correlation coefficient, regression line, residual analysis, and outlier detection into a single interactive experience.

It is ideal for students learning statistics, professionals doing quick data explorations, and anyone who wants to check the strength of a relationship between two variables without opening a spreadsheet or writing code.

How to Use the Inputs

  1. Enter data points as x,y pairs separated by semicolons (e.g., 1,2;3,4;5,6).
  2. Use presets to load example datasets for different correlation patterns.
  3. Toggle the regression line on or off to compare visual impressions.
  4. Read the output cards for r, R², the regression equation, slope, and standard error.
  5. Examine the scatter plot for patterns, clusters, and outliers (red dots).
  6. Review the residuals table to see how each point deviates from the fit.
  7. Check summary statistics for descriptive measures of X and Y.
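
The input format in step 1 is simple to parse. The sketch below shows one way to turn the semicolon-separated string into numeric pairs; `parse_points` is a hypothetical helper, and skipping blank segments and surrounding spaces is an assumption rather than documented calculator behavior.

```python
def parse_points(text):
    """Parse 'x1,y1;x2,y2;...' into a list of (x, y) float tuples."""
    points = []
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing semicolon or an empty segment
        parts = chunk.split(",")
        if len(parts) != 2:
            raise ValueError(f"expected 'x,y' but got {chunk!r}")
        # float() accepts surrounding whitespace, so "1, 2" also parses.
        points.append((float(parts[0]), float(parts[1])))
    return points


points = parse_points("1,2;3,4;5,6")
print(points)  # [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
```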
Formula used
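Slope: m = Sxy / Sxx, where Sxy = Σ(xᵢ−x̄)(yᵢ−ȳ) and Sxx = Σ(xᵢ−x̄)². Intercept: b = ȳ − m·x̄. Pearson r = Sxy / √(Sxx·Syy), where Syy = Σ(yᵢ−ȳ)². R² = r². Standard error: SE = √(SSE/(n−2)), where SSE = Σ(yᵢ−ŷᵢ)² is the sum of squared residuals and ŷᵢ = m·xᵢ + b.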
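
The formulas above translate almost line for line into code. This is a minimal Python sketch, not the calculator's own implementation; `linear_fit` and the names of its returned fields are assumptions made for illustration.

```python
import math


def linear_fit(xs, ys):
    """Simple least-squares fit returning slope, intercept, r, R², and SE."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired points (SE divides by n - 2)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary for r to be defined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    # SSE is the sum of squared residuals around the fitted line.
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))
    return {"slope": slope, "intercept": intercept, "r": r, "r2": r * r, "se": se}


# Ten-point dataset used in the residuals table on this page.
fit = linear_fit([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                 [2, 4, 5, 4, 5, 7, 8, 9, 10, 11])
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

For the ten-point dataset in the residuals table, this gives the same slope, intercept, r, R², and standard error as the output cards, to the precision shown there.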

Example Calculation

Result: r = 0.9863, R² = 0.9728, y = 0.9879x + 0.6121

A very strong positive linear relationship — about 97% of the variance in Y is explained by X.

Tips & Best Practices

  • Always look at the scatter plot before trusting the correlation — Anscombe's quartet shows why.
  • A high R² does not imply causation — it only measures linear association.
  • Check the residuals for patterns (curves, fans) that indicate the linear model is inadequate.
  • Outliers can drastically affect r and slope — try removing them to see the impact.
  • For prediction, only interpolate within the range of your data — extrapolation is risky.
  • Enter data with semicolons between pairs: "x1,y1;x2,y2;x3,y3".
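
To see how much a single outlier can move r and the slope, fit the same data with and without the suspect point. The sketch below uses made-up data (eight points close to y = 2x plus one low point at x = 9); both the dataset and the `fit` helper are hypothetical.

```python
import math


def fit(xs, ys):
    """Return (slope, intercept, r) for a simple least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / math.sqrt(sxx * syy)


# Hypothetical data: the last point (9, 3.0) is the deliberate outlier.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.2, 3.0]

slope_all, _, r_all = fit(xs, ys)
slope_clean, _, r_clean = fit(xs[:-1], ys[:-1])

print(f"with outlier:    slope = {slope_all:.3f}, r = {r_all:.3f}")
print(f"without outlier: slope = {slope_clean:.3f}, r = {r_clean:.3f}")
```

Dropping a point only to improve the fit is not a justification in itself; the comparison simply shows how sensitive the statistics are to that point.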

Understanding Correlation Strength

The absolute value of r indicates strength: |r| > 0.9 is very strong, 0.7–0.9 is strong, 0.5–0.7 is moderate, 0.3–0.5 is weak, and < 0.3 is very weak or no linear relationship. However, even a moderate r can be practically significant in some fields (e.g., psychology often considers r = 0.3 meaningful), while a high r can be trivial if the variables are measured redundantly.
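
The thresholds above can be written as a small lookup. This is a sketch of one reasonable reading: the text does not say which band the boundary values 0.9, 0.7, 0.5, and 0.3 belong to, so assigning exactly 0.9 to "strong" and the other boundaries to the higher band is an assumption, as is the function name `correlation_strength`.

```python
def correlation_strength(r):
    """Return (strength, direction) labels for a Pearson r value."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)
    if magnitude > 0.9:
        strength = "very strong"
    elif magnitude >= 0.7:   # assumed: exactly 0.9 falls in "strong"
        strength = "strong"
    elif magnitude >= 0.5:   # assumed: boundaries belong to the higher band
        strength = "moderate"
    elif magnitude >= 0.3:
        strength = "weak"
    else:
        strength = "very weak or none"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(correlation_strength(0.975456))  # ('very strong', 'positive')
print(correlation_strength(-0.42))     # ('weak', 'negative')
```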

Anscombe's Quartet

In 1973, Francis Anscombe constructed four datasets with nearly identical summary statistics (mean, variance, r, regression line) but wildly different scatter plots: one is a roughly linear cloud of points, one follows a clear curve, one is perfectly linear except for a single outlier, and one has the same x value for every point except a single high-leverage point that determines the fitted line. The lesson: never skip the scatter plot. This calculator makes plotting so easy that there's no excuse for relying on numbers alone.
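
The near-identical statistics are easy to check numerically. The sketch below computes r and the fitted line for each of the four datasets, using the values as commonly published for Anscombe's quartet; the helper name `fit` is an assumption for illustration.

```python
import math


def fit(xs, ys):
    """Return (slope, intercept, r) for a simple least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / math.sqrt(sxx * syy)


# Datasets I-III share the same x values; dataset IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: fit(xs, ys) for name, (xs, ys) in quartet.items()}
for name, (slope, intercept, r) in results.items():
    print(f"{name:>3}: y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}")
```

All four fits come out close to y = 0.5x + 3 with r near 0.82, even though the scatter plots look nothing alike.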

Beyond Simple Regression

Simple linear regression (one predictor, one response) is the foundation, but real analysis often involves multiple regression (many predictors), polynomial regression (curved fits), logistic regression (binary outcomes), or machine learning models. This calculator covers the foundational case; understanding it well is essential before tackling more complex methods.

Frequently Asked Questions

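  • The Pearson correlation coefficient r ranges from −1 to +1. Values near ±1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship. Because r measures only linear association, it can be close to 0 even when the variables have a strong non-linear relationship.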
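
A quick illustration of the last point: the hypothetical data below follow y = x² exactly, so y is fully determined by x, yet the Pearson r is zero because the x values are symmetric about 0 and the relationship has no linear trend.

```python
import math

# Hypothetical data: a perfect quadratic on x values symmetric about zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Pearson r: the positive and negative cross-products cancel exactly.
r = sxy / math.sqrt(sxx * syy)
print(f"r = {r:.4f}")
```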