Math 115
In previous topics, our explanatory variable was categorical.
Now: Both variables are numerical.
A scatterplot visualizes the relationship between two numerical variables.
Data: Body measurements from 507 physically active adults.
Positive association: As one variable increases, the other tends to increase.
Negative association: As one variable increases, the other tends to decrease.
Height and weight have a positive association — taller people tend to weigh more.
The data appear to fall roughly along a line. We can add a line of best fit:
\[\hat{y} = b_0 + b_1 x\]
Statistics vs Parameters:
Response variable (\(y\)): What we’re trying to predict
Predictor variable (\(x\)): What we use to make predictions
\[\widehat{wgt} = -105 + 1.02 \times hgt\]
Predict the weight of someone who is 170 cm tall:
\[\widehat{wgt} = -105 + 1.02 \times 170 = 68.4 \text{ kg}\]
Predict the weight of someone who is 180 cm tall:
\[\widehat{wgt} = -105 + 1.02 \times 180 = 78.6 \text{ kg}\]
The 10 cm difference in height corresponds to a 10.2 kg difference in predicted weight.
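The two predictions can be reproduced with a few lines of Python. This is a minimal sketch that plugs heights into the fitted line using the rounded coefficients from the slides; the function name `predict_weight` is only illustrative.

```python
def predict_weight(height_cm, b0=-105.0, b1=1.02):
    """Predicted weight in kg from the line wgt_hat = b0 + b1 * hgt."""
    return b0 + b1 * height_cm

w170 = predict_weight(170)
w180 = predict_weight(180)

# The gap between the two predictions is the slope times the 10 cm gap in height.
print(round(w170, 1), round(w180, 1), round(w180 - w170, 1))
```

Any two heights that are 10 cm apart give the same 10.2 kg gap, because the slope is constant along the line.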
\[\widehat{wgt} = -105 + 1.02 \times hgt\]
Slope (\(b_1 = 1.02\)):
For each additional centimeter of height, we expect weight to increase by 1.02 kg, on average.
Template: “For each additional [unit of x], we expect [y] to [increase/decrease] by [slope] [units of y], on average.”
Intercept (\(b_0 = -105\)):
The predicted weight for someone 0 cm tall is -105 kg.
This is often not meaningful! (No one is 0 cm tall.)
Better interpretation: The intercept positions the line vertically so it passes through the data cloud.
Extrapolation: Predicting outside the range of observed data.
Rule: Only use the model within the range of your data.
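One way to enforce this rule in code is to refuse predictions outside the observed range. The sketch below assumes a hypothetical observed height range of 150 to 200 cm; the actual range of the data set is not stated here, so treat those bounds as placeholders.

```python
def predict_in_range(height_cm, x_min, x_max, b0=-105.0, b1=1.02):
    """Predict weight only if the height lies inside the observed range."""
    if not (x_min <= height_cm <= x_max):
        raise ValueError(
            f"{height_cm} cm is outside the observed range "
            f"[{x_min}, {x_max}]; predicting here would be extrapolation."
        )
    return b0 + b1 * height_cm

# Hypothetical bounds, for illustration only.
OBSERVED_MIN, OBSERVED_MAX = 150, 200

inside = predict_in_range(170, OBSERVED_MIN, OBSERVED_MAX)

try:
    predict_in_range(0, OBSERVED_MIN, OBSERVED_MAX)
except ValueError as err:
    print("Refused:", err)
```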
Many lines could go through the data. Which one is “best”?
We need a criterion for “best fit.”
The residual is the difference between observed and predicted:
\[e_i = y_i - \hat{y}_i = \text{Observed} - \text{Predicted}\]
The least squares regression line minimizes:
\[\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\]
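To see the criterion at work, we can compute the sum of squared residuals for two candidate lines and compare them. The five data points below are made up for illustration (they are not the body measurement data), and the second line is an arbitrary alternative.

```python
def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (observed - predicted)^2 for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [160, 165, 170, 175, 180]
weights = [55, 62, 66, 70, 77]

sse_line_a = sum_squared_residuals(-105.0, 1.02, heights, weights)
sse_line_b = sum_squared_residuals(-100.0, 0.95, heights, weights)

# The line with the smaller sum fits these points better under this criterion.
print(round(sse_line_a, 3), round(sse_line_b, 3))
```

The least squares line is the one line for which this sum cannot be made any smaller.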
Why squared?
Software finds \(b_0\) and \(b_1\) that minimize \(\sum e_i^2\):
| Term | Estimate |
|---|---|
| Intercept | -105.01 |
| hgt | 1.02 |
You will learn to do this in Jamovi.
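For the curious, the minimizing values have a closed form: \(b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(b_0 = \bar{y} - b_1 \bar{x}\). These formulas are standard but are not shown in these slides, and the sketch below applies them to a small made-up data set rather than the real body measurements.

```python
def fit_least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [160, 165, 170, 175, 180]
weights = [55, 62, 66, 70, 77]

b0, b1 = fit_least_squares(heights, weights)
print(round(b0, 2), round(b1, 2))
```

Statistical software performs the same calculation and reports the two values in a coefficient table.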
The correlation coefficient measures the strength AND direction of a linear relationship.
\[-1 \leq r \leq 1\]
| Value of r | Interpretation |
|---|---|
| r close to +1 | Strong positive linear relationship |
| r close to -1 | Strong negative linear relationship |
| r close to 0 | Weak or no linear relationship |
Scatter plots with different correlations. From IMS2 Figure 7.10.
Key properties:
\(r = 0.717\) indicates a moderately strong positive linear relationship.
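The correlation coefficient can be computed directly from its standard formula, \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). The sketch below uses a small made-up data set (not the body measurement data) and also checks that reversing the direction of one variable flips the sign of \(r\) without changing its size.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r between two numerical lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [160, 165, 170, 175, 180]
weights = [55, 62, 66, 70, 77]

r = correlation(heights, weights)
r_flipped = correlation(heights, [-w for w in weights])

# Same strength, opposite direction.
print(round(r, 3), round(r_flipped, 3))
```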
R² measures how well the model fits the data.
\[R^2 = r^2 \quad \text{(for simple linear regression)}\]
Interpretation: The proportion of variability in \(y\) that is explained by \(x\).
For our height-weight data:
\[R^2 = r^2 = (0.717)^2 \approx 0.514\]
Height explains about 51.4% of the variability in weight.
The remaining 48.6% is due to other factors (muscle mass, bone density, etc.).
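The "proportion of variability explained" reading comes from an equivalent formula, \(R^2 = 1 - \text{SSE}/\text{SST}\), where SSE is the sum of squared residuals and SST is the total variability of \(y\) around its mean. That formula is standard but not shown in these slides. The sketch below checks on a small made-up data set that it agrees with \(r^2\).

```python
import math

# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [160, 165, 170, 175, 180]
weights = [55, 62, 66, 70, 77]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
sxx = sum((x - x_bar) ** 2 for x in heights)
syy = sum((y - y_bar) ** 2 for y in weights)

# Least squares line and correlation.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
r = sxy / math.sqrt(sxx * syy)

# Variability left over after fitting (SSE) vs. total variability (SST).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(heights, weights))
sst = syy
r_squared = 1 - sse / sst

print(round(r ** 2, 4), round(r_squared, 4))
```

The two printed values match, which is why \(R^2\) can be read off as \(r^2\) in simple linear regression.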
A residual plot shows residuals vs. predicted values.
What to look for: No obvious patterns → linear model is appropriate.
Scatter plots (top) and residual plots (bottom). From IMS2.
Outliers: Points far from the overall pattern.
High leverage points: Points with extreme x-values.
Influential points: Points that actually change the regression line substantially.
Test: Remove the point, refit the line. If the slope changes a lot, the point is influential.
Key insight: A point is influential when it has high leverage and does not follow the pattern of the rest of the data.
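The remove-and-refit test is easy to sketch in code. The data below are made up for illustration: five points that follow a clear upward trend, plus one point with an extreme x-value that does not follow it. The helper `fit_least_squares` uses the standard closed-form least squares formulas.

```python
def fit_least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical data; the last point (210, 60) has high leverage
# and falls far below the trend of the others.
heights = [160, 165, 170, 175, 180, 210]
weights = [55, 62, 66, 70, 77, 60]

_, slope_with = fit_least_squares(heights, weights)
_, slope_without = fit_least_squares(heights[:-1], weights[:-1])

# A large change in slope marks the removed point as influential.
print(round(slope_with, 3), round(slope_without, 3))
```

Here a single point pulls the slope from about 1.04 down to nearly 0, so it is influential.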
Scatter plots with outliers. From IMS2 Figure 7.16.
Which points are high leverage? Which are influential?
| Component | Description |
|---|---|
| Model | \(\hat{y} = b_0 + b_1 x\) |
| Slope (\(b_1\)) | Change in y per unit increase in x |
| Intercept (\(b_0\)) | Predicted y when x = 0 |
| Residual | \(e = y - \hat{y}\) (observed - predicted) |
| Correlation (r) | Strength and direction of linear relationship |
| R² | Proportion of variability explained by model |
Next, we’ll learn: