bdims body measurement dataset is available here.
507 physically active individuals (247 men, 260 women)
age, weight (wgt), height (hgt), sex, 21 body girth variables (e.g., hip girth)
Observations of wgt vs. hgt and least squares line for the entire population.
bdims data
Observations of wgt vs. hgt and least squares line for first sample of 20.
Sample 1
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -186.302648 | 47.3081367 | -3.938068 | 0.0009641 |
| hgt | 1.507037 | 0.2759548 | 5.461175 | 0.0000346 |
Observations of wgt vs. hgt and least squares lines for first two samples of 20.
Sample 2
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -186.302648 | 47.3081367 | -3.938068 | 0.0009641 |
| hgt | 1.507037 | 0.2759548 | 5.461175 | 0.0000346 |
Observations of wgt vs. hgt and least squares lines for first three samples of 20.
Sample 3
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -186.302648 | 47.3081367 | -3.938068 | 0.0009641 |
| hgt | 1.507037 | 0.2759548 | 5.461175 | 0.0000346 |
Least squares lines for 100 random samples of 20.
Dotplot of slopes of least squares lines from 100 random samples.
| n | mean | sd |
|---|---|---|
| 100 | 1.009732 | 0.220826 |
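The repeated-sampling idea summarized above can be sketched in a few lines. This is only an illustration under stated assumptions: the population below is synthetic (it is not the bdims data), and the helper names `slope` and `sample_slopes` are hypothetical, so the printed numbers will not match the table.

```python
import random
import statistics

def slope(x, y):
    """Least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def sample_slopes(x, y, n_samples=100, sample_size=20, seed=0):
    """Draw n_samples random samples (without replacement) and fit a slope to each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        idx = rng.sample(range(len(x)), sample_size)
        slopes.append(slope([x[i] for i in idx], [y[i] for i in idx]))
    return slopes

# Synthetic stand-in population (an assumption, NOT bdims): heights in cm, weights in kg.
rng = random.Random(1)
pop_hgt = [rng.uniform(150, 195) for _ in range(507)]
pop_wgt = [-105 + 1.0 * h + rng.gauss(0, 9) for h in pop_hgt]

slopes = sample_slopes(pop_hgt, pop_wgt)
print(len(slopes), statistics.mean(slopes), statistics.stdev(slopes))
```

The mean and standard deviation printed here play the same role as the `mean` and `sd` columns of the summary table: they describe how much the fitted slope varies from sample to sample.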
Let us run a test to see if there is significant evidence of a linear relationship between weight and height
Since the direction of the test is not indicated, we will use a two-sided alternative: \[H_0:\beta_1=0\] \[H_A:\beta_1 \ne 0\]
We can randomly permute the values of the response (wgt) to simulate the null hypothesis
Each time, compute the slope of the relationship between wgt and hgt
p-value \(\approx0\)
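A minimal sketch of this permutation test, assuming synthetic data in place of bdims and hypothetical helper names (`slope`, `permutation_p_value`). It shuffles the response, refits the slope each time, and reports the proportion of shuffled slopes at least as extreme as the observed one in either direction.

```python
import random

def slope(x, y):
    """Least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def permutation_p_value(x, y, reps=1000, seed=0):
    """Two-sided permutation p-value for H0: beta_1 = 0."""
    rng = random.Random(seed)
    observed = slope(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # permuting the response breaks any link with x
        if abs(slope(x, shuffled)) >= abs(observed):
            extreme += 1
    return extreme / reps

# Synthetic stand-in data (an assumption, NOT bdims): heights in cm, weights in kg.
rng = random.Random(1)
hgt = [rng.uniform(150, 195) for _ in range(50)]
wgt = [-105 + 1.0 * h + rng.gauss(0, 9) for h in hgt]

p_value = permutation_p_value(hgt, wgt)
print(p_value)
```

A p-value of 0 from such a simulation means that none of the shuffled slopes was as extreme as the observed slope, which is why the result is reported as approximately 0 rather than exactly 0.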
Note
When the null hypothesis is true and the following conditions are met, the \(T\) score has a \(t\)-distribution with \(df=n-2\) degrees of freedom.
One way to check conditions is to look at residual plots.
Linearity? Independent observations? Normality of residuals? Constant variability?
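As a rough sketch of what goes into a residual plot, the fitted values and residuals can be computed directly from the least squares line. The six data points and the helper names (`fit_line`, `residuals_vs_fitted`) are made up for illustration; plotting the printed pairs (fitted value on the horizontal axis, residual on the vertical axis) gives the residual plot.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def residuals_vs_fitted(x, y):
    """Return fitted values and residuals (observed minus fitted)."""
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * a for a in x]
    resid = [b - f for b, f in zip(y, fitted)]
    return fitted, resid

# Made-up illustrative data (an assumption, not from bdims).
x = [160, 165, 170, 175, 180, 185]
y = [55, 62, 66, 70, 79, 82]

fitted, resid = residuals_vs_fitted(x, y)
for f, r in zip(fitted, resid):
    print(f"{f:7.2f} {r:6.2f}")
```

A curved pattern in these pairs would cast doubt on linearity, and a fan shape would cast doubt on constant variability.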

| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -105.011254 | 7.5394092 | -13.92831 | 0 |
| hgt | 1.017617 | 0.0439868 | 23.13459 | 0 |
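As a worked check, the `statistic` entry for hgt in the table above is consistent with the usual slope test statistic: the estimate minus the null value 0, divided by its standard error. The symbols \(b_1\) and \(SE(b_1)\) are notation assumed here for the estimated slope and its standard error. With \(n=507\) observations, \(df=507-2=505\).

\[T=\frac{b_1-0}{SE(b_1)}=\frac{1.017617}{0.0439868}\approx 23.13\]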
We can also calculate a 95% bootstrap percentile confidence interval based on the entire sample
We will create bootstrapped samples and calculate resulting slopes of the regression lines
Then we will create the distribution of the bootstrapped slopes

95% bootstrap percentile confidence interval: \((0.933, 1.10)\)
We are 95% confident that the slope is between 0.933 and 1.10, meaning that, on average, weight increases by between 0.933 and 1.10 kilograms for each additional 1 cm of height.
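The bootstrap percentile procedure can be sketched as follows. This is an illustration under assumptions: the data are synthetic (not bdims), the helper names are hypothetical, and the percentiles are approximated by order statistics of the sorted bootstrap slopes.

```python
import random

def slope(x, y):
    """Least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def bootstrap_percentile_ci(x, y, reps=2000, level=0.95, seed=0):
    """Percentile confidence interval for the slope from resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(reps):
        # Resample whole observations with replacement, keeping each pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(slope([x[i] for i in idx], [y[i] for i in idx]))
    slopes.sort()
    tail = (1 - level) / 2
    lower = slopes[round(tail * reps)]
    upper = slopes[round((1 - tail) * reps) - 1]
    return lower, upper

# Synthetic stand-in data (an assumption, NOT bdims): heights in cm, weights in kg.
rng = random.Random(1)
hgt = [rng.uniform(150, 195) for _ in range(50)]
wgt = [-105 + 1.0 * h + rng.gauss(0, 9) for h in hgt]

ci_low, ci_high = bootstrap_percentile_ci(hgt, wgt)
print(ci_low, ci_high)
```

Resampling pairs rather than the response alone preserves the relationship between the two variables, which is what distinguishes the bootstrap from the permutation approach used for the test.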
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -105.011254 | 7.5394092 | -13.92831 | 0 |
| hgt | 1.017617 | 0.0439868 | 23.13459 | 0 |
The value of \(t^{\ast}_{df}\) depends on the confidence level and degrees of freedom
For example, for a 95% confidence interval with \(df=505\), \(t^{\ast}_{505}=1.965\)
Finally, the 95% confidence interval for the slope of the regression line is \[1.02 \pm 1.965 \cdot 0.044=(0.934,1.106)\]
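The interval arithmetic can be reproduced with a short helper. The function name `t_interval` is hypothetical, and the critical value 1.965 is taken from the text rather than computed.

```python
def t_interval(estimate, std_error, t_star):
    """Confidence interval: estimate plus or minus t_star times the standard error."""
    margin = t_star * std_error
    return estimate - margin, estimate + margin

# Rounded inputs as used in the text (slope 1.02, standard error 0.044).
lower, upper = t_interval(1.02, 0.044, 1.965)
print(round(lower, 3), round(upper, 3))

# Unrounded table values give a slightly different interval.
lower_full, upper_full = t_interval(1.017617, 0.0439868, 1.965)
print(round(lower_full, 3), round(upper_full, 3))
```

Carrying the unrounded estimate and standard error through the calculation and rounding only at the end shifts both endpoints by a few thousandths, so small differences from the bootstrap interval are expected.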
restNYC dataset is available here
Price (USD, includes tip and drink)
Food (rating: 1 to 30)
Scatter plot of Price vs Food with least squares line.
Least squares regression line \[\widehat{Price}=-17.8+2.94\times Food\]
Recall that we can use Jamovi to find the equation for the regression line (see, e.g., J Lab 4)
Linearity? Independent observations? Normality of residuals? Constant variability?
Residual plot.
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -17.83215 | 5.8631197 | -3.04141 | 0.0027375 |
| Food | 2.93896 | 0.2833809 | 10.37106 | 0.0000000 |
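We can randomly permute the values of the response (Price) to simulate the null hypothesis
Each time, compute the slope of the relationship between Price and Food
Histogram of slopes from different random permutations of Price (null distribution).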
The proportion of values \(\ge 2.94\) or \(\le -2.94\) is 0, so p-value \(\approx0\)

95% bootstrap percentile confidence interval: (2.38, 3.45)
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -17.83215 | 5.8631197 | -3.04141 | 0.0027375 |
| Food | 2.93896 | 0.2833809 | 10.37106 | 0.0000000 |
Since \(df = 166\), \(t^{\ast}_{df}=1.974\) for a 95% CI
The 95% CI is \(2.94\pm1.974\times0.283=(2.38, 3.50)\).