IMS1 Ch. 24
Math 215
bdims: body measurement dataset.
507 physically active individuals (247 men, 260 women)
age, weight (wgt), height (hgt), sex, 21 body girth variables (e.g., hip girth)
bdims data
Observations of wgt vs. hgt and least squares line for first sample of 20.
Sample 1
# A tibble: 2 × 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)  -186.      47.3       -3.94 0.000964
2 hgt             1.51     0.276      5.46 0.0000346
Observations of wgt vs. hgt and least squares lines for first two samples of 20.
Sample 2
# A tibble: 2 × 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)  -119.      42.8       -2.77  0.0125
2 hgt             1.10     0.247      4.47  0.000299
Observations of wgt vs. hgt and least squares lines for first three samples of 20.
Sample 3
# A tibble: 2 × 5
  term        estimate std.error statistic    p.value
  <chr>          <dbl>     <dbl>     <dbl>      <dbl>
1 (Intercept)  -117.      26.2       -4.46 0.000299
2 hgt             1.07     0.151      7.05 0.00000140
Least squares lines for 100 random samples of 20.
Dotplot of slopes of least squares lines from 100 random samples.
# A tibble: 1 × 3
      n  mean    sd
  <int> <dbl> <dbl>
1   100  1.01 0.221
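The repeated-sampling idea behind this summary can be sketched in a few lines. This is a minimal illustration in Python (standard library only), not the code that produced the slides: the population below is synthetic, and the names `ls_slope`, `population`, and `TRUE_SLOPE` are hypothetical.

```python
import random
import statistics

def ls_slope(x, y):
    """Least squares slope of y on x: Sxy / Sxx."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(215)  # fixed seed so the run is repeatable

# Assumption: a made-up population of 507 (hgt, wgt) pairs, NOT the real bdims data.
TRUE_SLOPE = 1.0
population = []
for _ in range(507):
    hgt = rng.gauss(171, 9)
    wgt = -100 + TRUE_SLOPE * hgt + rng.gauss(0, 9)
    population.append((hgt, wgt))

# Draw 100 random samples of 20 and record the slope of each least squares line.
slopes = []
for _ in range(100):
    sample = rng.sample(population, 20)
    x = [pair[0] for pair in sample]
    y = [pair[1] for pair in sample]
    slopes.append(ls_slope(x, y))

# Same three summaries as the table above: count, mean, and sd of the slopes.
summary = {
    "n": len(slopes),
    "mean": statistics.fmean(slopes),
    "sd": statistics.stdev(slopes),
}
print(summary)
```

The sd of the 100 slopes is an empirical stand-in for the standard error that a single regression output reports for the slope.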
Note
When the null hypothesis is true and the following conditions are met, the \(T\) score has a \(t\)-distribution with \(df=n-2\) degrees of freedom.
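As a worked check, the \(T\) score for the slope can be written out and applied to the Sample 1 output. The symbols \(b_1\) (estimated slope) and \(SE(b_1)\) (its standard error) are notation assumed here, with the null hypothesis taken to be \(\beta_1 = 0\).

```latex
\[
T = \frac{b_1 - 0}{SE(b_1)}, \qquad df = n - 2
\]
\[
\text{Sample 1: } T \approx \frac{1.51}{0.276} \approx 5.47, \qquad df = 20 - 2 = 18
\]
```

The printed statistic is 5.46; the small gap is presumably because the displayed estimate and standard error are rounded.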
One way to check conditions is to look at residual plots.
restNYC dataset
Price (USD, includes tip and drink)
Food (rating: 1 to 30)
Scatter plot of Price vs Food with least squares line.
Linearity? Independent observations? Normality of residuals? Constant variability?
Residual plot.
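A residual plot is built from the differences between observed and fitted values. A minimal Python sketch of that computation follows; the `food` and `price` values are simulated stand-ins rather than the restNYC data, and `fit_line` is a hypothetical helper.

```python
import random
import statistics

def fit_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

rng = random.Random(24)  # fixed seed so the run is repeatable

# Assumption: simulated ratings and prices, NOT the real restNYC data.
food = [rng.uniform(16, 25) for _ in range(50)]
price = [-18 + 3 * f + rng.gauss(0, 6) for f in food]

b0, b1 = fit_line(food, price)
fitted = [b0 + b1 * f for f in food]
residuals = [p - fhat for p, fhat in zip(price, fitted)]

# Rough constant-variability check: compare the residual spread for the
# lower and upper halves of the fitted values.
order = sorted(range(len(fitted)), key=lambda i: fitted[i])
half = len(order) // 2
sd_low = statistics.stdev([residuals[i] for i in order[:half]])
sd_high = statistics.stdev([residuals[i] for i in order[half:]])
print(round(sd_low, 2), round(sd_high, 2))
```

Plotting `residuals` against `fitted` gives the residual plot; curvature suggests a linearity problem, and a fan shape suggests non-constant variability.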
# A tibble: 2 × 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)   -17.8      5.86      -3.04  2.74e- 3
2 Food            2.94     0.283     10.4   9.63e-20
Permute the response (Price) to simulate the null hypothesis of no linear relationship between Price and Food.
Histogram of slopes from different random permutations of Price (null distribution).
p-value \(\approx0\)
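One way the randomization test might be carried out is sketched below in Python. The data are simulated stand-ins for restNYC, the choice of 1000 permutations is an assumption, and `ls_slope` is a hypothetical helper.

```python
import random
import statistics

def ls_slope(x, y):
    """Least squares slope of y on x: Sxy / Sxx."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumption: simulated ratings and prices, NOT the real restNYC data.
food = [rng.uniform(16, 25) for _ in range(50)]
price = [-18 + 3 * f + rng.gauss(0, 6) for f in food]

observed = ls_slope(food, price)

# Shuffling Price breaks any link with Food, which mimics the null hypothesis.
null_slopes = []
shuffled = price[:]
for _ in range(1000):
    rng.shuffle(shuffled)
    null_slopes.append(ls_slope(food, shuffled))

# Two-sided p-value: share of permuted slopes at least as extreme as the observed one.
p_value = sum(abs(s) >= abs(observed) for s in null_slopes) / len(null_slopes)
print(observed, p_value)
```

When the observed slope sits far outside the null distribution, few or no permuted slopes are as extreme, which is what a p-value of approximately 0 reflects.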
Histogram of slopes from bootstrapped data.
95% bootstrap percentile confidence interval: (2.38, 3.45)
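A bootstrap percentile interval of this kind can be sketched as follows. This Python illustration resamples (Food, Price) pairs with replacement; the data are simulated rather than the restNYC values, so the interval it prints will not match the one above, and the 1000 resamples are an assumption.

```python
import random
import statistics

def ls_slope(x, y):
    """Least squares slope of y on x: Sxy / Sxx."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(52)  # fixed seed so the run is repeatable

# Assumption: simulated ratings and prices, NOT the real restNYC data.
food = [rng.uniform(16, 25) for _ in range(50)]
price = [-18 + 3 * f + rng.gauss(0, 6) for f in food]
pairs = list(zip(food, price))
observed = ls_slope(food, price)

# Resample whole (Food, Price) pairs with replacement and refit each time.
boot_slopes = []
for _ in range(1000):
    resample = rng.choices(pairs, k=len(pairs))
    x = [pair[0] for pair in resample]
    y = [pair[1] for pair in resample]
    boot_slopes.append(ls_slope(x, y))

# With n=40, the first and last cut points are the 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(boot_slopes, n=40)
lower, upper = cuts[0], cuts[-1]
print(round(lower, 2), round(upper, 2))
```

The interval is read directly off the bootstrap distribution, so it does not rely on the \(t\)-distribution conditions.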