
We will estimate the difference in mean birth weights \(\mu_n-\mu_s\) using a confidence interval.
births14 dataset:

- habit is smoking habit (“smoker” or “nonsmoker”)
- weight is birth weight in pounds

| habit | n | mean | sd |
|---|---|---|---|
| nonsmoker | 867 | 7.27 | 1.23 |
| smoker | 114 | 6.68 | 1.60 |
The observed difference in means is \[\begin{array}{lcr}\bar{x}_n-\bar{x}_s &=& 7.27-6.68\\ &=& 0.59\end{array}\]
Histogram of differences in means (null distribution) calculated from 1,000 random permutations of birth weights. Observed difference is 0.59.
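
A permutation null distribution like the one in the histogram can be sketched in base R. This is a minimal illustration under stated assumptions: the data below are simulated stand-ins drawn from normal distributions with the group sizes, means, and standard deviations from the summary table, not the actual births14 observations, and the helper name diff_in_means is hypothetical.

```r
set.seed(1)

# Simulated stand-in data (NOT the real births14 values)
weight <- c(rnorm(867, mean = 7.27, sd = 1.23),
            rnorm(114, mean = 6.68, sd = 1.60))
habit <- rep(c("nonsmoker", "smoker"), times = c(867, 114))

# Difference in group means, nonsmoker minus smoker
diff_in_means <- function(w, h) {
  mean(w[h == "nonsmoker"]) - mean(w[h == "smoker"])
}

obs_diff <- diff_in_means(weight, habit)

# Shuffle the habit labels 1,000 times to break any association with weight
null_diffs <- replicate(1000, diff_in_means(weight, sample(habit)))

# Two-sided p-value: share of permuted differences at least as extreme
p_value <- mean(abs(null_diffs) >= abs(obs_diff))
```

Shuffling the labels keeps both group sizes fixed, so the permuted differences centre near zero, which is what the null hypothesis of no difference predicts.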
Note
When the null hypothesis is true and the following conditions are met, the \(T\) score has a \(t\)-distribution with \(df=n_1+n_2-2\) degrees of freedom.
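
As a rough sketch of the calculation behind this note, the \(T\) score can be computed from the summary statistics in the table. The formula is not spelled out here, so the code assumes the usual pooled-variance two-sample form, which matches \(df=n_1+n_2-2\). Variable names are illustrative only.

```r
# Summary statistics from the table
n_n <- 867; xbar_n <- 7.27; s_n <- 1.23   # nonsmokers
n_s <- 114; xbar_s <- 6.68; s_s <- 1.60   # smokers

# Pooled standard deviation (assumes equal population variances)
s_pooled <- sqrt(((n_n - 1) * s_n^2 + (n_s - 1) * s_s^2) / (n_n + n_s - 2))

# Standard error of the difference in sample means
se_pooled <- s_pooled * sqrt(1 / n_n + 1 / n_s)

# T score under the null hypothesis mu_n - mu_s = 0
t_score <- (xbar_n - xbar_s) / se_pooled
df <- n_n + n_s - 2

# Two-sided p-value from the t-distribution
p_value <- 2 * pt(abs(t_score), df = df, lower.tail = FALSE)
```

Because the inputs are rounded summary statistics, the result will differ slightly from a calculation on the raw data.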
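Use the t_test function in the infer package to calculate a p-value:

- With var.equal = TRUE, these calculations will use the equal variance assumption
- var.equal = FALSE relaxes the equal variance assumption

```r
library(infer)
library(dplyr)
library(openintro)  # assumed source of the births14 data

# Bootstrap distribution of the difference in means (nonsmoker - smoker)
births_boot <- births14 |>
  specify(weight ~ habit) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "diff in means", order = c("nonsmoker", "smoker"))

# 95% percentile interval from the bootstrap distribution
births_boot |>
  summarize(ci_lo = quantile(stat, 0.025),
            ci_hi = quantile(stat, 0.975))
```

```
# A tibble: 1 × 2
  ci_lo ci_hi
  <dbl> <dbl>
1 0.312 0.917
```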
Use the t_test function to calculate the confidence interval.
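
For comparison with the bootstrap interval, a theory-based 95% interval can be sketched from the summary statistics alone. This is a base R sketch, not the t_test call itself. It assumes the unequal-variance (Welch) standard error with Welch-Satterthwaite degrees of freedom, which is what var.equal = FALSE corresponds to in stats::t.test. Variable names are illustrative.

```r
# Summary statistics from the table
n_n <- 867; xbar_n <- 7.27; s_n <- 1.23   # nonsmokers
n_s <- 114; xbar_s <- 6.68; s_s <- 1.60   # smokers

# Per-group variance of the sample mean
v_n <- s_n^2 / n_n
v_s <- s_s^2 / n_s

# Welch standard error and Welch-Satterthwaite degrees of freedom
se_welch <- sqrt(v_n + v_s)
df_welch <- (v_n + v_s)^2 / (v_n^2 / (n_n - 1) + v_s^2 / (n_s - 1))

# 95% confidence interval for mu_n - mu_s
t_star <- qt(0.975, df = df_welch)
ci <- (xbar_n - xbar_s) + c(-1, 1) * t_star * se_welch
names(ci) <- c("ci_lo", "ci_hi")
ci
```

Since this uses rounded summary statistics, expect small differences from what t_test reports on the raw data.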