Compare Two Independent Means

Chapter 20
Math 215

Birth Weights and Smoking

  • Do infants whose mothers smoke have a different mean birth weight than infants whose mothers do not smoke?
  • Let \(\mu_n\) be the mean weight (lbs) of infants whose mothers did not smoke, and let \(\mu_s\) be the mean for infants whose mothers smoked

Inference

  • We will estimate the difference in mean birth weights \(\mu_n-\mu_s\) using a confidence interval
  • We will conduct a hypothesis test with hypotheses
    • \(H_0: \mu_n-\mu_s = 0\)
    • \(H_A: \mu_n-\mu_s \neq 0\)

Data

  • births14 1 dataset
  • Random sample of 1,000 cases from US birth data set from 2014 (19 removed with missing values)
  • habit is smoking habit (“smoker” or nonsmoker”)
  • weight is birth weight in pounds

EDA

Histrograms showing birth weights for infants whose mothers did not smoke (top) for infants whose mothers smoked (bottom).
habit n mean sd
nonsmoker 867 7.27 1.23
smoker 114 6.68 1.60

The observed difference in means is \[\begin{array}{lcr}\bar{x}_n-\bar{x}_s &=& 7.27-6.68\\ &=& 0.59\end{array}\]

Randomization Test for Difference in Means

  • We can simulate a true null hypothesis by randomly permuting the values of the response variable
set.seed(8675309)

library(infer)
births_perm <- births14 |>
  specify(weight ~ habit) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "diff in means", order = c("nonsmoker", "smoker"))

Histogram of differences in means (null distribution) calculated from 1,000 random permutations of birth weights. Observed difference is indicated by dashed vertical line.

  • The p-value is the proportion of randomized differences that are at least as extreme as the observed value (\(\leq\) -0.59 or \(\geq\) 0.59)
  • There are no such randomized differences, so the p-value is approximately 0
births_perm |>
  summarize(n_extreme = sum(abs(stat >= dif_mean)),
            pval = mean(abs(stat) >= dif_mean))
# A tibble: 1 × 2
  n_extreme  pval
      <int> <dbl>
1         0     0

Test Statistic for Comparing Two Means

  • The test statistic for comparing two means is the \(T\) statistic (\(T\) score)
  • We will use a version of the \(T\) statistic that assumes the two populations have equal variance (different than the version presented in the text)

Pooled Sample Standard Deviation

  • First we compute the pooled sample standard deviation, \[s_p = \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}\]
  • The pooled sample standard deviation in birth weights is \[\begin{array}{rcl} s_p &=& \sqrt{\frac{(867-1)\cdot 1.23^2+(114-1)\cdot 1.60^2}{867+114-2}}\\ &=& 1.28\end{array}\]
  • The \(T\) statistic is \[T=\frac{(\bar{x}_1-\bar{x}_2)-0}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\]
  • For the birth weight example, the value is \[T=\frac{0.59-0}{1.28\cdot\sqrt{\frac{1}{867}+\frac{1}{114}}} = 4.63\]

Mathematical Model for Testing the Difference in Means

Note

When the null hypothesis is true and the following conditions are met, the \(T\) score has a \(t\)-distribution with \(df=n_1+n_2-2\) degrees of freedom.

  1. Groups have equal variance in the population
  2. Independent observations within and between groups
  3. Normality: Large samples and no extreme outliers.

Two Sample T-Test

  • The degrees of freedom for the birth weight example is \(df=867+114-2=979\).
  • The p-value is extremely small
2*pt(-4.63, df = 979)
[1] 4.148965e-06

Note About Relaxing the Equal Variance Assumption

  • We can compute a \(T\) statistic without assuming equal variances (see the formula in the book)
  • If the null hypothesis is true and the technical conditions are met, then the distribution of these \(T\) statistics will be be approximately \(t\)-distributed
  • The \(df\) for the approximating \(t\) distribution involves a complicated calculation (not the one listed in the text)
  • We can use the t_test function in the infer package to calculate a p-value
  • It calculates the \(T\) statistic, \(df\), and the p-value for us
  • It does NOT check conditions
  • If we specify the option var.equal = TRUE, these calculations will use the equal variance assumption
  • The value of \(T\) and the p-value differ from ours due to rounding
births14 |> 
  t_test(weight ~ habit,
         var.equal = TRUE,
         order = c("nonsmoker", "smoker"))
# A tibble: 1 × 7
  statistic  t_df    p_value alternative estimate lower_ci upper_ci
      <dbl> <dbl>      <dbl> <chr>          <dbl>    <dbl>    <dbl>
1      4.65   979 0.00000382 two.sided      0.593    0.342    0.843
  • Specifying var.equal = TRUE relaxes the equal variance assumption
  • Note that \(df\) is no longer an integer
births14 |> 
  t_test(weight ~ habit,
         var.equal = FALSE,
         order = c("nonsmoker", "smoker"))
# A tibble: 1 × 7
  statistic  t_df  p_value alternative estimate lower_ci upper_ci
      <dbl> <dbl>    <dbl> <chr>          <dbl>    <dbl>    <dbl>
1      3.82  131. 0.000208 two.sided      0.593    0.285    0.900

Bootstrap Confidence Interval for Difference in Means

  • The method for calculating bootsrap confidence intervals for a difference in means is similar to the methods we used for a difference in proportions
  • We create bootstrap samples from each group (resampling with replacement) and calculate a difference in means
  • We do this 1,000 time and use the resulting sampling distribution to calculate a bootstrap percentile CI or a boostrap SE CI
  • Lets calculate differences in bootstrapped means for the birth weight example
  • The 95% boostrap percentile CI is (0.312, 0.917)
births_boot <- births14 |>
  specify(weight ~ habit) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "diff in means", order = c("nonsmoker", "smoker"))

births_boot |>
  summarize(ci_lo = quantile(stat, 0.025),
            ci_hi = quantile(stat, 0.975))
# A tibble: 1 × 2
  ci_lo ci_hi
  <dbl> <dbl>
1 0.312 0.917

Estimating the Difference in Means Using a Mathematical Model

  • If the technical conditions are met, including the equal variance assumption, then we can use the \(t\)-distribution to estimate the difference in means
  • We can calculate a confidence interval for the difference in means as \[(\bar{x}_1-\bar{x}_2)\pm t^{\ast}_{df}\times SE\]
  • Assuming equal variance, \(df=n_1+n_2-2\), and the standard error is \[SE = s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}\]
  • The value of \(t^{\ast}_{df}\) depends on the degrees of freedom and the confidence level
  • For the birth weights example \[SE=1.28\cdot\sqrt{\frac{1}{867}+\frac{1}{114}}=0.128\]
  • Since \(df=979\) for this example, the value of \(t^{\ast}_{df}\) for a 95% confidence interval is 1.96
  • Thus, the 95% confidence interval is \[0.59\pm1.96\times0.128=0.59\pm0.251\]
  • In interval form it is approximately (0.339, 0.841)
qt(0.975, df = 979)
[1] 1.96239

Relaxing the Equal Variance Assumption for CI

  • We can also use the t_test function to calculate CI
  • Here is the version with the equal variance assumption
births14 |> 
  t_test(weight ~ habit,
         var.equal = TRUE,
         conf_int = TRUE,
         conf_level = 0.95,
         order = c("nonsmoker", "smoker"))
# A tibble: 1 × 7
  statistic  t_df    p_value alternative estimate lower_ci upper_ci
      <dbl> <dbl>      <dbl> <chr>          <dbl>    <dbl>    <dbl>
1      4.65   979 0.00000382 two.sided      0.593    0.342    0.843
  • Here is the confidence interval calculated without assuming equal variances
births14 |> 
  t_test(weight ~ habit,
         var.equal = FALSE,
         conf_int = TRUE,
         conf_level = 0.95,
         order = c("nonsmoker", "smoker"))
# A tibble: 1 × 7
  statistic  t_df  p_value alternative estimate lower_ci upper_ci
      <dbl> <dbl>    <dbl> <chr>          <dbl>    <dbl>    <dbl>
1      3.82  131. 0.000208 two.sided      0.593    0.285    0.900

Conclusions

  • We get the same result from both versions of the hypothesis test.
  • We reject the null hypothesis at the \(\alpha=0.05\) significance level and conclude that there is strong evidence of a difference in the average weights of infants born to mothers who smoked and those who did not (p < 0.001)
  • We are 95% confident that the mean weight of babies born to mothers who did not smoke is between 0.285 and 0.900 pounds higher than the mean weight of babies whos mothers smoked
  • This result is also consistent with the result of the hypothesis test, since 0 does not appear in the 95% confidence interval