Inference for Two Independent Means

Topic 12

Math 115

Comparing Two Groups

Previously, we made inferences about a single mean from one group.

Now we extend to comparing two independent groups.

Research question: Is there a difference between the population means of two groups?

Same framework:

  • CI: Point estimate ± margin of error
  • Hypothesis testing with p-values

Data Structure for Two Means

When comparing two means:

  • Response variable: Numerical
  • Explanatory variable: Binary categorical (defines the two groups)

This is the same structure as two-proportion inference, but now the response is numerical instead of categorical.

Statistics and Parameters for Two Means

| Quantity | Statistic (sample) | Parameter (population) |
|---|---|---|
| Group 1 mean | \(\bar{x}_1\) | \(\mu_1\) |
| Group 2 mean | \(\bar{x}_2\) | \(\mu_2\) |
| Difference | \(\bar{x}_1 - \bar{x}_2\) | \(\mu_1 - \mu_2\) |

Statistic of interest: \(\bar{x}_1 - \bar{x}_2\) (difference in sample means)

Goal: Make inferences about \(\mu_1 - \mu_2\) (difference in population means)

Birth Weights and Smoking

Do infants whose mothers smoke have different mean birth weights?

  • Response: Birth weight (pounds) — quantitative
  • Explanatory: Smoking habit (smoker/nonsmoker) — binary categorical
  • Data: Random sample of 981 births from 2014

Research question: Is the mean birth weight different for babies born to mothers who smoked vs. those who did not?

EDA: Birth Weights

| Group | n | \(\bar{x}\) (lbs) | s (lbs) |
|---|---|---|---|
| Nonsmoker | 867 | 7.27 | 1.23 |
| Smoker | 114 | 6.68 | 1.6 |

Observed Difference

Point estimate:

\[\bar{x}_n - \bar{x}_s = 7.27 - 6.68 = 0.59 \text{ lbs}\]

Babies born to nonsmoking mothers weigh about 0.59 pounds more on average in our sample.

Question: Is this difference statistically significant, or could it be due to chance?

Pooled Standard Deviation

Assuming the two populations share a common SD, we pool the two sample SDs into a single combined estimate of it, weighting each group by its degrees of freedom.

\[s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\]

For birth weights:

\[s_p = \sqrt{\frac{(867-1) \cdot 1.23^2 + (114-1) \cdot 1.6^2}{867+114-2}} = 1.28\]
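The pooled SD arithmetic can be checked with a few lines of Python. This is a minimal sketch that uses only the summary statistics from the EDA table; the function name `pooled_sd` is an illustrative choice, not part of Jamovi or any course software.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation from two sample sizes and sample SDs."""
    weighted_sum = (n1 - 1) * s1**2 + (n2 - 1) * s2**2
    return math.sqrt(weighted_sum / (n1 + n2 - 2))

# Birth weight summary statistics: nonsmokers first, then smokers
sp = pooled_sd(867, 1.23, 114, 1.6)
print(round(sp, 2))  # 1.28
```

Because the nonsmoker group is much larger, the pooled value sits much closer to 1.23 than to 1.6.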

SE for Difference in Means

The standard error for the difference in sample means:

\[SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\]

For birth weights:

\[SE = 1.28 \times \sqrt{\frac{1}{867} + \frac{1}{114}} = 0.128\]

Hypotheses for Birth Weights

Let \(\mu_n\) = mean birth weight for all babies born to nonsmoking mothers

Let \(\mu_s\) = mean birth weight for all babies born to smoking mothers

Hypotheses:

  • \(H_0: \mu_n - \mu_s = 0\) (no difference in mean birth weights)
  • \(H_A: \mu_n - \mu_s \neq 0\) (there is a difference)

This is a two-sided test.

T-Statistic for Two Means

The T-statistic measures how far the observed difference is from the null value in SE units:

\[T = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}\]

Degrees of freedom: \(df = n_1 + n_2 - 2 = 867 + 114 - 2 = 979\)

For birth weights:

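\[T = \frac{0.59 - 0}{0.128} \approx 4.65\]

(The reported 4.65 appears to come from unrounded inputs; dividing the rounded values shown gives about 4.61.)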
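As a check on the chain from summary statistics to T, the sketch below computes the standard error, the degrees of freedom, and the T-statistic in one function. The function name `pooled_t_statistic` is hypothetical. Since it starts from the rounded means and SDs in the EDA table but carries the pooled SD unrounded, its result (about 4.63) can differ in the second decimal from hand arithmetic or from software run on the raw data.

```python
import math

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2):
    """Return (T, SE, df) for the equal-variance test of H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    t_stat = ((mean1 - mean2) - 0) / se  # null value is 0
    return t_stat, se, df

# Nonsmokers as group 1, smokers as group 2
t_stat, se, df = pooled_t_statistic(7.27, 1.23, 867, 6.68, 1.6, 114)
print(f"T = {t_stat:.2f}, SE = {se:.3f}, df = {df}")
```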

Conditions for Two-Sample T-Test

Independence:

  • Random sample from each population
  • Observations independent within and between groups

Normality:

  • Both groups have n ≥ 30, no extreme outliers

Equal variance:

  • Similar spread in both groups (check histograms/SDs)

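For the birth weight data, both sample sizes (867 and 114) exceed 30 and the sample SDs (1.23 and 1.6) are of similar size, so we treat the conditions as met and use the t-distribution.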

Calculating the P-value

Use Jamovi’s Model-Based Inference calculator with the t-distribution (df = 979). Because the test is two-sided, the p-value is the total area in both tails beyond ±4.65.

P-value < 0.001
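For readers who want to see where the tail area comes from without Jamovi, here is a rough sketch that integrates the t density numerically using only the Python standard library. It is not how Jamovi computes the value; the helper names `t_pdf` and `two_sided_p_value` are illustrative, and Simpson's rule is used here as a simple stand-in for a library routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=20000):
    """Two-sided p-value: 1 minus the area between -|T| and |T|.

    The area from 0 to |T| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    a = abs(t_stat)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)
    return max(0.0, 1.0 - central_area)

p_value = two_sided_p_value(4.65, 979)
print(p_value < 0.001)  # True
```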

Conclusion: Hypothesis Test

Results:

  • T = 4.65
  • P-value < 0.001
  • Using α = 0.05: P-value < 0.05

Decision: Reject \(H_0\)

Conclusion: The data provide strong evidence that the mean birth weight differs between babies born to smoking and nonsmoking mothers.

CI Formula for Difference in Means

When conditions are met:

\[\text{CI} = (\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \times SE\]

Where:

  • \(\bar{x}_1 - \bar{x}_2\) = observed difference in sample means
  • \(t^*_{df}\) = critical value from t-distribution with df = \(n_1 + n_2 - 2\)
  • \(SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)

Birth Weights 95% CI

Given: \(\bar{x}_n - \bar{x}_s = 0.59\), \(SE = 0.128\), \(df = 979\)

Critical value: \(t^*_{979} = 1.962\) (from Jamovi)

95% CI:

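\[0.59 \pm 1.962 \times 0.128 = (0.342, 0.843)\]

(The endpoints appear to reflect unrounded values of the difference and SE; the rounded inputs shown give approximately (0.339, 0.841).)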
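The interval arithmetic is easy to reproduce. The sketch below plugs in the rounded point estimate, SE, and critical value exactly as shown on the slide, so its endpoints can differ in the third decimal from an interval computed on unrounded values. The function name `ci_diff_means` is an illustrative choice.

```python
def ci_diff_means(diff, se, t_star):
    """Confidence interval: point estimate plus or minus t* times SE."""
    margin = t_star * se
    return diff - margin, diff + margin

# Rounded values from the slides; t* = 1.962 is the Jamovi critical value
low, high = ci_diff_means(0.59, 0.128, 1.962)
print(round(low, 3), round(high, 3))  # 0.339 0.841
```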

Interpreting the CI

95% CI: (0.342, 0.843) pounds

Interpretation: We are 95% confident that the mean birth weight for babies born to nonsmoking mothers is between 0.34 and 0.84 pounds higher than for babies born to smoking mothers.

Does CI include 0? No

This is consistent with our hypothesis test—we rejected \(H_0\), and 0 is not in the CI.

Iris Example

  • We compare the mean sepal length between two independent groups of iris flowers (setosa and versicolor) to determine whether their average sepal lengths differ.
  • We also assume that the variability of the two populations of flowers is similar.

| Group | n | Sample mean (cm) | Sample SD (cm) |
|---|---|---|---|
| Setosa | 30 | 5.006 | 0.3525 |
| Versicolor | 70 | 5.936 | 0.5162 |

Research question:
Is the true mean sepal length for setosa different from the true mean sepal length for versicolor?

  • The pooled sample standard deviation is \[s_p = \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}\]

  • The \(T\) statistic is \[T=\frac{(\bar{x}_1-\bar{x}_2)-0}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\]

  • The degrees of freedom are \(df=n_1+n_2-2\).

  • The confidence interval for the difference in means is \[(\bar{x}_1-\bar{x}_2)\pm t^{\ast}_{df}\times SE\] where \(SE = s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}\).
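The formulas above can be applied to the iris summary table as follows. This is a sketch from the summary statistics only, with setosa taken as group 1 and versicolor as group 2 (an assumption about ordering; swapping the groups only flips the sign of T). The critical value \(t^{\ast}_{df}\) still has to come from Jamovi or a t-table, so the interval itself is left to the reader.

```python
import math

# Summary statistics from the iris table (group 1 = setosa, group 2 = versicolor)
n1, mean1, s1 = 30, 5.006, 0.3525
n2, mean2, s2 = 70, 5.936, 0.5162

df = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled SD
se = sp * math.sqrt(1 / n1 + 1 / n2)                        # standard error
t_stat = ((mean1 - mean2) - 0) / se                         # T-statistic

print(f"difference = {mean1 - mean2:.3f} cm")
print(f"s_p = {sp:.4f}, SE = {se:.4f}, df = {df}, T = {t_stat:.2f}")
```

A negative T here simply means the setosa sample mean is below the versicolor sample mean; for the two-sided test only \(|T|\) matters.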

Technical Note: Equal Variance

We used Student’s t-test, which assumes equal population variances.

Alternative: Welch’s t-test

  • Does NOT assume equal variance
  • Uses a different (more complex) df formula
  • Jamovi can calculate both versions

For this course, we use the equal variance version. When variances appear very different, Welch’s test is preferred.
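For reference, the "more complex" df in Welch's test is usually the Welch-Satterthwaite approximation, paired with an unpooled standard error. The sketch below implements that standard formula; it goes beyond what this course requires, and the function name `welch_se_and_df` is illustrative.

```python
import math

def welch_se_and_df(s1, n1, s2, n2):
    """Unpooled SE and Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # per-group variance of the sample mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

# Birth weight summary statistics: nonsmokers, then smokers
se_w, df_w = welch_se_and_df(1.23, 867, 1.6, 114)
print(f"Welch SE = {se_w:.3f}, df = {df_w:.1f}")
```

With the birth weight summary statistics this gives an SE near 0.156 and roughly 131 degrees of freedom, compared with 0.128 and 979 for the pooled version.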

What Affects Strength of Evidence?

For hypothesis testing, stronger evidence = smaller p-value = larger |T|

Recall: \(T = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}\) where \(SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)

Three factors affect |T| (and thus p-value):

  1. Effect size: Larger \(|\bar{x}_1 - \bar{x}_2|\) → larger |T| → smaller p-value
  2. Sample size: Larger \(n_1\) and \(n_2\) → smaller SE → larger |T| → smaller p-value
  3. Variability: Smaller \(s_1\) and \(s_2\) → smaller \(s_p\) → smaller SE → larger |T| → smaller p-value
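The three factors can be seen numerically by changing one input at a time. The sketch below starts from the birth weight summary statistics and then doubles the difference, quadruples both sample sizes, or halves both SDs; these alternative scenarios are hypothetical and chosen only for illustration.

```python
import math

def t_statistic(diff, n1, s1, n2, s2):
    """Pooled two-sample T for an observed difference and a null value of 0."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return diff / se

base = t_statistic(0.59, 867, 1.23, 114, 1.6)
bigger_effect = t_statistic(2 * 0.59, 867, 1.23, 114, 1.6)     # effect size doubled
bigger_samples = t_statistic(0.59, 4 * 867, 1.23, 4 * 114, 1.6)  # both n quadrupled
less_spread = t_statistic(0.59, 867, 1.23 / 2, 114, 1.6 / 2)   # both SDs halved

for label, value in [("baseline", base), ("larger effect", bigger_effect),
                     ("larger samples", bigger_samples), ("less variability", less_spread)]:
    print(f"{label}: T = {value:.2f}")
```

Each change roughly doubles T. Note that the sample sizes had to be quadrupled to match the effect of doubling the difference or halving the SDs, because SE shrinks with the square root of n.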

What Affects CI Width?

For confidence intervals: Width = \(2 \times t^* \times SE\)

Where \(SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)

Three factors affect width:

  1. Confidence level: Higher confidence → larger \(t^*\) → wider CI
  2. Sample size: Larger \(n_1\) and \(n_2\) → smaller SE → narrower CI
  3. Variability: Smaller \(s_1\) and \(s_2\) → smaller \(s_p\) → smaller SE → narrower CI
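To see the effect of the confidence level on width, the sketch below computes approximate widths at 90%, 95%, and 99%. It swaps in the normal critical value \(z^*\) for \(t^*\), an approximation that is reasonable only because the birth weight df (979) is large; for small samples the t critical value from Jamovi should be used instead. The function name `approx_ci_width` is illustrative.

```python
from statistics import NormalDist

def approx_ci_width(se, confidence):
    """Approximate CI width 2 * z* * SE (z* stands in for t*; large df assumed)."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z_star * se

se = 0.128  # birth weight SE from the slides
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI width is about {approx_ci_width(se, level):.3f} lbs")

# Halving the SE (larger samples or less variability) halves the width
print(approx_ci_width(se / 2, 0.95) / approx_ci_width(se, 0.95))
```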

Summary

For comparing two independent means:

| Component | Formula |
|---|---|
| Pooled SD | \(s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\) |
| Standard Error | \(SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\) |
| Degrees of freedom | \(df = n_1 + n_2 - 2\) |
| T-statistic | \(T = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}\) |
| CI | \((\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \times SE\) |

Conditions: Independence + Normality + Equal variance
