Inference for Two Proportions

Normal Distribution

Math 115

Where We’ve Been

Previously, we used randomization for two-proportion inference:

  • Permutation test: Shuffled outcomes to build null distribution
  • Bootstrap CI: Resampled with replacement for confidence interval

These methods work well, but require many simulations.

Question: Is there a mathematical shortcut?

The Mathematical Alternative

Just as we used the normal distribution for one proportion…

We can use the normal distribution for the difference in proportions!

When conditions are met:

  • Results will be similar to randomization methods
  • Can calculate p-values and CIs directly from a formula, without simulation (these are approximations based on the normal model, not exact values)
  • Faster computation

Sampling Distribution of \(\hat{p}_1 - \hat{p}_2\)

When conditions are met, the sampling distribution is approximately normal with:

  • Mean: \(p_1 - p_2\) (the true difference)
  • Standard Error: \(SE = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\)

This parallels what we learned for a single proportion.
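As a quick sanity check of the standard error formula, the sketch below simulates many pairs of independent samples and compares the spread of the simulated differences with the formula. The true proportions (0.35 and 0.22), the sample sizes (40 and 50), the seed, and the function names are illustrative assumptions chosen for this example.

```python
import math
import random
import statistics

def theoretical_se(p1, n1, p2, n2):
    """SE of p1_hat - p2_hat from the formula above (uses the true proportions)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def simulate_differences(p1, n1, p2, n2, reps=2000, seed=0):
    """Draw `reps` pairs of independent samples and return each p1_hat - p2_hat."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in group 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in group 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs

# Illustrative (assumed) true proportions and sample sizes
p1, n1, p2, n2 = 0.35, 40, 0.22, 50
diffs = simulate_differences(p1, n1, p2, n2)

print("mean of simulated differences:", round(statistics.mean(diffs), 3))
print("SD of simulated differences:  ", round(statistics.stdev(diffs), 3))
print("formula SE:                   ", round(theoretical_se(p1, n1, p2, n2), 3))
```

The simulated mean should land near \(p_1 - p_2\) and the simulated SD near the formula SE; the exact digits depend on the seed and the number of repetitions.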

Technical Conditions

To use the normal model for two proportions:

  1. Independence: Data are independent within AND between groups
    • e.g., Random samples or randomized experiment
  2. Success-Failure Condition: At least 10 expected successes AND 10 expected failures in each group

Note: How we check success-failure differs for the H-test (expected counts from the pooled proportion) vs. the CI (observed counts)

The Pooled Proportion

Under \(H_0: p_1 - p_2 = 0\), both groups share a common proportion.

The pooled proportion estimates this common value:

\[\hat{p}_{pool} = \frac{\text{total successes}}{\text{total observations}} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}\]

Important: Used ONLY for hypothesis testing, not for confidence intervals.

Two Different SE Formulas

For Hypothesis Testing (uses pooled proportion):

\[SE = \sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\]

For Confidence Intervals (uses separate proportions):

\[SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\]

Why different? H-test assumes null is true (equal proportions); CI makes no such assumption.
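The two formulas can be written as a pair of small functions. This is a minimal sketch: the function names and the counts (30 successes out of 100 versus 45 out of 120) are made up for illustration.

```python
import math

def pooled_proportion(x1, n1, x2, n2):
    """Total successes divided by total observations."""
    return (x1 + x2) / (n1 + n2)

def se_pooled(x1, n1, x2, n2):
    """SE for a hypothesis test of H0: p1 - p2 = 0 (uses the pooled proportion)."""
    p_pool = pooled_proportion(x1, n1, x2, n2)
    return math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

def se_unpooled(x1, n1, x2, n2):
    """SE for a confidence interval (uses each group's own sample proportion)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    return math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# Hypothetical counts: 30 successes out of 100 vs. 45 successes out of 120
x1, n1, x2, n2 = 30, 100, 45, 120
print("pooled proportion:", round(pooled_proportion(x1, n1, x2, n2), 4))
print("SE for H-test:    ", round(se_pooled(x1, n1, x2, n2), 4))
print("SE for CI:        ", round(se_unpooled(x1, n1, x2, n2), 4))
```

The two SE values are usually close but not identical; they coincide when the two sample proportions are equal.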

Test Statistic for H-Test

The test statistic for a hypothesis test about a difference in proportions is the Z-score:

\[Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE}\]

Where:

  • \(\hat{p}_1 - \hat{p}_2\) is the observed difference in sample proportions
  • 0 is the null value (from \(H_0: p_1 - p_2 = 0\))
  • \(SE = \sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\) (uses pooled proportion)

CPR Study Revisited

Researchers studied whether blood thinners affect survival during CPR. 90 patients were randomly assigned to a treatment group (blood thinner) or a control group.

| Group | Died | Survived | Total |
|---|---|---|---|
| Control | 39 | 11 | 50 |
| Treatment | 26 | 14 | 40 |
| Total | 65 | 25 | 90 |

Observed difference: \(\hat{p}_T - \hat{p}_C = 0.35 - 0.22 = 0.13\)

CPR Study Hypotheses

Let \(p_T\) = true survival rate for treatment, \(p_C\) = true survival rate for control

Hypotheses:

  • \(H_0: p_T - p_C = 0\) (no difference in survival rates)
  • \(H_A: p_T - p_C \neq 0\) (survival rates differ)

This is a two-sided test.

Expected Counts for H-Test

To check the success-failure condition, we calculate expected counts using \(\hat{p}_{pool}\).

Expected successes and failures in each group:

  • Group 1: \(n_1 \times \hat{p}_{pool}\) successes, \(n_1 \times (1-\hat{p}_{pool})\) failures
  • Group 2: \(n_2 \times \hat{p}_{pool}\) successes, \(n_2 \times (1-\hat{p}_{pool})\) failures

All four values must be ≥ 10.

Checking Conditions: H-Test

Independence: Randomized experiment

Success-Failure: Using pooled proportion \(\hat{p}_{pool} = \frac{25}{90} = 0.278\)

  • Treatment: \(40 \times 0.278 = 11.1\) expected successes
  • Treatment: \(40 \times 0.722 = 28.9\) expected failures
  • Control: \(50 \times 0.278 = 13.9\) expected successes
  • Control: \(50 \times 0.722 = 36.1\) expected failures

All ≥ 10, so conditions are met.
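The same check can be scripted. The sketch below recomputes the four expected counts from the CPR table; the helper name `expected_counts` and the dictionary layout are illustrative choices.

```python
def expected_counts(n, p_pool):
    """Expected successes and failures in a group of size n under H0."""
    return n * p_pool, n * (1 - p_pool)

# CPR study counts from the table above
survived = {"Treatment": 14, "Control": 11}
totals = {"Treatment": 40, "Control": 50}

p_pool = sum(survived.values()) / sum(totals.values())  # 25 / 90

all_ok = True
for group, n in totals.items():
    exp_s, exp_f = expected_counts(n, p_pool)
    ok = exp_s >= 10 and exp_f >= 10
    all_ok = all_ok and ok
    print(f"{group}: {exp_s:.1f} expected successes, {exp_f:.1f} expected failures, ok={ok}")

print("Success-failure condition met:", all_ok)
```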

Computing Z and P-value

Pooled SE:

\[SE = \sqrt{0.278 \times 0.722 \times \left(\frac{1}{40} + \frac{1}{50}\right)} = 0.095\]

Z-score:

\[Z = \frac{0.13 - 0}{0.095} = 1.37\]

Calculating the P-value
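For a two-sided test, the p-value is twice the area under the standard normal curve beyond |Z|. A minimal sketch using only Python's standard library follows; the function name `two_proportion_z_test` is an illustrative choice and is not taken from the course software.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 - p2 = 0 using the pooled SE."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat - 0) / se              # null value is 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    return z, p_value

# CPR study: 14 of 40 survived (treatment), 11 of 50 survived (control)
z, p_value = two_proportion_z_test(14, 40, 11, 50)
print(f"Z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

With the CPR counts this should agree with the hand calculation, Z ≈ 1.37 and p-value ≈ 0.171, up to rounding.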

CPR Conclusion: H-Test

Results:

  • Observed difference: \(\hat{p}_T - \hat{p}_C = 0.13\)
  • Z-score = 1.37
  • P-value = 0.171
  • Using \(\alpha = 0.05\): P-value > 0.05, so do not reject \(H_0\)

Conclusion: The data do not provide convincing evidence that blood thinners affect survival rate during CPR.

CI Formula for Difference in Proportions

When conditions are met:

\[\text{CI} = (\hat{p}_1 - \hat{p}_2) \pm z^* \times SE\]

Where SE uses separate sample proportions:

\[SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\]

Multipliers: 90% → 1.645, 95% → 1.96, 99% → 2.576
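These multipliers come from the standard normal distribution: \(z^*\) is the value that leaves half of the leftover probability in each tail. A short sketch (the helper name `z_star` is an assumption for this example):

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value leaving (1 - level) / 2 in each tail of the standard normal."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} -> z* = {z_star(level):.3f}")
```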

Checking Conditions: CI

Independence: Randomized experiment

Success-Failure: Using observed counts (not pooled)

  • Treatment: 14 survived, 26 died (both ≥ 10)
  • Control: 11 survived, 39 died (both ≥ 10)

Conditions are met for using the normal model.

Computing 95% CI

SE (using separate proportions):

\[SE = \sqrt{\frac{0.35 \times 0.65}{40} + \frac{0.22 \times 0.78}{50}} = 0.0955\]

95% CI:

\[0.13 \pm 1.96 \times 0.0955 = 0.13 \pm 0.187\]

\[(-0.057, 0.317)\]
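The interval can be reproduced in a few lines. This is a sketch under the same normal-model assumptions; `two_proportion_ci` is a hypothetical helper name, not a function from the course software.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence_level=0.95):
    """Normal-model CI for p1 - p2 using the unpooled SE."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * se
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

# CPR study: treatment (14 of 40 survived) minus control (11 of 50 survived)
lower, upper = two_proportion_ci(14, 40, 11, 50)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```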

Interpreting the CI

95% Confidence Interval: (-0.057, 0.317)

Interpretation: We are 95% confident that the true difference in survival rates (treatment - control) is between -0.057 and 0.317.

Does CI include 0? Yes → Consistent with failing to reject \(H_0\)

Comparing Methods: Results

| Method | Purpose | CPR Study Result |
|---|---|---|
| Permutation test | H-test | p-value ≈ 0.23 |
| Normal approximation | H-test | p-value = 0.171 |
| Bootstrap CI | Estimation | 95% CI: (-0.065, 0.311) |
| Normal CI | Estimation | 95% CI: (-0.057, 0.317) |

All methods tell the same story: no convincing evidence of a difference.
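For comparison with the normal-approximation p-value, a permutation test can be sketched as below. This is an illustrative implementation, not the course software: the function name, the number of shuffles, and the seed are assumptions, and the simulated p-value will vary slightly from run to run with different seeds.

```python
import random

def permutation_p_value(x1, n1, x2, n2, reps=5000, seed=1):
    """Two-sided permutation p-value for the difference in proportions."""
    rng = random.Random(seed)
    outcomes = [1] * (x1 + x2) + [0] * (n1 + n2 - x1 - x2)  # 1 = success
    observed = abs(x1 / n1 - x2 / n2)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)            # break any link between group and outcome
        s1 = sum(outcomes[:n1])          # successes dealt to group 1
        s2 = (x1 + x2) - s1              # the rest go to group 2
        if abs(s1 / n1 - s2 / n2) >= observed - 1e-12:
            extreme += 1
    return extreme / reps

# CPR study: 14 of 40 survived (treatment), 11 of 50 survived (control)
p_perm = permutation_p_value(14, 40, 11, 50)
print(f"permutation p-value ≈ {p_perm:.3f}")
```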

When to Use Each Method

| | Randomization | Normal Model |
|---|---|---|
| Conditions | Always works | Requires large sample |
| Computation | Many simulations | Formula-based |
| When conditions NOT met | Use this | Results may be unreliable |
| Software | Jamovi Randomize | Jamovi Randomize (Model-Based) |

Why Results Are Similar

  • Both methods target the same quantity: the difference in proportions \(p_1 - p_2\)
  • Normal model approximates the permutation distribution
  • When conditions are met, the approximation is good

What Affects Strength of Evidence?

For hypothesis testing, stronger evidence = smaller p-value = larger |Z|

Recall: \(Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE}\) where \(SE = \sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\)

Three factors affect |Z| (and thus p-value):

  1. Effect size: Larger \(|\hat{p}_1 - \hat{p}_2|\) → larger |Z| → smaller p-value
  2. Sample size: Larger \(n_1\) and \(n_2\) → smaller SE → larger |Z| → smaller p-value
  3. Variability: \(\hat{p}_{pool}\) near 0.5 → larger SE → smaller |Z| → larger p-value
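The three factors can be seen by changing one input at a time. In the sketch below, the baseline uses the CPR sample proportions; the other three scenarios are hypothetical values invented only to isolate each factor.

```python
import math

def z_statistic(p1_hat, n1, p2_hat, n2):
    """Z for H0: p1 - p2 = 0, using the pooled SE."""
    p_pool = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

baseline = z_statistic(0.35, 40, 0.22, 50)          # CPR sample proportions
bigger_effect = z_statistic(0.45, 40, 0.22, 50)     # hypothetical: larger difference
bigger_samples = z_statistic(0.35, 160, 0.22, 200)  # hypothetical: same proportions, 4x the data
near_half = z_statistic(0.565, 40, 0.435, 50)       # hypothetical: same 0.13 difference, pooled value near 0.5

print(f"baseline        Z = {baseline:.2f}")
print(f"bigger effect   Z = {bigger_effect:.2f}")
print(f"bigger samples  Z = {bigger_samples:.2f}")
print(f"near 0.5        Z = {near_half:.2f}")
```

Quadrupling both sample sizes halves the SE, so Z doubles even though the observed difference is unchanged.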

What Affects CI Width?

For confidence intervals: Width = \(2 \times z^* \times SE\)

Where \(SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)

Three factors affect width:

  1. Confidence level: Higher confidence → larger \(z^*\) → wider CI
    • 90%: \(z^* = 1.645\), 95%: \(z^* = 1.96\), 99%: \(z^* = 2.576\)
  2. Sample size: Larger \(n_1\) and \(n_2\) → smaller SE → narrower CI
  3. Variability: Proportions near 0.5 → larger SE → wider CI
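A short sketch makes the first two factors concrete. It uses the CPR sample proportions; the larger sample sizes (160 and 200) and the helper name `ci_width` are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def ci_width(p1_hat, n1, p2_hat, n2, confidence_level):
    """Width = 2 * z* * SE, with the unpooled SE."""
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z_star * se

# Factor 1: confidence level (CPR sample proportions and sizes)
widths = {level: ci_width(0.35, 40, 0.22, 50, level) for level in (0.90, 0.95, 0.99)}
for level, width in widths.items():
    print(f"{level:.0%} CI width: {width:.3f}")

# Factor 2: sample size (hypothetical: same proportions, 4x the data)
w95 = widths[0.95]
w95_big = ci_width(0.35, 160, 0.22, 200, 0.95)
print(f"95% width with 4x the data: {w95_big:.3f}")
```

Because the SE shrinks with the square root of the sample size, four times the data gives an interval half as wide.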

Summary

Sampling distribution of \(\hat{p}_1 - \hat{p}_2\) is approximately normal when conditions are met

For Hypothesis Testing:

  • Use pooled proportion in SE formula
  • Check success-failure using expected counts from \(\hat{p}_{pool}\)

For Confidence Intervals:

  • Use separate proportions in SE formula
  • Check success-failure using observed counts

When conditions met: Normal model gives similar results to randomization
