Inference for Two Proportions

Randomization

Math 115

What We’ve Done vs. What We’ll Do

Previously: Inference for a single proportion

  • One binary categorical variable
  • Hypothesis testing: \(H_0: p = p_0\) (single value)
  • Confidence interval: What is \(p\)?

Now: Inference for comparing two proportions

  • Two binary categorical variables
  • Hypothesis testing: \(H_0: p_1 - p_2 = 0\) (no difference between groups)
  • Confidence interval: What is \(p_1 - p_2\)?

Two Binary Categorical Variables

When we have two binary categorical variables:

  • One explanatory variable (defines the groups)
  • One response variable (defines success/failure)

We compare the proportion of successes between groups

Examples:

  • Stent vs No Stent \(\rightarrow\) Stroke rate
  • Drug vs Placebo \(\rightarrow\) Symptom improvement
  • With sign vs Without sign \(\rightarrow\) Recycling contamination

CPR Study Introduction

Researchers studied whether blood thinners affect survival during CPR

  • Data set cpr
  • 90 patients who received CPR for cardiac arrest
  • Patients randomly assigned to treatment or control
  • Explanatory: Group (treatment = blood thinner, control = none)
  • Response: Outcome (survived at least 24 hours, or died)

Research question: Does blood thinner affect survival rate?

CPR Study Hypotheses

Let \(p_T\) = true survival rate for treatment, \(p_C\) = true survival rate for control

Hypotheses:

  • \(H_0: p_T - p_C = 0\) (no difference in survival rates)
  • \(H_A: p_T - p_C \neq 0\) (survival rates differ between groups)

Sometimes we state the hypotheses in words as

  • \(H_0:\) There is no association between blood thinner use and survival
  • \(H_A:\) There is an association between blood thinner use and survival

CPR Study Data

Group       Died   Survived   Total
Control       39         11      50
Treatment     26         14      40
Total         65         25      90

Survival proportions:

  • Treatment: \(\hat{p}_T = \frac{14}{40} = 0.35\) (35%)
  • Control: \(\hat{p}_C = \frac{11}{50} = 0.22\) (22%)
  • Observed difference: \(\hat{p}_T - \hat{p}_C = 0.13\)
  • This difference is the statistic of interest
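The arithmetic above can be checked with a short Python sketch. Python is used here only for illustration (the slides do not name a software tool), and the names `counts`, `survival_proportion`, and `observed_diff` are made up for this example.

```python
# Counts copied from the CPR two-way table.
counts = {
    "control": {"died": 39, "survived": 11},
    "treatment": {"died": 26, "survived": 14},
}

def survival_proportion(group):
    """Proportion of patients in `group` who survived."""
    cell = counts[group]
    return cell["survived"] / (cell["survived"] + cell["died"])

p_hat_T = survival_proportion("treatment")   # 14 / 40
p_hat_C = survival_proportion("control")     # 11 / 50
observed_diff = p_hat_T - p_hat_C            # the statistic of interest

print(f"p_hat_T = {p_hat_T:.2f}, p_hat_C = {p_hat_C:.2f}, diff = {observed_diff:.2f}")
```

The printed values should match the 0.35, 0.22, and 0.13 shown in the bullets.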

The New Challenge

For single proportion: We used a spinner to simulate outcomes under the null

  • \(H_0: p = 0.10\) \(\rightarrow\) set spinner to 10% success
  • Generate many samples, build null distribution

For two proportions: We’re testing \(H_0: p_1 - p_2 = 0\)

  • We don’t know \(p_1\) or \(p_2\), only that they are equal under the null
  • Can’t build a spinner without knowing the probabilities
  • Need a different approach: permutation test

The Key Insight

Under the null hypothesis (\(p_T = p_C\)):

  • The groups have the same survival rate
  • Whether a patient is in treatment or control doesn’t affect their outcome
  • The outcome is independent of group assignment

Key idea: If group doesn’t matter, we can shuffle the outcomes!

Permutation: Shuffling Response Values

A permutation randomly reassigns the response values (outcomes):

  • Group assignments stay fixed (40 treatment, 50 control)
  • We randomly shuffle which outcome goes with which patient
  • This simulates “what if the outcome had nothing to do with group”

After shuffling, we calculate the difference in proportions

This gives us ONE value from the null distribution
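A single permutation might look like the following Python sketch. It assumes the patient-level data can be rebuilt from the two-way table (the order of patients within a group does not matter). The seed and all variable names are arbitrary choices for illustration.

```python
import random

# Group labels stay fixed: 40 treatment patients, then 50 control patients.
groups = ["treatment"] * 40 + ["control"] * 50
# Observed outcomes (1 = survived, 0 = died), in the same patient order:
# 14 of 40 treatment patients and 11 of 50 control patients survived.
outcomes = [1] * 14 + [0] * 26 + [1] * 11 + [0] * 39

def diff_in_proportions(groups, outcomes):
    """Survival proportion in treatment minus survival proportion in control."""
    treat = [y for g, y in zip(groups, outcomes) if g == "treatment"]
    ctrl = [y for g, y in zip(groups, outcomes) if g == "control"]
    return sum(treat) / len(treat) - sum(ctrl) / len(ctrl)

rng = random.Random(115)   # arbitrary seed, only for reproducibility
shuffled = outcomes[:]     # copy so the observed data stay untouched
rng.shuffle(shuffled)      # outcomes move between patients; labels do not

one_null_value = diff_in_proportions(groups, shuffled)
print(one_null_value)
```

Shuffling keeps the totals fixed (25 survivors, 65 deaths) and only changes how they are split between the two groups.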

Permutation Demo

View interactive permutation demo

The demo shows:

  1. Original data with treatment/control groups
  2. Shuffling outcomes between groups (simulating null)
  3. Calculating difference in proportions after each shuffle
  4. Building the null distribution one permutation at a time

Building the Null Distribution

  • Repeat the permutation many times (1000+)
  • Each permutation gives a difference in proportions
  • Distribution of all these differences = null distribution

The null distribution shows what differences we’d expect if there’s truly no effect (i.e., if \(H_0\) is true)
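Repeating the shuffle can be sketched in Python as below. This is one possible implementation, not the code behind the slides. The function name `permutation_null`, the seed, and the default of 1000 permutations are assumptions made for the example.

```python
import random

# Patient-level data rebuilt from the CPR table (1 = survived, 0 = died).
groups = ["treatment"] * 40 + ["control"] * 50
outcomes = [1] * 14 + [0] * 26 + [1] * 11 + [0] * 39

def diff_in_proportions(groups, outcomes):
    """Survival proportion in treatment minus survival proportion in control."""
    treat = [y for g, y in zip(groups, outcomes) if g == "treatment"]
    ctrl = [y for g, y in zip(groups, outcomes) if g == "control"]
    return sum(treat) / len(treat) - sum(ctrl) / len(ctrl)

def permutation_null(groups, outcomes, n_perm=1000, seed=115):
    """Return n_perm permuted differences in proportions."""
    rng = random.Random(seed)
    shuffled = list(outcomes)          # work on a copy
    diffs = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # one permutation of the outcomes
        diffs.append(diff_in_proportions(groups, shuffled))
    return diffs

null_diffs = permutation_null(groups, outcomes)
mean_null = sum(null_diffs) / len(null_diffs)
print(f"{len(null_diffs)} permuted differences, mean = {mean_null:.3f}")
```

The mean of the permuted differences should land close to 0, which is the center expected when \(H_0\) is true.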

CPR Null Distribution

1000 permutations. Centered at 0 (as expected under null). Shaded region shows the smaller tail used for p-value.

Calculating the P-value

Method: Double the smaller tail

For a two-sided test (\(H_A: p_T - p_C \neq 0\)):

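  • Count permutations at or to the LEFT of 0.13 (\(\le 0.13\)): 951
  • Count permutations at or to the RIGHT of 0.13 (\(\ge 0.13\)): 117
  • Permutations exactly equal to 0.13 count in both tails, so the two counts sum to more than 1000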
  • Smaller tail: 117
  • P-value = \(2 \times \frac{117}{1000} = 0.234\)
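The doubling rule can be written as a small Python function. This sketch assumes that permutations tied with the observed value are counted in both tails, which the reported counts suggest because they sum to more than 1000. The function name, the tolerance `tol`, and the cap at 1 are additions for this example, not part of the slides.

```python
def two_sided_p_value(null_diffs, observed, tol=1e-12):
    """Double the smaller tail; ties with `observed` count in both tails.

    `tol` guards against floating-point round-off when checking ties.
    The result is capped at 1 (a safeguard, not part of the slide's method).
    """
    n = len(null_diffs)
    left = sum(d <= observed + tol for d in null_diffs)
    right = sum(d >= observed - tol for d in null_diffs)
    return min(1.0, 2 * min(left, right) / n)

# The slide's arithmetic, using its reported smaller-tail count.
p_from_slide = 2 * 117 / 1000
print(f"{p_from_slide:.3f}")

# Tiny made-up null distribution, only to show the function call.
toy_null = [-0.2, -0.1, 0.0, 0.1, 0.2]
toy_p = two_sided_p_value(toy_null, observed=0.1)
```

In the toy call, 4 values are at or below 0.1 and 2 are at or above it, so the smaller tail is 2 out of 5.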

CPR Conclusion

Results:

  • Observed difference: \(\hat{p}_T - \hat{p}_C = 0.13\)
  • P-value \(\approx 0.23\)
  • Using \(\alpha = 0.05\): Do not reject \(H_0\)

Conclusion: The data do not provide convincing evidence that blood thinners affect survival rate during CPR.

Note: This was a randomized experiment, so we could make causal conclusions if we had found a significant difference.

Why Permutation Works

The permutation test assumes exchangeability under the null:

  • If null is true, outcomes are independent of group
  • Shuffling outcomes simulates this independence
  • The permuted differences show what’s expected by chance alone
  • If observed difference is rare in this distribution \(\rightarrow\) evidence against null

From Testing to Estimation

The permutation test answers: “Is there a difference?”

But we also want to know: “How big is the difference?”

Solution: Bootstrap confidence interval for \(p_1 - p_2\)

  • Gives a range of plausible values for the true difference
  • More informative than just a yes/no answer

Bootstrap Approach

Similar to single proportion bootstrap:

  1. Resample WITH replacement from the original sample
  2. Calculate statistic of interest (now: difference in proportions)
  3. Repeat many times to build the bootstrap distribution
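These three steps can be sketched in Python as follows. The sketch assumes patients are resampled from the whole sample of 90, so group sizes can vary between resamples. Skipping a resample that happens to miss a group entirely is a safeguard added here, and all names and the seed are illustrative.

```python
import random

# Patient-level data rebuilt from the CPR table: (group, survived?) pairs.
patients = ([("treatment", 1)] * 14 + [("treatment", 0)] * 26
            + [("control", 1)] * 11 + [("control", 0)] * 39)

def diff_in_proportions(sample):
    """Survival proportion in treatment minus survival proportion in control."""
    treat = [y for g, y in sample if g == "treatment"]
    ctrl = [y for g, y in sample if g == "control"]
    return sum(treat) / len(treat) - sum(ctrl) / len(ctrl)

def bootstrap_diffs(patients, n_boot=1000, seed=115):
    """Return n_boot bootstrap differences in proportions."""
    rng = random.Random(seed)
    diffs = []
    while len(diffs) < n_boot:
        # Step 1: resample patients WITH replacement, same total size.
        resample = rng.choices(patients, k=len(patients))
        if len({g for g, _ in resample}) < 2:
            continue   # extremely unlikely: a group is missing, so skip
        # Step 2: compute the statistic of interest on the resample.
        diffs.append(diff_in_proportions(resample))
    return diffs       # Step 3: the collected values form the distribution

boot_diffs = bootstrap_diffs(patients)
mean_boot = sum(boot_diffs) / len(boot_diffs)
print(f"bootstrap mean = {mean_boot:.3f}")
```

Unlike the permutation version, each patient keeps their own outcome, so the resampled differences cluster around the observed difference rather than around 0.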

CPR Bootstrap Demo

View interactive bootstrap demo

The demo shows:

  1. Original 90 patients (circles = treatment, squares = control)
  2. Resampling WITH replacement from entire sample
  3. Group sizes vary in each bootstrap sample
  4. Building the bootstrap distribution

CPR Bootstrap Distribution

1000 bootstrap samples. Centered near the observed difference (0.13), NOT at 0

CPR Confidence Interval

95% Bootstrap Percentile CI:

Find the 2.5th and 97.5th percentiles of the bootstrap distribution:

\[(-0.065, 0.311)\]

Interpretation: We are 95% confident that the true difference in survival rates (treatment - control) is between -0.065 and 0.311.
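A percentile interval like this one can be computed with the Python sketch below. It regenerates its own bootstrap distribution with an arbitrary seed, so the endpoints will be close to, but not identical to, the interval reported above. `statistics.quantiles` uses one of several percentile conventions, so other software may give slightly different endpoints.

```python
import random
import statistics

# Patient-level data rebuilt from the CPR table: (group, survived?) pairs.
patients = ([("treatment", 1)] * 14 + [("treatment", 0)] * 26
            + [("control", 1)] * 11 + [("control", 0)] * 39)

def diff_in_proportions(sample):
    """Survival proportion in treatment minus survival proportion in control."""
    treat = [y for g, y in sample if g == "treatment"]
    ctrl = [y for g, y in sample if g == "control"]
    return sum(treat) / len(treat) - sum(ctrl) / len(ctrl)

rng = random.Random(115)   # arbitrary seed, only for reproducibility
boot_diffs = []
while len(boot_diffs) < 1000:
    resample = rng.choices(patients, k=len(patients))
    if len({g for g, _ in resample}) == 2:   # both groups must be present
        boot_diffs.append(diff_in_proportions(resample))

# n=40 gives cut points at 2.5%, 5%, ..., 97.5%, so the first and last
# cut points are the 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(boot_diffs, n=40)
lower, upper = cuts[0], cuts[-1]
print(f"95% percentile CI: ({lower:.3f}, {upper:.3f})")
```

The middle 95% of the bootstrap differences lies between `lower` and `upper`, which is exactly what the percentile method reports.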

CI and Test Agreement

The confidence interval and hypothesis test give consistent results:

For CPR study:

  • 95% CI: (-0.065, 0.311) includes 0
  • P-value \(\approx 0.23 > 0.05\) \(\rightarrow\) Don’t reject \(H_0\)
  • Both results indicate that 0 is a plausible value for \(p_T-p_C\)

Permutation vs Bootstrap

              Permutation Test          Bootstrap CI
Purpose       Hypothesis testing        Estimation
Question      Is there a difference?    How big is the difference?
Assumes       Null is true              Nothing about null
Method        Shuffle response values   Resample with replacement
Centered at   0                         Observed difference
Result        P-value                   Confidence interval
