The cpr data set we explored in Ch. 14, available here

| group | died | survived | total |
|---|---|---|---|
| control | 39 | 11 | 50 |
| treatment | 26 | 14 | 40 |
| total | 65 | 25 | 90 |
Difference in proportions of “survived”: \[\hat{p}_T-\hat{p}_C=\frac{14}{40}-\frac{11}{50}=0.13\]
Null distribution for difference in proportions that survived (treatment - control). Observed difference in proportions indicated by dashed line.
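For a two-sided test, count the number of simulated differences that are as large as or larger than the observed difference, and the number that are as small as or smaller than it.
Double the smaller count and divide by the number of simulations to get the p-value.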
p-value = \(2\times 55/1000 = 0.11\)
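The counting rule above can be sketched in Python. This is a minimal illustration, not the code that produced the slides' figure: the use of NumPy, the seed, and all variable names are assumptions. Because the shuffles are random, and because simulated differences that tie with the observed one may be counted differently, the resulting p-value need not match 0.11.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed (assumption), for reproducibility

# Outcomes taken from the table: 1 = survived, 0 = died
control = np.array([1] * 11 + [0] * 39)
treatment = np.array([1] * 14 + [0] * 26)
observed = treatment.mean() - control.mean()  # 14/40 - 11/50

pooled = np.concatenate([treatment, control])
n_t = len(treatment)
n_sims = 1000

# Shuffle the outcomes, reassign the first 40 to "treatment", and
# record the difference in survival proportions for each shuffle.
null_diffs = np.empty(n_sims)
for i in range(n_sims):
    shuffled = rng.permutation(pooled)
    null_diffs[i] = shuffled[:n_t].mean() - shuffled[n_t:].mean()

# Count simulated differences at or beyond the observed one in each direction
n_upper = int(np.sum(null_diffs >= observed))
n_lower = int(np.sum(null_diffs <= observed))

# Double the smaller count and divide by the number of simulations
p_value = min(1.0, 2 * min(n_upper, n_lower) / n_sims)
print(f"observed difference = {observed:.2f}, two-sided p-value = {p_value:.3f}")
```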
Sampling distribution of \(\hat{p}_1-\hat{p}_2\)
The sampling distribution of \(\hat{p}_1-\hat{p}_2\) based on samples of size \(n_1\) and \(n_2\) and population proportions \(p_1\) and \(p_2\) will be approximately normal with mean \(p_1-p_2\) and standard error \[SE(\hat{p}_1-\hat{p}_2)=\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\]
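if the following technical conditions are met: the observations are independent within and between the two groups, and each group has at least 10 expected successes and 10 expected failures.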
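In the CPR example, using the pooled survival proportion \(25/90\approx 0.278\), we expect about 13.9 survivors and 36.1 deaths in the control group and about 11.1 survivors and 28.9 deaths in the treatment group.
Since there are at least 10 expected successes and failures in each group, a normal approximation of the null distribution is appropriate.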
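The success-failure check can be verified with a few lines of Python. This sketch assumes the expected counts are computed under the null hypothesis with the pooled survival proportion (25 survivors out of 90 patients); the variable names are illustrative.

```python
# Group sizes and overall survival count from the table
n_control, n_treatment = 50, 40
survived_total, n_total = 25, 90

# Pooled proportion of survivors, used because the null hypothesis
# says both groups share one survival rate (assumption stated above)
p_pool = survived_total / n_total

# (expected successes, expected failures) for each group
expected = {
    "control": (n_control * p_pool, n_control * (1 - p_pool)),
    "treatment": (n_treatment * p_pool, n_treatment * (1 - p_pool)),
}

for group, (succ, fail) in expected.items():
    print(f"{group}: expected survived = {succ:.1f}, expected died = {fail:.1f}")

# The condition holds if every expected count is at least 10
condition_met = all(min(pair) >= 10 for pair in expected.values())
print("success-failure condition met:", condition_met)
```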
For the CPR example the Z score is \[Z=\frac{(\hat{p}_T-\hat{p}_C)-0}{SE}=\frac{0.13}{0.095}=1.37\]
Standard normal curve with shaded area corresponding to p-value
Compare this p-value (0.171) to the one we calculated using random permutation (0.11)
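As a check on the arithmetic, the Z score and two-sided p-value can be reproduced with the Python standard library. The sketch assumes the standard error in the test uses the pooled proportion 25/90, which gives the 0.095 shown above; variable names are illustrative.

```python
import math

n_t, n_c = 40, 50
p_hat_t, p_hat_c = 14 / 40, 11 / 50
diff = p_hat_t - p_hat_c

# Pooled proportion under the null hypothesis of no difference (assumption)
p_pool = (14 + 11) / (n_t + n_c)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))

z = (diff - 0) / se

# Two-sided p-value: 2 * (1 - Phi(|z|)), which equals erfc(|z| / sqrt(2))
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"SE = {se:.3f}, Z = {z:.2f}, p-value = {p_value:.3f}")
```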
Let’s compute 1,000 differences in bootstrapped proportions using the CPR data.
The 95% bootstrap percentile confidence interval for the difference in survival rates (treatment - control) is between -0.0416 and 0.330.
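A bootstrap percentile interval of this kind can be sketched as follows. This is an illustration under assumptions, not the slides' original code: it uses NumPy, an arbitrary seed, and resamples each group separately with replacement. The endpoints will differ slightly from those quoted above because the resamples are random.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed (assumption)

# Outcomes from the table: 1 = survived, 0 = died
control = np.array([1] * 11 + [0] * 39)
treatment = np.array([1] * 14 + [0] * 26)
n_boot = 1000

# Resample each group with replacement, keeping the original group sizes;
# each row is one bootstrap sample, and the row mean is its survival proportion.
boot_t = rng.choice(treatment, size=(n_boot, len(treatment)), replace=True).mean(axis=1)
boot_c = rng.choice(control, size=(n_boot, len(control)), replace=True).mean(axis=1)
boot_diffs = boot_t - boot_c

# The 95% percentile interval spans the 2.5th to 97.5th percentiles
lower, upper = np.percentile(boot_diffs, [2.5, 97.5])
print(f"95% bootstrap percentile CI: ({lower:.3f}, {upper:.3f})")
```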
| Type | Interval |
|---|---|
| Bootstrap Percentile | (-0.042, 0.330) |
| Normal Approximation | (-0.057, 0.317) |
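The normal approximation interval in the table can be reproduced from the point estimate and the standard error formula given earlier. This sketch assumes the sample proportions are plugged into the formula (no pooling, since a confidence interval does not assume the null hypothesis) and uses 1.96 as the 95% critical value.

```python
import math

n_t, n_c = 40, 50
p_hat_t, p_hat_c = 14 / 40, 11 / 50
diff = p_hat_t - p_hat_c

# Standard error with sample proportions substituted for p_1 and p_2
se = math.sqrt(p_hat_t * (1 - p_hat_t) / n_t + p_hat_c * (1 - p_hat_c) / n_c)

z_star = 1.96  # approximate 97.5th percentile of the standard normal
lower = diff - z_star * se
upper = diff + z_star * se

print(f"95% normal approximation CI: ({lower:.3f}, {upper:.3f})")
```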