Math 115
So far, we’ve used simulation for inference:
These methods work well, but require many simulations.
Question: Is there a mathematical shortcut?
A sampling distribution shows how a statistic (like \(\hat{p}\)) varies from sample to sample.
In Hypothesis Testing:
For Confidence Intervals:
The distribution is bell-shaped. This pattern appears consistently with large samples.
The bell-shaped curve is called a normal distribution.
Properties:
Notation: N(μ, σ)
Key insight: Mean shifts the curve left/right; SD controls the spread.
The sampling distribution of \(\hat{p}\) is approximately normal when:
Independence: Observations are independent (e.g., from SRS)
Success-Failure Condition: At least 10 expected successes and 10 expected failures
When these conditions are met, the sampling distribution is approximately normal with:
Standard error is the standard deviation of the sampling distribution.
\[SE = \sqrt{\frac{p(1-p)}{n}}\]
Important distinction:
This is the spread we were measuring with our simulated distributions.
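The standard error formula can be sketched in a few lines of Python. This is only an illustration: the function name `standard_error` and the sample sizes in the loop are assumptions chosen for the example, not values from the course materials.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes (assumed): quadrupling n halves the SE.
for n in (100, 400, 1600):
    print(n, round(standard_error(0.5, n), 4))
```

Because \(n\) sits under a square root, cutting the standard error in half takes four times as many observations.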
The Z-score tells us how many standard errors an observation is from the mean.
\[Z = \frac{\text{observed} - \text{mean}}{SE}\]
For a proportion:
\[Z = \frac{\hat{p} - p_0}{SE}\]
Converting to Z-scores standardizes any normal distribution to N(0, 1).
Advantage: We only need one table or tool for all normal calculations.
Key idea: Probabilities correspond to areas under the normal curve
From the 68-95-99.7 rule: about 95% of the area falls between Z = -2 and Z = 2, so about 5% is in the two tails. By symmetry, about 2.5% is above Z = 2.
Neal’s z-score is \[Z_{Neal} = \frac{1470-1050}{210}=2\] Sean’s z-score is \[Z_{Sean} = \frac{28-21}{5}=1.4\]
Question: Using Neal’s result, what proportion of test-takers score above 1470?
Solution:
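One way to check this numerically is with Python's built-in `statistics.NormalDist`. The sketch below recomputes both Z-scores and the area above Neal's score; the helper name `z_score` is an assumption made for illustration. The exact area is slightly smaller than the 2.5% suggested by the 68-95-99.7 rule, because that rule rounds.

```python
from statistics import NormalDist

def z_score(observed: float, mean: float, sd: float) -> float:
    """Number of standard deviations an observation is from the mean."""
    return (observed - mean) / sd

z_neal = z_score(1470, 1050, 210)  # 2.0
z_sean = z_score(28, 21, 5)        # 1.4

# Area to the right of Neal's Z-score on the standard normal curve.
standard_normal = NormalDist(0, 1)
prop_above_neal = 1 - standard_normal.cdf(z_neal)
print(round(prop_above_neal, 4))  # about 0.0228
```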
Heights of US adult females are approximately normal with μ = 64 inches, σ = 3 inches.
Question: A woman is 57 inches tall. Is this unusual?
Solution:
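A possible way to work through this in Python is shown below. It assumes the common rule of thumb that an observation more than 2 standard deviations from the mean counts as unusual; that cutoff is an assumption of this sketch, not a fixed rule.

```python
from statistics import NormalDist

heights = NormalDist(mu=64, sigma=3)  # US adult female heights, in inches

# Z-score for a woman who is 57 inches tall.
z = (57 - heights.mean) / heights.stdev

# Proportion of women at or below 57 inches (area to the left).
prop_below = heights.cdf(57)

# Assumed rule of thumb: more than 2 SDs from the mean is unusual.
is_unusual = abs(z) > 2

print(f"Z = {z:.2f}, unusual: {is_unusual}")
```

Here Z is about -2.33 and roughly 1% of women are this short or shorter, so under the assumed cutoff a height of 57 inches is unusual.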
To use the normal model for hypothesis testing, check:
Scenario: Researchers surveyed a random sample of 830 payday loan borrowers. 453 (54.6%) said they support additional regulation.
Research question: Is there evidence that the majority support regulation?
Hypotheses: \(H_0: p = 0.5\) vs \(H_A: p > 0.5\)
Independence: Random sample of borrowers ✓
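Success-Failure Condition: \(np_0 = 830 \times 0.5 = 415\) expected successes and \(n(1 - p_0) = 830 \times 0.5 = 415\) expected failures.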
Both are ≥ 10, so conditions are met. We can use the normal model.
Standard Error: \(SE = \sqrt{\frac{0.5 \times 0.5}{830}} = 0.0174\)
Z-score: \(Z = \frac{0.546 - 0.5}{0.0174} = 2.64\)
P-value: Area to the right of Z = 2.64 on standard normal
\[\text{P-value} = 0.004\]
Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate the p-value.
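Conclusion: P-value = 0.004 < 0.05, so reject \(H_0\). There is convincing evidence that a majority of payday loan borrowers support additional regulation.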
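The payday calculation can be reproduced with a short Python sketch. The function name `one_proportion_z_test` and its `alternative` parameter are assumptions made for illustration; this is not the Jamovi calculator, only a hedged stand-in that follows the same steps (SE under \(H_0\), Z-score, tail area).

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Model-based test for one proportion; SE uses the null value p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)             # right tail
    elif alternative == "less":
        p_value = std_normal.cdf(z)                 # left tail
    else:
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    return z, p_value

z, p_value = one_proportion_z_test(453, 830, 0.5)
print(f"Z = {z:.2f}, p-value = {p_value:.3f}")  # Z = 2.64, p-value = 0.004
```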
Can we actually generalize?
We can generalize these findings to all payday borrowers in MI
Scenario: A medical consultant had 3 complications in 62 surgeries.
Test: \(H_0: p = 0.1\) vs \(H_A: p < 0.1\)
Check success-failure condition:
\(np_0 = 62 \times 0.1 = 6.2 < 10\), so the condition is NOT met.
Solution: Use simulation (randomization) instead of the normal model.
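The success-failure check is easy to automate. The sketch below uses an assumed helper name, `success_failure_ok`, and applies it to the two examples above: the payday study (condition met) and the medical consultant (condition not met).

```python
def success_failure_ok(n: int, p: float, threshold: float = 10) -> bool:
    """True when both expected successes and expected failures reach the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

print(success_failure_ok(830, 0.5))  # True: 415 and 415
print(success_failure_ok(62, 0.1))   # False: only 6.2 expected successes
```

For a hypothesis test, pass the null value \(p_0\) as `p`.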
To use the normal model for a confidence interval, check:
Independence: Observations are independent
Success-Failure Condition: Using \(\hat{p}\) (observed proportion)
Key difference from the hypothesis test: Use \(\hat{p}\), not \(p_0\), because a confidence interval has no null value to assume.
When conditions are met:
\[\text{CI} = \hat{p} \pm z^* \times SE\]
Where:
| Confidence Level | \(z^*\) |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate these multipliers/critical values.
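As an alternative to a calculator, the multipliers in the table can be recovered from the standard normal distribution. This sketch assumes a hypothetical helper named `z_star`; it looks up the Z value that leaves half of the leftover area in each tail.

```python
from statistics import NormalDist

def z_star(confidence_level: float) -> float:
    """Critical value that leaves (1 - confidence_level)/2 in each tail."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_star(level):.3f}")
# 90%: 1.645
# 95%: 1.960
# 99%: 2.576
```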
Payday study: \(\hat{p} = 0.546\), \(n = 830\)
Check conditions:
Standard Error: \(SE = \sqrt{\frac{0.546 \times 0.454}{830}} = 0.0173\)
95% CI: \(0.546 \pm 1.96 \times 0.0173 = (0.512, 0.58)\)
95% Confidence Interval: (0.512, 0.58)
Interpretation: We are 95% confident that the true proportion of payday borrowers who support additional regulation is between 51.2% and 58.0%.
Note: The interval does not include 0.5, which is consistent with our hypothesis test result (we rejected \(H_0: p = 0.5\)).
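The payday interval can be recomputed with a small Python sketch. The function name `proportion_ci` is an assumption for illustration; note that the standard error here uses \(\hat{p}\), unlike the hypothesis test, which uses \(p_0\).

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat: float, n: int, confidence_level: float = 0.95):
    """Model-based confidence interval: p_hat +/- z* x SE, with SE from p_hat."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * se
    return p_hat - margin, p_hat + margin

lower, upper = proportion_ci(0.546, 830)
print(f"({lower:.3f}, {upper:.3f})")  # (0.512, 0.580)

# The null value 0.5 falls outside the interval.
print(lower <= 0.5 <= upper)  # False
```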
Scenario: Researchers surveyed a sample of 500 teens in a small town about texting while driving. 180 admitted to texting while driving at least once.
Null hypothesis assumption: About 39% of teens nationwide have reported texting while driving, so \(p_0 = 0.39\).
Test: Is there evidence that the proportion of all teens in this town who are texting while driving is less than the national rate?
Data: \(\hat{p} = 180/500 = 0.36\)
Check conditions:
Compute:
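A minimal sketch of these steps in Python follows, assuming a left-tailed test of \(H_0: p = 0.39\) vs \(H_A: p < 0.39\), as the research question implies. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, successes, p0 = 500, 180, 0.39
p_hat = successes / n  # 0.36

# Success-failure condition uses the null value p0.
expected_successes = n * p0        # 195
expected_failures = n * (1 - p0)   # 305

# Standard error under H0, then the Z-score.
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Left-tailed test: area to the left of Z on the standard normal curve.
p_value = NormalDist().cdf(z)

print(f"SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.3f}")
```

The Z-score is about -1.38 and the p-value comes out a little above 0.08, which is greater than 0.05.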
Visualizing the P-value
Statistical conclusion: P-value > 0.05, so we fail to reject \(H_0\) at \(\alpha = 0.05\). There is no convincing evidence that fewer than 39% of teens in this town text while driving.
But wait… can we generalize?
The researchers do not indicate that it was a random sample.
We cannot assume this is a random sample of all teens in this town.
We should not generalize these findings to all teens in the town.
Teen study: \(\hat{p} = 0.36\), \(n = 500\)
Check conditions:
Standard Error: \(SE = \sqrt{\frac{0.36 \times 0.64}{500}} = 0.0215\)
95% CI: \(0.36 \pm 1.96 \times 0.0215 = (0.318, 0.402)\)
95% Confidence Interval: (0.318, 0.402)
Interpretation: We are 95% confident that the true proportion of teens who are texting while driving is between 31.8% and 40.2%.
Note: The interval includes 0.39, which is consistent with our hypothesis test result (we failed to reject \(H_0: p = 0.39\)).
| Aspect | Simulation | Model-Based |
|---|---|---|
| P-value | Count extreme simulations | Area under normal curve |
| CI | Bootstrap percentiles | \(\hat{p} \pm z^* \times SE\) |
| When to use | Can always be used | Only when the independence and success-failure conditions are met |
Key point: Both methods give similar results when conditions are met.
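To illustrate the comparison, the sketch below computes a left-tailed p-value for the teen texting data both ways. It is a rough stand-in for a randomization tool, not the course software: the function names, the 2,000 repetitions, and the seed are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

def simulated_p_value(successes, n, p0, reps=2000, seed=115):
    """Share of simulated samples (drawn under H0) with a count at or below the observed count."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        sim_successes = sum(rng.random() < p0 for _ in range(n))
        if sim_successes <= successes:
            extreme += 1
    return extreme / reps

def model_p_value(successes, n, p0):
    """Left-tail area under the normal model, with SE computed from p0."""
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (successes / n - p0) / se
    return NormalDist().cdf(z)

sim_p = simulated_p_value(180, 500, 0.39)
model_p = model_p_value(180, 500, 0.39)
print(f"simulation: {sim_p:.3f}, model: {model_p:.3f}")
```

The two values should land close to each other, though not identically: the simulation carries random variation and counts whole successes, while the normal curve is continuous.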
Key insight: Normal approximation works well when conditions are met (left), but fails when conditions are NOT met (right). This is why checking technical conditions matters!