topic6

Inference with Mathematical Models

One Proportion

Math 115

Where We’ve Been

So far, we’ve used simulation for inference:

Hypothesis testing: Simulated null distribution using parametric bootstrap
Confidence intervals: Simulated bootstrap distribution by resampling

These methods work well, but require many simulations.

Question: Is there a mathematical shortcut?

Reminder: What is a Sampling Distribution?

A sampling distribution shows how a statistic (like \(\hat{p}\)) varies from sample to sample.

In Hypothesis Testing:

Created null distribution by simulating many samples assuming \(H_0\) is true
“If \(H_0\) were true, how much would \(\hat{p}\) vary?”

For Confidence Intervals:

Created bootstrap distribution by resampling our data
“How much does \(\hat{p}\) vary due to sampling?”

The Shape of Sampling Distributions

The distribution is bell-shaped. This pattern appears consistently with large samples.

The Normal Distribution

The bell-shaped curve is called a normal distribution.

Properties:

Symmetric and unimodal
Described by two parameters:
- Mean (μ): center of the distribution
- Standard deviation (σ): spread of the distribution

Notation: N(μ, σ)

Normal Distributions with Different Parameters

Key insight: Mean shifts the curve left/right; SD controls the spread.

Central Limit Theorem for Proportions

The sampling distribution of \(\hat{p}\) is approximately normal when:

Independence: Observations are independent (e.g., from SRS)
Success-Failure Condition: At least 10 expected successes and 10 expected failures

When these conditions are met, the sampling distribution is approximately normal with:

Mean (\(\mu\)) = \(p\) (the true population proportion)
Standard deviation (\(\sigma\)) = \(\sqrt{\frac{p(1-p)}{n}}\)
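The claim above can be checked empirically. The following is a minimal sketch, assuming a hypothetical population proportion of 0.3 and samples of size 100 (neither value comes from the slides); the helper name `simulate_p_hats`, the seed, and the number of repetitions are also illustrative choices.

```python
import math
import random
import statistics

def simulate_p_hats(p, n, reps, seed=115):
    """Draw `reps` samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

p, n = 0.3, 100  # hypothetical values; success-failure holds (30 and 70 expected)
p_hats = simulate_p_hats(p, n, reps=5000)

sim_mean = statistics.mean(p_hats)
sim_sd = statistics.stdev(p_hats)
clt_sd = math.sqrt(p * (1 - p) / n)

print("simulated mean of p-hat:", round(sim_mean, 4), "| CLT mean:", p)
print("simulated SD of p-hat:  ", round(sim_sd, 4), "| CLT SD:  ", round(clt_sd, 4))
```

The simulated mean and SD should land close to \(p\) and \(\sqrt{p(1-p)/n}\), which is what the theorem predicts when both conditions hold.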

Standard Error (SE)

Standard error is the standard deviation of the sampling distribution.

\[SE = \sqrt{\frac{p(1-p)}{n}}\]

Important distinction:

For hypothesis testing: Use \(p_0\) (the null hypothesis value)
For confidence intervals: Use \(\hat{p}\) (the observed proportion)

This is the spread we were measuring with our simulated distributions.
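As a small illustration of the distinction, the sketch below evaluates the same formula with the two different plug-in values. It borrows the numbers from the payday loan example that appears later in these slides (n = 830, \(p_0 = 0.5\), \(\hat{p} = 0.546\)); the function name `standard_error` is an illustrative choice.

```python
import math

def standard_error(p, n):
    """SE of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

n = 830
se_test = standard_error(0.5, n)    # hypothesis test: plug in the null value p0
se_ci = standard_error(0.546, n)    # confidence interval: plug in the observed p-hat

print("SE for the test (uses p0):  ", round(se_test, 4))
print("SE for the CI (uses p-hat): ", round(se_ci, 4))
```

The two values differ only slightly here because \(\hat{p}\) is close to \(p_0\); the gap grows as the observed proportion moves away from the null value.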

Z-Scores

The Z-score tells us how many standard errors an observation is from the mean.

\[Z = \frac{\text{observed} - \text{mean}}{SE}\]

For a proportion:

\[Z = \frac{\hat{p} - p_0}{SE}\]

Z-Scores: Interpretation

Z = 2 means the observation is 2 SEs above the mean
Z = -1.5 means the observation is 1.5 SEs below the mean

The Standard Normal Distribution

Converting to Z-scores standardizes any normal distribution to N(0, 1).

Advantage: We only need one table or tool for all normal calculations.

Using the Normal Distribution to Calculate Probabilities

Key idea: Probabilities correspond to areas under the normal curve

From 68-95-99.7 rule: 95% falls between Z = -2 and Z = 2, so 5% is in the two tails. By symmetry, 2.5% is above Z = 2.

Comparing z-scores

SAT scores follow a nearly normal distribution with a mean of 1050 points and a standard deviation of 210 points.
ACT scores also follow a nearly normal distribution with a mean of 21 points and a standard deviation of 5 points.
Suppose Neal scored 1470 points on his SAT and Sean scored 28 points on his ACT.
Who performed better?

Neal’s z-score is \[Z_{Neal} = \frac{1470-1050}{210}=2\]
Sean’s z-score is \[Z_{Sean} = \frac{28-21}{5}=1.4\]

Comparing z-scores

So Neal performed better: his z-score is higher, so the probability of getting a score as high as Neal’s is smaller than the probability of getting a score as high as Sean’s.
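The comparison can be reproduced with a few lines of Python. This is a sketch using the values from the z-score computation above (SAT: mean 1050, SD 210; ACT: mean 21, SD 5; scores 1470 and 28); the helper `z_score` is an illustrative name, and the tail areas come from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_score(observed, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observed - mean) / sd

z_neal = z_score(1470, 1050, 210)  # SAT
z_sean = z_score(28, 21, 5)        # ACT

std_normal = NormalDist()  # N(0, 1)
tail_neal = 1 - std_normal.cdf(z_neal)  # chance of scoring at least as high as Neal
tail_sean = 1 - std_normal.cdf(z_sean)  # chance of scoring at least as high as Sean

print("Neal: Z =", z_neal, " upper tail =", round(tail_neal, 4))
print("Sean: Z =", z_sean, " upper tail =", round(tail_sean, 4))
```

Putting both scores on the Z scale is what makes the two tests comparable, even though their raw units differ.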

68-95-99.7 Rule

68% of data falls within 1 SD of mean
95% falls within 2 SD (1.96 to be precise)
99.7% falls within 3 SD
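The rule can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

def area_within(k):
    """Area under the standard normal curve between Z = -k and Z = k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")

# 1.96 is the more precise cutoff for the middle 95%
print(f"within 1.96 SD: {area_within(1.96):.4f}")
```

The areas for 1, 2, and 3 SD round to 68%, 95%, and 99.7%, and the 1.96 cutoff gives the middle 95% more precisely than 2 does.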

Estimate probabilities with 68-95-99.7 Rule

We can quickly approximate the probabilities in the tails of the normal distribution using the 68-95-99.7 rule.

Question: Using Neal’s result, what proportion of test-takers score above 1470?

Solution:

Z = 2 means 1470 is 2 standard deviations above the mean
From 68-95-99.7 rule: 95% fall between Z = -2 and Z = 2
So 5% are outside this range; by symmetry, 2.5% score above 1470

Example: Adult Female Heights

Heights of US adult females are approximately normal with μ = 64 inches, σ = 3 inches.

Question: A woman is 57 inches tall. Is this unusual?

Solution:

Z-score: \(Z = \frac{57 - 64}{3} = \frac{-7}{3} \approx -2.3\)
She is more than 2 standard deviations below the mean
From 68-95-99.7 rule: only 5% of women fall outside ±2 SD, so by symmetry 2.5% are below Z = -2
Fewer than 2.5% are shorter than 57 inches (Z ≈ -2.3), so this is unusual
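The rule only gives a bound here. As a rough check, the exact area can be computed with `statistics.NormalDist`, using the mean and standard deviation stated in the example; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=64, sigma=3)  # model for US adult female heights (inches)

z = (57 - 64) / 3                 # z-score for a height of 57 inches
prob_shorter = heights.cdf(57)    # area to the left of 57 inches

print("Z =", round(z, 2))
print("P(height < 57) =", round(prob_shorter, 4))
```

The exact area is about 1%, comfortably under the 2.5% bound obtained from the 68-95-99.7 rule.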

Technical Conditions: Hypothesis Testing

To use the normal model for hypothesis testing, check:

Independence: Observations are independent
- e.g., from simple random sample
Success-Failure Condition: Using \(p_0\) (the null value)
- \(n \times p_0 \geq 10\) (expected successes)
- \(n \times (1 - p_0) \geq 10\) (expected failures)
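The success-failure check is simple arithmetic and can be sketched as a small helper. The function name `success_failure_ok` and its `threshold` parameter are illustrative; independence cannot be checked in code and still has to be judged from how the data were collected.

```python
def success_failure_ok(n, p, threshold=10):
    """True when both expected successes (n*p) and expected failures (n*(1-p)) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Sample sizes and null values taken from examples in these slides
print(success_failure_ok(830, 0.5))   # 415 and 415 expected
print(success_failure_ok(62, 0.1))    # only 6.2 expected successes
```

For a hypothesis test, pass the null value \(p_0\) as `p`, as the slide specifies.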

Example: Payday Loan Study

Scenario: Researchers surveyed a random sample of 830 payday loan borrowers. 453 (54.6%) said they support additional regulation.

Research question: Is there evidence that the majority support regulation?

Hypotheses:

\(H_0: p = 0.5\) (no majority)
\(H_A: p > 0.5\) (majority supports)

Checking Conditions

Independence: Random sample of borrowers ✓

Success-Failure Condition:

Expected successes: \(n \times p_0 = 830 \times 0.5 = 415\) ✓
Expected failures: \(n \times (1-p_0) = 830 \times 0.5 = 415\) ✓

Both are ≥ 10, so conditions are met. We can use the normal model.

Computing Z and P-value

Standard Error: \(SE = \sqrt{\frac{0.5 \times 0.5}{830}} = 0.0174\)

Z-score: \(Z = \frac{0.546 - 0.5}{0.0174} = 2.64\)

P-value: Area to the right of Z = 2.64 on standard normal

\[\text{P-value} = 0.004\]

Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate the p-value.
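For readers without Jamovi at hand, the same calculation can be sketched in Python with the standard library. The function `one_prop_z_test` and its `alternative` argument are illustrative names, not part of any course tool. The code uses the unrounded \(\hat{p} = 453/830\), so intermediate values may differ from the slide in the third decimal place.

```python
import math
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative):
    """One-proportion z-test; returns (z, p_value). SE uses the null value p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)   # area to the right of z
    elif alternative == "less":
        p_value = std_normal.cdf(z)       # area to the left of z
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return z, p_value

z, p_value = one_prop_z_test(453, 830, 0.5, "greater")
print("Z =", round(z, 2), " P-value =", round(p_value, 3))
```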

Visualizing the P-value

Conclusion: P-value = 0.004 < 0.05, so reject \(H_0\).

Conclusion

So we reject the null hypothesis: there is convincing evidence that a majority of all MI payday borrowers support the regulation.

Can we actually generalize?

The researchers stated that it was a random sample of payday borrowers in MI.

We can generalize these findings to all payday borrowers in MI

When Conditions Are NOT Met

Scenario: A medical consultant had 3 complications in 62 surgeries.

Test: \(H_0: p = 0.1\) vs \(H_A: p < 0.1\)

Check success-failure condition:

Expected successes: \(62 \times 0.1 = 6.2\) ✗

6.2 < 10, so the condition is NOT met.

Solution: Use simulation (randomization) instead of the normal model.
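A minimal sketch of such a simulation follows. It assumes the null value \(p_0 = 0.1\) and counts how often 62 simulated surgeries produce 3 or fewer complications; the function name, seed, and number of repetitions are illustrative choices.

```python
import random

def simulated_p_value(n, p0, observed, reps=10_000, seed=115):
    """Share of simulated samples (under H0) with a count at or below the observed count."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(reps):
        count = sum(rng.random() < p0 for _ in range(n))  # complications in one simulated sample
        if count <= observed:
            hits += 1
    return hits / reps

p_value = simulated_p_value(n=62, p0=0.1, observed=3)
print("Simulated P-value:", p_value)
```

Because the alternative is \(p < 0.1\), "as or more extreme" means a count of 3 or fewer, which is why the comparison uses `<=`.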

Technical Conditions: Confidence Intervals

To use the normal model for a confidence interval, check:

Independence: Observations are independent
Success-Failure Condition: Using \(\hat{p}\) (observed proportion)
- \(n \times \hat{p} \geq 10\) (observed successes)
- \(n \times (1 - \hat{p}) \geq 10\) (observed failures)
- Equivalently: at least 10 successes AND 10 failures observed

Key difference from hypothesis testing: Use \(\hat{p}\), not \(p_0\)

CI Formula Using Normal Model

When conditions are met:

\[\text{CI} = \hat{p} \pm z^* \times SE\]

Where:

\(\hat{p}\) = observed proportion
\(z^*\) = critical value (multiplier) for confidence level
\(SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Multipliers for Different Confidence Levels

Confidence Level	\(z^*\)
90%	1.645
95%	1.960
99%	2.576

Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate these multipliers/critical values.
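The multipliers in the table can also be recovered from the standard normal distribution directly. This sketch uses `NormalDist.inv_cdf` from the Python standard library; the helper name `critical_value` is illustrative.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the middle `confidence` share of N(0, 1) lies between -z* and z*."""
    upper_tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
```

Each multiplier is the Z value that leaves half of the leftover area in the upper tail, so a 95% level corresponds to the 97.5th percentile.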

Computing a 95% CI

Payday study: \(\hat{p} = 0.546\), \(n = 830\)

Check conditions:

Observed successes: 453 ≥ 10 ✓
Observed failures: 377 ≥ 10 ✓

Standard Error: \(SE = \sqrt{\frac{0.546 \times 0.454}{830}} = 0.0173\)

95% CI: \(0.546 \pm 1.96 \times 0.0173 = (0.512, 0.58)\)
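The interval above can be reproduced with a short function. This is a sketch, assuming the unrounded \(\hat{p} = 453/830\); the name `one_prop_ci` is illustrative.

```python
import math
from statistics import NormalDist

def one_prop_ci(successes, n, confidence=0.95):
    """Normal-model CI for a proportion; SE uses the observed p-hat."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return p_hat - margin, p_hat + margin

low, high = one_prop_ci(453, 830)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Unlike the hypothesis test, nothing here depends on \(p_0\): both the center and the standard error come from the observed data.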

Interpreting the CI

95% Confidence Interval: (0.512, 0.58)

Interpretation: We are 95% confident that the true proportion of payday borrowers who support the regulation is between 51.2% and 58%.

Note: The interval does not include 0.5, which is consistent with our hypothesis test result (we rejected \(H_0: p = 0.5\)).

Full Example: Teen Driving Study

Scenario: Researchers surveyed a sample of 500 teens in a small town about texting while driving. 180 admitted to texting while driving at least once.

Null value: It has been reported that about 39% of teens nationwide text while driving.

Test: Is there evidence that the proportion of all teens in this town who text while driving is less than the national rate?

\(H_0: p = 0.39\)
\(H_A: p < 0.39\)

Teen Driving: Analysis

Data: \(\hat{p} = 180/500 = 0.36\)

Check conditions:

Independence: Responses are independent ✓
Success-Failure: \(500 \times 0.39 = 195\) ≥ 10 and \(500 \times 0.61 = 305\) ≥ 10 ✓

Compute:

\(SE = \sqrt{\frac{0.39 \times 0.61}{500}} = 0.0218\)
\(Z = \frac{0.36 - 0.39}{0.0218} = -1.38\)
P-value = 0.08451 (area to left of Z)
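The reported p-value of 0.08451 corresponds to the unrounded Z; looking up the rounded value Z = -1.38 gives a slightly different area. The sketch below shows both, using `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p0 = 500, 0.39
p_hat = 180 / n

se = math.sqrt(p0 * (1 - p0) / n)   # SE uses the null value p0
z = (p_hat - p0) / se               # unrounded z-score

std_normal = NormalDist()
p_unrounded = std_normal.cdf(z)      # left-tail area at the unrounded z
p_rounded = std_normal.cdf(-1.38)    # left-tail area at the rounded z

print("Z (unrounded):", round(z, 4))
print("P-value from unrounded Z:", round(p_unrounded, 4))
print("P-value from Z = -1.38:  ", round(p_rounded, 4))
```

Either way the p-value is above 0.05, so the conclusion does not depend on the rounding.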

Visualizing the P-value

Conclusion: P-value > 0.05, so we fail to reject \(H_0\) at \(\alpha = 0.05\).

Teen Driving: Conclusions

Statistical conclusion: P-value > 0.05, so we fail to reject \(H_0\). There is no convincing evidence that fewer than 39% of teens in this town text while driving.

But wait… can we generalize?

The researchers do not indicate that it was a random sample.
We cannot treat it as a random sample of all teens in this town.

We cannot generalize these findings to all teens in this town.

Computing a 95% CI

Teen study: \(\hat{p} = 0.36\), \(n = 500\)

Check conditions:

Observed successes: 180 ≥ 10 ✓
Observed failures: 320 ≥ 10 ✓

Standard Error: \(SE = \sqrt{\frac{0.36 \times 0.64}{500}} = 0.0215\)

95% CI: \(0.36 \pm 1.96 \times 0.0215 = (0.318, 0.402)\)

Interpreting the CI

95% Confidence Interval: (0.318, 0.402)

Interpretation: We are 95% confident that the true proportion of teens who are texting while driving is between 31.8% and 40.2%.

Note: The interval includes 0.39, which is consistent with our hypothesis test result (we failed to reject \(H_0: p = 0.39\)).

Model-Based vs Simulation-Based Methods

Aspect	Simulation	Model-Based
P-value	Count extreme simulations	Area under normal curve
CI	Bootstrap percentiles	\(\hat{p} \pm z^* \times SE\)
When to use	Can always be used	Success-failure condition must be met

Key point: Both methods give similar results when conditions are met.
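One way to see this is to compute a p-value both ways for the same data. The sketch below does so for the teen driving example (n = 500, \(p_0 = 0.39\), 180 observed successes), where the success-failure condition holds. The seed, the number of repetitions, and the function names are illustrative choices.

```python
import math
import random
from statistics import NormalDist

def simulation_p_value(n, p0, observed, reps=5000, seed=115):
    """Left-tail p-value: share of samples simulated under H0 with count <= observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        count = sum(rng.random() < p0 for _ in range(n))
        if count <= observed:
            hits += 1
    return hits / reps

def model_p_value(n, p0, observed):
    """Left-tail p-value from the normal model, with SE based on p0."""
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (observed / n - p0) / se
    return NormalDist().cdf(z)

p_sim = simulation_p_value(500, 0.39, 180)
p_model = model_p_value(500, 0.39, 180)
print("Simulation P-value: ", round(p_sim, 3))
print("Model-based P-value:", round(p_model, 3))
```

The two values will not match exactly, since the simulation counts discrete outcomes and varies from run to run, but they should lead to the same conclusion.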

Visualizing When Normal Approximation Works

Key insight: Normal approximation works well when conditions are met (left), but fails when conditions are NOT met (right). This is why checking technical conditions matters!

References

Introduction to Modern Statistics (2e) textbook by Mine Çetinkaya-Rundel and Johanna Hardin
Chapter 13, Section 16.2