Inference with Mathematical Models

One Proportion

Math 115

Where We’ve Been

So far, we’ve used simulation for inference:

  • Hypothesis testing: Simulated null distribution using parametric bootstrap
  • Confidence intervals: Simulated bootstrap distribution by resampling

These methods work well, but require many simulations.

Question: Is there a mathematical shortcut?

Reminder: What is a Sampling Distribution?

A sampling distribution shows how a statistic (like \(\hat{p}\)) varies from sample to sample.

In Hypothesis Testing:

  • Created null distribution by simulating many samples assuming \(H_0\) is true
  • “If \(H_0\) were true, how much would \(\hat{p}\) vary?”

For Confidence Intervals:

  • Created bootstrap distribution by resampling our data
  • “How much does \(\hat{p}\) vary due to sampling?”

The Shape of Sampling Distributions

The distribution is bell-shaped. This pattern appears consistently with large samples.

The Normal Distribution

The bell-shaped curve is called a normal distribution.

Properties:

  • Symmetric and unimodal
  • Described by two parameters:
    • Mean (μ): center of the distribution
    • Standard deviation (σ): spread of the distribution

Notation: N(μ, σ)

Normal Distributions with Different Parameters

Key insight: Mean shifts the curve left/right; SD controls the spread.

Central Limit Theorem for Proportions

The sampling distribution of \(\hat{p}\) is approximately normal when:

  1. Independence: Observations are independent (e.g., from SRS)

  2. Success-Failure Condition: At least 10 expected successes and 10 expected failures

When these conditions are met, the sampling distribution is approximately normal with:

  • Mean (\(\mu\)) = \(p\) (the true population proportion)
  • Standard deviation (\(\sigma\)) = \(\sqrt{\frac{p(1-p)}{n}}\)
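
The claim above can be checked with a quick simulation. The sketch below is only illustrative: the values of `p`, `n`, and `reps` are hypothetical choices (not from any study in these slides), picked so that the success-failure condition holds.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(115)  # fixed seed so the run is reproducible

p = 0.3      # hypothetical true proportion (illustrative only)
n = 200      # hypothetical sample size; n*p = 60 and n*(1-p) = 140, both >= 10
reps = 5000  # number of simulated samples

# Draw many samples and record the sample proportion from each one.
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

sim_mean = mean(p_hats)             # should be close to p
sim_sd = stdev(p_hats)              # should be close to sqrt(p(1-p)/n)
theory_sd = sqrt(p * (1 - p) / n)

print(f"simulated mean of p-hat: {sim_mean:.4f} (theory: {p})")
print(f"simulated SD of p-hat:   {sim_sd:.4f} (theory: {theory_sd:.4f})")
```

If the two conditions hold, the simulated mean and SD should land very close to the theoretical values.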

Standard Error (SE)

Standard error is the standard deviation of the sampling distribution.

\[SE = \sqrt{\frac{p(1-p)}{n}}\]

Important distinction:

  • For hypothesis testing: Use \(p_0\) (the null hypothesis value)
  • For confidence intervals: Use \(\hat{p}\) (the observed proportion)

This is the spread we were measuring with our simulated distributions.

Z-Scores

The Z-score tells us how many standard errors an observation is from the mean.

\[Z = \frac{\text{observed} - \text{mean}}{SE}\]

For a proportion:

\[Z = \frac{\hat{p} - p_0}{SE}\]

Z-Scores: Interpretation

  • Z = 2 means the observation is 2 SEs above the mean
  • Z = -1.5 means the observation is 1.5 SEs below the mean

The Standard Normal Distribution

Converting to Z-scores standardizes any normal distribution to N(0, 1).

Advantage: We only need one table or tool for all normal calculations.

Using the Normal Distribution to Calculate Probabilities

Key idea: Probabilities correspond to areas under the normal curve

From 68-95-99.7 rule: 95% falls between Z = -2 and Z = 2, so 5% is in the two tails. By symmetry, 2.5% is above Z = 2.

Comparing z-scores

  • SAT scores follow a nearly normal distribution with a mean of 1050 points and a standard deviation of 210 points.
  • ACT scores also follow a nearly normal distribution with a mean of 21 points and a standard deviation of 5 points.
  • Suppose Neal scored 1470 points on his SAT and Sean scored 28 points on his ACT.
  • Who performed better?

Neal’s z-score is

\[Z_{Neal} = \frac{1470-1050}{210}=2\]

Sean’s z-score is

\[Z_{Sean} = \frac{28-21}{5}=1.4\]

Comparing z-scores

  • Neal performed better: his score is 2 standard deviations above the mean while Sean’s is only 1.4, so a score at least as high as Neal’s is less likely than a score at least as high as Sean’s.
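
As a rough check of this comparison, the sketch below recomputes both z-scores from the numbers in the z-score calculations above and looks up the upper-tail areas. It assumes Python's standard-library `statistics.NormalDist`; the helper name `z_score` is made up for illustration.

```python
from statistics import NormalDist

def z_score(observed, mean, sd):
    """Number of standard deviations `observed` lies from `mean`."""
    return (observed - mean) / sd

# Values taken from the two z-score calculations above.
z_neal = z_score(1470, 1050, 210)
z_sean = z_score(28, 21, 5)

std_normal = NormalDist()  # standard normal, N(0, 1)

# Area to the right of each z-score: proportion scoring at least as high.
upper_neal = 1 - std_normal.cdf(z_neal)
upper_sean = 1 - std_normal.cdf(z_sean)

print(f"Neal: Z = {z_neal:.2f}, upper tail = {upper_neal:.4f}")
print(f"Sean: Z = {z_sean:.2f}, upper tail = {upper_sean:.4f}")
```

The exact upper-tail areas are about 2.3% for Neal and about 8.1% for Sean, so the smaller tail belongs to Neal.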

68-95-99.7 Rule

  • 68% of data falls within 1 SD of mean
  • 95% falls within 2 SD (1.96 to be precise)
  • 99.7% falls within 3 SD
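
The three percentages can be reproduced from the standard normal curve. This is a minimal sketch using the standard-library `statistics.NormalDist`; the function name `central_area` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def central_area(k):
    """Area under N(0, 1) between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3, 1.96):
    print(f"within {k} SD: {central_area(k):.4f}")
```

The areas within 1, 2, and 3 SD come out to roughly 0.6827, 0.9545, and 0.9973, and the area within 1.96 SD is 0.9500, which is why 1.96 is the more precise multiplier for 95%.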

Estimate probabilities with 68-95-99.7 Rule

  • We can quickly approximate the tail probabilities of the normal distribution using the 68-95-99.7 Rule

Question: Using Neal’s result, what proportion of test-takers score above 1470?

Solution:

  • Z = 2 means 1470 is 2 standard deviations above the mean
  • From 68-95-99.7 rule: 95% fall between Z = -2 and Z = 2
  • So 5% are outside this range; by symmetry, 2.5% score above 1470

Example: Adult Female Heights

Heights of US adult females are approximately normal with μ = 64 inches, σ = 3 inches.

Question: A woman is 57 inches tall. Is this unusual?

Solution:

  • Z-score: \(Z = \frac{57 - 64}{3} = \frac{-7}{3} \approx -2.3\)
  • She is more than 2 standard deviations below the mean
  • From 68-95-99.7 rule: only 5% of women fall outside ±2 SD
  • By symmetry, 2.5% fall below Z = -2, so fewer than 2.5% are as short as Z = -2.3; this is unusual
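
The rule only gives a bound ("less than 2.5%"). For comparison, here is a small sketch that computes the exact area under the N(64, 3) model from this example, assuming the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

heights = NormalDist(mu=64, sigma=3)  # model from the example, in inches

z = (57 - 64) / 3                 # z-score for a height of 57 inches
prop_shorter = heights.cdf(57)    # area to the left of 57 inches

print(f"Z = {z:.2f}")
print(f"proportion shorter than 57 in: {prop_shorter:.4f}")
```

The exact proportion is roughly 1%, consistent with the bound from the rule.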

Technical Conditions: Hypothesis Testing

To use the normal model for hypothesis testing, check:

  1. Independence: Observations are independent
    • e.g., from simple random sample
  2. Success-Failure Condition: Using \(p_0\) (the null value)
    • \(n \times p_0 \geq 10\) (expected successes)
    • \(n \times (1 - p_0) \geq 10\) (expected failures)

Example: Payday Loan Study

Scenario: Researchers surveyed a random sample of 830 payday loan borrowers. 453 (54.6%) said they support additional regulation.

Research question: Is there evidence that the majority support regulation?

Hypotheses:

  • \(H_0: p = 0.5\) (no majority)
  • \(H_A: p > 0.5\) (majority supports)

Checking Conditions

Independence: Random sample of borrowers

Success-Failure Condition:

  • Expected successes: \(n \times p_0 = 830 \times 0.5 = 415\)
  • Expected failures: \(n \times (1-p_0) = 830 \times 0.5 = 415\)

Both are ≥ 10, so conditions are met. We can use the normal model.

Computing Z and P-value

Standard Error: \(SE = \sqrt{\frac{0.5 \times 0.5}{830}} = 0.0174\)

Z-score: \(Z = \frac{0.546 - 0.5}{0.0174} = 2.64\)

P-value: Area to the right of Z = 2.64 on standard normal

\[\text{P-value} = 0.004\]

Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate the p-value.
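
For readers without Jamovi at hand, the same calculation can be sketched in a few lines of Python. This is only an illustration using the standard-library `statistics.NormalDist`; the variable names are arbitrary, and the proportion is kept unrounded, so the results agree with the slide values after rounding.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 830, 453, 0.5   # payday loan study, H0: p = 0.5

p_hat = successes / n
se = sqrt(p0 * (1 - p0) / n)       # SE uses the null value p0 for a test
z = (p_hat - p0) / se

# H_A: p > 0.5, so the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(f"p-hat = {p_hat:.3f}, SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.4f}")
```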

Visualizing the P-value

Conclusion: P-value = 0.004 < 0.05, so reject \(H_0\).

Conclusion

  • We reject the null hypothesis: there is convincing evidence that a majority of all MI payday borrowers support the regulation.

Can we actually generalize?

  • The researchers stated that it was a random sample of payday borrowers in MI.

We can generalize these findings to all payday borrowers in MI.

When Conditions Are NOT Met

Scenario: A medical consultant had 3 complications in 62 surgeries.

Test: \(H_0: p = 0.1\) vs \(H_A: p < 0.1\)

Check success-failure condition:

  • Expected successes: \(62 \times 0.1 = 6.2\)

6.2 < 10, so the condition is NOT met.

Solution: Use simulation (randomization) instead of the normal model.
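
A minimal sketch of that simulation approach, under the assumption that each of the 62 surgeries is an independent trial with complication probability 0.1 when \(H_0\) is true. The helper name `success_failure_ok`, the seed, and the number of repetitions are illustrative choices.

```python
import random

random.seed(62)  # fixed seed so the run is reproducible

n, p0, observed = 62, 0.1, 3   # consultant example, H0: p = 0.1
reps = 10_000                  # number of simulated sets of 62 surgeries

def success_failure_ok(n, p, threshold=10):
    """True when both expected counts reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

normal_ok = success_failure_ok(n, p0)   # False here: 62 * 0.1 = 6.2 < 10

# Simulate the null distribution of the number of complications.
sim_counts = [sum(random.random() < p0 for _ in range(n)) for _ in range(reps)]

# H_A: p < 0.1, so count simulations with 3 or fewer complications.
p_value = sum(count <= observed for count in sim_counts) / reps

print(f"normal model appropriate? {normal_ok}")
print(f"simulated p-value: {p_value:.3f}")
```

The simulated p-value should come out near 0.12, so the data do not give convincing evidence that the consultant's complication rate is below 10%.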

Technical Conditions: Confidence Intervals

To use the normal model for a confidence interval, check:

  1. Independence: Observations are independent

  2. Success-Failure Condition: Using \(\hat{p}\) (observed proportion)

    • \(n \times \hat{p} \geq 10\) (observed successes)
    • \(n \times (1 - \hat{p}) \geq 10\) (observed failures)
    • Equivalently: at least 10 successes AND 10 failures observed

Key difference from hypothesis testing: Use \(\hat{p}\), not \(p_0\)

CI Formula Using Normal Model

When conditions are met:

\[\text{CI} = \hat{p} \pm z^* \times SE\]

Where:

  • \(\hat{p}\) = observed proportion
  • \(z^*\) = critical value (multiplier) for confidence level
  • \(SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Multipliers for Different Confidence Levels

| Confidence Level | \(z^*\) |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |

Note: You can use the Model-Based Inference calculator in Jamovi’s Randomize module to calculate these multipliers/critical values.
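
As an alternative to the calculator, the multipliers can be recovered from the standard normal quantile function. The sketch below assumes the standard-library `statistics.NormalDist`; the function name `z_star` is an illustrative choice.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value leaving (1 - level) / 2 in each tail of N(0, 1)."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

multipliers = {level: round(z_star(level), 3) for level in (0.90, 0.95, 0.99)}
print(multipliers)
```

Rounded to three decimals, these match the 90%, 95%, and 99% multipliers listed above.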

Computing a 95% CI

Payday study: \(\hat{p} = 0.546\), \(n = 830\)

Check conditions:

  • Observed successes: 453 ≥ 10
  • Observed failures: 377 ≥ 10

Standard Error: \(SE = \sqrt{\frac{0.546 \times 0.454}{830}} = 0.0173\)

95% CI: \(0.546 \pm 1.96 \times 0.0173 = (0.512, 0.58)\)
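
The interval can be reproduced with a short sketch. It assumes the standard-library `statistics.NormalDist` for the multiplier and keeps \(\hat{p}\) unrounded, so the endpoints agree with the slide values to three decimals.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 453, 830          # payday loan study
confidence_level = 0.95

p_hat = successes / n
se = sqrt(p_hat * (1 - p_hat) / n)   # SE uses p-hat for a confidence interval
z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

margin = z_star * se
lower, upper = p_hat - margin, p_hat + margin

print(f"SE = {se:.4f}, margin of error = {margin:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```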

Interpreting the CI

95% Confidence Interval: (0.512, 0.58)

Interpretation: We are 95% confident that the true proportion of payday borrowers who support the regulation is between 51.2% and 58%.

Note: The interval does not include 0.5, which is consistent with our hypothesis test result (we rejected \(H_0: p = 0.5\)).

Full Example: Teen Driving Study

Scenario: Researchers surveyed a sample of 500 teens in a small town about texting while driving. 180 admitted to texting while driving at least once.

Null value: About 39% of teens nationwide have reported texting while driving.

Test: Is there evidence that the proportion of all teens in this town who text while driving is less than the national rate?

  • \(H_0: p = 0.39\)
  • \(H_A: p < 0.39\)

Teen Driving: Analysis

Data: \(\hat{p} = 180/500 = 0.36\)

Check conditions:

  • Independence: Responses are independent
  • Success-Failure: \(500 \times 0.39 = 195 \geq 10\) and \(500 \times 0.61 = 305 \geq 10\)

Compute:

  • \(SE = \sqrt{\frac{0.39 \times 0.61}{500}} = 0.0218\)
  • \(Z = \frac{0.36 - 0.39}{0.0218} = -1.38\)
  • P-value = 0.08451 (area to left of Z)
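
A sketch of the same lower-tailed calculation, assuming the standard-library `statistics.NormalDist`. It also shows one detail worth knowing: the reported p-value corresponds to the unrounded Z, and looking up the rounded value Z = -1.38 gives a slightly different area.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 500, 180, 0.39   # teen driving study, H0: p = 0.39

p_hat = successes / n
se = sqrt(p0 * (1 - p0) / n)        # SE uses the null value p0
z = (p_hat - p0) / se

std_normal = NormalDist()

# H_A: p < 0.39, so the p-value is the area to the LEFT of z.
p_value = std_normal.cdf(z)
p_value_rounded_z = std_normal.cdf(round(z, 2))   # using Z = -1.38

print(f"Z = {z:.4f}")
print(f"p-value (unrounded Z): {p_value:.4f}")
print(f"p-value (Z = -1.38):   {p_value_rounded_z:.4f}")
```

Either way the p-value is above 0.05, so the conclusion does not depend on the rounding.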

Visualizing the P-value

Conclusion: P-value > 0.05, so we fail to reject \(H_0\) at \(\alpha = 0.05\).

Teen Driving: Conclusions

Statistical conclusion: P-value > 0.05, so we fail to reject \(H_0\). There is no convincing evidence that fewer than 39% of teens in this town text while driving.

But wait… can we generalize?

  • The researchers do not indicate that it was a random sample.

  • We therefore cannot treat it as a random sample of all teens in this town.

We cannot generalize these findings to all teens in this town.

Computing a 95% CI

Teen study: \(\hat{p} = 0.36\), \(n = 500\)

Check conditions:

  • Observed successes: 180 ≥ 10
  • Observed failures: 320 ≥ 10

Standard Error: \(SE = \sqrt{\frac{0.36 \times 0.64}{500}} = 0.0215\)

95% CI: \(0.36 \pm 1.96 \times 0.0215 = (0.318, 0.402)\)

Interpreting the CI

95% Confidence Interval: (0.318, 0.402)

Interpretation: We are 95% confident that the true proportion of teens who are texting while driving is between 31.8% and 40.2%.

Note: The interval includes 0.39, which is consistent with our hypothesis test result (we failed to reject \(H_0: p = 0.39\)).

Model-Based vs Simulation-Based Methods

| Aspect | Simulation | Model-Based |
|---|---|---|
| P-value | Count extreme simulations | Area under normal curve |
| CI | Bootstrap percentiles | \(\hat{p} \pm z^* \times SE\) |
| When to use | Can always be used | Success-failure condition has to be met |

Key point: Both methods give similar results when conditions are met.
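
To illustrate this key point, the sketch below computes the payday loan p-value both ways: once by simulating the null distribution and once from the normal model. The seed and number of repetitions are arbitrary choices, and the simulated value will vary a little from run to run if the seed is changed.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(830)  # fixed seed so the run is reproducible

n, successes, p0 = 830, 453, 0.5   # payday loan study, H_A: p > 0.5
reps = 5000                        # number of simulated samples under H0

# Model-based p-value: area to the right of Z under N(0, 1).
se = sqrt(p0 * (1 - p0) / n)
z = (successes / n - p0) / se
model_p = 1 - NormalDist().cdf(z)

# Simulation-based p-value: share of simulated samples with at least
# as many successes as were observed.
sim_counts = [sum(random.random() < p0 for _ in range(n)) for _ in range(reps)]
sim_p = sum(count >= successes for count in sim_counts) / reps

print(f"model-based p-value:      {model_p:.4f}")
print(f"simulation-based p-value: {sim_p:.4f}")
```

Both p-values should be small and close to each other, since the success-failure condition is comfortably met here.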

Visualizing When Normal Approximation Works

Key insight: Normal approximation works well when conditions are met (left), but fails when conditions are NOT met (right). This is why checking technical conditions matters!
