Inference for One Mean

Topic 11

Math 115

A New Type of Variable

So far we’ve focused on inference for proportions (categorical variables).

Now we turn to inference for means (quantitative variables).

Same framework:

  • CI: Point estimate ± margin of error
  • Hypothesis testing with p-values
  • Technical conditions to check

We begin with inference for a single mean (studies involving a single numerical variable).

Statistics and Parameters for Means

|                    | Statistic (sample) | Parameter (population) |
|--------------------|--------------------|------------------------|
| Mean               | \(\bar{x}\)        | \(\mu\)                |
| Standard deviation | \(s\)              | \(\sigma\)             |

Statistic of interest: \(\bar{x}\) (sample mean)

Goal: Make inferences about \(\mu\) (population mean)

  • Estimate \(\mu\) with a confidence interval
  • Test hypotheses about \(\mu\)

Model-Based Focus

For the remaining topics, we focus on model-based inference.

  • Randomization methods (permutation, bootstrap) give similar results when conditions are met
  • Most software reports model-based results by default

When conditions are NOT met → consider randomization methods instead.

  • We will return to these briefly in J Lab 2

Cherry Blossom Run

The Cherry Blossom Run is an annual 10-mile race in Washington, D.C.

  • Mean finish time for all runners was 93.29 minutes in 2006
  • Research question: Are runners getting faster or slower?
  • Data: Random sample of 100 runners from 2017

EDA: Cherry Blossom Data

| n   | \(\bar{x}\) | s     | min   | max    |
|-----|-------------|-------|-------|--------|
| 100 | 99.02       | 17.93 | 53.27 | 139.07 |

Variability of Data vs. Variability of Mean

Two different types of variability:

Sample standard deviation (s):

  • Measures how individual finish times vary around the sample mean
  • s = 17.93 minutes

Standard error (SE):

  • Measures how the sample mean would vary from sample to sample
  • Much smaller than s!

Demo - Sampling From a Population


The demo shows:

  1. Theoretical population distribution
  2. Sampling from the population
  3. Calculating sample mean for each sample
  4. Building the distribution of sample means, one sample at a time
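The four steps above can be sketched in a few lines of Python. This is only an illustration, not the course demo itself: the population (normal with mean 95 and SD 18 minutes), the number of repetitions, and the variable names are all assumptions chosen for the example.

```python
import random
import statistics

random.seed(115)  # fixed seed so the run is reproducible

# Step 1: a hypothetical population (illustrative values, not from the race data)
POP_MEAN, POP_SD = 95.0, 18.0
n = 100             # size of each sample
num_samples = 2000  # how many samples to draw

# Steps 2-4: draw a sample, record its mean, repeat
sample_means = []
for _ in range(num_samples):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

center = statistics.mean(sample_means)   # centre of the sampling distribution
spread = statistics.stdev(sample_means)  # SD of the sample means
theoretical_se = POP_SD / n ** 0.5       # sigma / sqrt(n)

print(f"Mean of sample means: {center:.2f}")
print(f"SD of sample means:   {spread:.2f}")
print(f"sigma / sqrt(n):      {theoretical_se:.2f}")
```

The sample means should cluster around the population mean, and their SD should be much smaller than the population SD.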

SE Formula for Means

The standard error for a sample mean is:

\[SE = \frac{\sigma}{\sqrt{n}}\]

Problem: We don’t know σ (population SD)

Solution: Estimate with sample SD:

\[SE \approx \frac{s}{\sqrt{n}} = \frac{17.93}{\sqrt{100}} = 1.793\]

With \(n = 100\), the standard error of the sample mean is one-tenth of the standard deviation of the data, because \(\sqrt{100} = 10\).
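As a quick check of the arithmetic, here is a minimal Python sketch of the formula \(s/\sqrt{n}\). The helper name `standard_error` is made up for this example, and the sample sizes 25 and 400 are hypothetical values added only to show how the SE changes with \(n\).

```python
import math

def standard_error(s, n):
    """Estimated standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

s = 17.93  # sample SD of finish times (minutes), from the Cherry Blossom sample

# n = 100 is the actual sample size; 25 and 400 are hypothetical comparisons
for n in (25, 100, 400):
    print(f"n = {n:3d}: SE = {standard_error(s, n):.3f} minutes")
```

Quadrupling the sample size halves the standard error.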

Central Limit Theorem for Means

When conditions are met, the sampling distribution of \(\bar{x}\) is approximately normal with:

  • Mean: μ (the population mean)
  • Standard error: \(SE = \frac{\sigma}{\sqrt{n}}\)

Conditions:

  1. Independence: Observations are independent (e.g., from a random sample)
  2. Normality: See next slide

Normality Conditions

Normality:

  • If n < 30: Data should come from a nearly normal population (no strong skew or outliers)
  • If n ≥ 30: CLT applies; some skewness OK, but no extreme outliers

These are rules of thumb, not hard cutoffs.

Why T Instead of Z?

When we estimate σ with s, we add extra uncertainty.

To account for this, we use the t-distribution instead of normal:

  • The t-distribution has thicker tails than the normal (more probability in the extremes)
  • The smaller the sample, the thicker the tails
  • Accounts for uncertainty in our estimate of σ
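One way to see the thicker tails is by simulation. The sketch below is an illustration under assumed settings (a standard normal population, 2000 repetitions, and the function names are invented for the example). It standardizes each sample mean using \(s\) rather than \(\sigma\) and counts how often the result lands beyond ±1.96, which would happen about 5% of the time if the normal distribution applied.

```python
import random
import statistics

random.seed(11)  # fixed seed so the run is reproducible

def t_statistic(sample, mu):
    """(x-bar - mu) / (s / sqrt(n)), using the sample SD in place of sigma."""
    n = len(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return (statistics.mean(sample) - mu) / se

def tail_fraction(n, reps=2000, cutoff=1.96):
    """Fraction of simulated samples of size n with |T| beyond the cutoff."""
    hits = 0
    for _ in range(reps):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > cutoff:
            hits += 1
    return hits / reps

results = {n: tail_fraction(n) for n in (5, 30, 100)}
for n, frac in results.items():
    print(f"n = {n:3d}: fraction with |T| > 1.96 is about {frac:.3f}")
```

For small samples the fraction should be well above 0.05, and it should move toward 0.05 as \(n\) grows.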

T-Distribution and Degrees of Freedom

The t-distribution is characterized by degrees of freedom (df).

For one mean: df = n − 1

As df increases, t-distribution approaches normal distribution.

CI Formula for One Mean

When conditions are met:

\[\text{CI} = \bar{x} \pm t^*_{df} \times \frac{s}{\sqrt{n}}\]

Where:

  • \(\bar{x}\) = sample mean
  • \(t^*_{df}\) = critical value from t-distribution with df = n − 1
  • \(s/\sqrt{n}\) = standard error

Use Jamovi’s Model-Based Inference calculator to find \(t^*\).

Checking Conditions

Independence:

  • Random sample of runners

Normality:

  • n = 100 ≥ 30
  • No extreme outliers in histogram

Conditions are met for using the t-distribution.

Cherry Blossom 95% CI

Given: \(\bar{x} = 99.02\), \(s = 17.93\), \(n = 100\), \(df = 99\)

Critical value: \(t^*_{99} = 1.984\) (from Jamovi)

Standard error: \[SE = \frac{17.93}{\sqrt{100}} = 1.793\]

95% CI: \[99.02 \pm 1.984 \times 1.793 = (95.46, 102.57)\]
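The same calculation as a short Python sketch. It takes the critical value 1.984 from the slide rather than computing it, and it starts from the rounded summary statistics, so the last digit of an endpoint can differ slightly from values that software computes from the raw data.

```python
import math

x_bar = 99.02   # sample mean (minutes)
s = 17.93       # sample SD (minutes)
n = 100         # sample size
t_star = 1.984  # critical value for 95% confidence with df = 99, as given above

se = s / math.sqrt(n)   # standard error
margin = t_star * se    # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"SE = {se:.3f}, margin of error = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```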

Interpreting the CI

95% CI: (95.46, 102.57) minutes

Interpretation: We are 95% confident that the true mean finish time for all 2017 Cherry Blossom 10-mile runners is between 95.46 and 102.57 minutes.

Does CI include 93.29? No

Because 93.29 lies below the entire interval, this suggests the 2017 mean finish time is different from (higher than) the 2006 mean.

Hypotheses for Cherry Blossom

Let μ = mean finish time for all 2017 Cherry Blossom 10-mile runners

Hypotheses:

  • \(H_0: \mu = 93.29\) (no change from 2006)
  • \(H_A: \mu \neq 93.29\) (finish time has changed)

This is a two-sided test.

Test Statistic

The T-statistic measures how far \(\bar{x}\) is from the null value in SE units:

\[T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\]

For Cherry Blossom:

\[T = \frac{99.02 - 93.29}{1.793} = \frac{5.73}{1.793} = 3.19\]
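For readers who want to reproduce the arithmetic, a minimal Python sketch follows. It uses the rounded summary statistics, so it may differ from the value above in the last digit.

```python
import math

x_bar = 99.02  # sample mean (minutes)
mu_0 = 93.29   # null value: the 2006 mean (minutes)
s = 17.93      # sample SD (minutes)
n = 100        # sample size

se = s / math.sqrt(n)         # standard error
t_stat = (x_bar - mu_0) / se  # distance from the null value in SE units

print(f"T = {t_stat:.3f}")
```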

Calculating the P-value

Use Jamovi’s Model-Based Inference calculator with t-distribution (df = 99).

P-value = 0.0019
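For anyone curious about what the calculator is doing, the two-sided p-value is the area in both tails of the t-distribution beyond ±T. The sketch below approximates that area with the Python standard library by numerically integrating the t density (Simpson's rule). It is an illustration, not Jamovi's method, and the function names are invented for the example.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|): 1 minus twice the area under the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    return 1 - 2 * area_0_to_b

p_value = two_sided_p_value(3.19, df=99)  # T and df from the slides
print(f"Two-sided p-value: {p_value:.4f}")
```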

Conclusion

Results:

  • T = 3.19
  • P-value = 0.0019
  • Using α = 0.05: P-value < 0.05

Decision: Reject \(H_0\)

Conclusion: The data provide convincing evidence that the mean finish time in 2017 is different from 93.29 minutes (the 2006 mean).

Note: This is consistent with our CI—93.29 is outside the 95% CI.

The Foundation: Sampling Variability

All inference is built on one idea: statistics vary from sample to sample.

The standard error quantifies this variability—it tells us how much we expect the statistic to vary across different samples.

Whether we use randomization or mathematical models, the goal is the same: describe the sampling variability so we can make inferences.

Why SE Determines CI Width

CI = statistic ± multiplier × SE

  • If SE is large: \(\bar{x}\) varies a lot from sample to sample
    • Any single \(\bar{x}\) could be far from the true \(\mu\)
    • We need a wide interval to be confident we’ve captured \(\mu\)
  • If SE is small: \(\bar{x}\) stays close to \(\mu\) across samples
    • Our observed \(\bar{x}\) is probably near the true \(\mu\)
    • A narrow interval is sufficient

Smaller SE → more precision → narrower CI

Why SE Determines Strength of Evidence

Test statistic = (observed − null) / SE

Under \(H_0\), we expect statistics to land near the null value—but not exactly, due to sampling variability.

  • SE tells us how much variability to expect
  • Test statistic asks: “How many SEs is our result from the null?”

As a rough sense of scale:

  • If observed is within 1-2 SEs of null → typical variability → not surprising → weak evidence
  • If observed is 3+ SEs away from null → far more than expected by chance → surprising → strong evidence

Same Logic: Proportions and Means

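|                | Proportion                                         | Mean                                               |
|----------------|----------------------------------------------------|----------------------------------------------------|
| Statistic      | \(\hat{p}\)                                        | \(\bar{x}\)                                        |
| SE formula     | \(\sqrt{\frac{p(1-p)}{n}}\)                        | \(\frac{s}{\sqrt{n}}\)                             |
| CI             | statistic ± multiplier × SE                        | statistic ± multiplier × SE                        |
| Test statistic | \(Z=\frac{\text{observed} - \text{null}}{SE}\)     | \(T=\frac{\text{observed} - \text{null}}{SE}\)     |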

The formulas differ, but the logic is identical:

Quantify sampling variability (SE) → Use it to measure uncertainty (CI) or surprise (p-value)
