Inference: Single Mean

IMS2 Ch. 19
Math 115

Yurk

Cherry Blossom 10 Mile

The Cherry Blossom Run is an annual 10 mile run in Washington, D.C.
The mean finishing time for all 10-mile runners was 93.29 minutes in 2006.
Are runners getting faster or slower or staying the same?

Data:

run17 dataset is available here
Random sample of 100 runners from 2017 race
time is finishing time in minutes

Inference

Let \(\mu\) be the mean finishing time for all runners in 2017
We will estimate the mean using a confidence interval
We will also test the following hypotheses:
- \(H_0: \mu = 93.29\)
- \(H_A: \mu \neq 93.29\)

EDA:

Histogram showing distribution of 10-mile finish times.

n	mean	sd	min	max
100	99.02	17.93	53.27	139.07

SE vs Sample SD

The sample standard deviation measures how the finish times vary from person to person (within the sample)
The standard error measures how the mean finish time varies from sample to sample (from the same population)
To make inferences, we need to understand how the statistic (mean finish time) varies from sample to sample
Since we cannot measure the SE directly, we will estimate it using a formula or bootstrapping

Bootstrapping

We can use bootstrapping to approximate the variability in the means
Resample (with replacement) from the sample data to create many new samples
Calculate the mean for each new sample
Variability in the bootstrapped means approximates the variability of the sampling distribution
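The resampling steps above can be sketched in a few lines of Python. This is only an illustration using the standard library: the `times` list is simulated placeholder data (not the run17 sample), and the function name `bootstrap_means` is hypothetical.

```python
import random
import statistics

def bootstrap_means(sample, n_boot=1000, seed=0):
    """Return n_boot means, each computed from a resample of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size
        resample = rng.choices(sample, k=n)
        means.append(statistics.mean(resample))
    return means

# Placeholder data only (NOT the run17 sample): 100 simulated finish times
placeholder_rng = random.Random(1)
times = [placeholder_rng.gauss(99, 18) for _ in range(100)]

boot_means = bootstrap_means(times)
# The SD of the bootstrapped means approximates the standard error of the mean
boot_se = statistics.stdev(boot_means)
print(f"bootstrap SE = {boot_se:.2f} minutes")
```

With the real sample in place of `times`, `boot_se` would play the role of the bootstrap standard error.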

Histogram showing 1,000 bootstrapped means.

Variability in the data vs variability of the mean

Histogram showing distribution of 10-mile finish times.

The sample standard deviation is \(s = 17.93\) minutes

Histogram showing 1,000 bootstrapped means on the same horizontal axis.

The standard error of the mean (the standard deviation of the bootstrapped means) is \(SE = 1.78\) minutes

Bootstrap Percentile Confidence Interval for the Mean

We can use the bootstrap distribution to calculate a bootstrap percentile confidence interval for the mean
The 95% bootstrap percentile confidence interval is between the 2.5th and 97.5th percentiles of the bootstrapped means
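A minimal sketch of the percentile method, assuming standard-library Python. The helper names `percentile` and `bootstrap_percentile_ci` are hypothetical, and `times` is simulated placeholder data rather than the run17 sample, so the printed interval will not match the one reported for the race.

```python
import random
import statistics

def percentile(sorted_vals, p):
    """p-th percentile (0-100) of pre-sorted values, using linear interpolation."""
    pos = (len(sorted_vals) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def bootstrap_percentile_ci(sample, level=95, n_boot=1000, seed=0):
    """Bootstrap percentile confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    tail = (100 - level) / 2  # 2.5 for a 95% interval
    return percentile(means, tail), percentile(means, 100 - tail)

# Placeholder data only (NOT the run17 sample): 100 simulated finish times
placeholder_rng = random.Random(1)
times = [placeholder_rng.gauss(99, 18) for _ in range(100)]

lower, upper = bootstrap_percentile_ci(times)
print(f"95% bootstrap percentile CI: ({lower:.2f}, {upper:.2f})")
```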

1,000 bootstrapped means with dashed lines at 2.5% and 97.5% percentiles.

A 95% bootstrap percentile CI for mean finish time: (95.53, 102.55)

Mathematical Model for Distribution of Means

Central Limit Theorem for Sample Mean

When the following conditions are met, the sampling distribution of \(\bar{x}\) for samples of size \(n\) from a population with mean \(\mu\) and standard deviation \(\sigma\) will be approximately normal with mean \(\mu\) and standard error \[SE=\frac{\sigma}{\sqrt{n}}\]

Independent observations
Normality: when sample is small, sample observations must come from a normally distributed population. When sample is large, this condition can be relaxed.

We can use this rule of thumb for the normality check:

If \(n<30\) and there are no clear outliers, then we usually assume the data come from a nearly normal distribution to satisfy the condition.
If \(n\geq30\) and there are no particularly extreme outliers, then we usually assume the sampling distribution of \(\bar{x}\) is nearly normal, even if the underlying population distribution is not.
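One way to see the Central Limit Theorem at work is a small simulation. The sketch below is an illustration under stated assumptions, not part of the course materials: it draws samples of size \(n = 30\) from a right-skewed exponential population (mean 1, standard deviation 1, chosen arbitrarily) and compares the spread of the sample means with \(\sigma/\sqrt{n}\).

```python
import random
import statistics

rng = random.Random(0)
n = 30       # sample size
reps = 5000  # number of simulated samples

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1 (right-skewed)
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

observed_se = statistics.stdev(sample_means)  # spread of the simulated sample means
theoretical_se = 1.0 / n ** 0.5               # sigma / sqrt(n)

print(f"mean of sample means = {statistics.mean(sample_means):.3f}")
print(f"observed SE = {observed_se:.3f}, sigma/sqrt(n) = {theoretical_se:.3f}")
```

Even though individual observations are skewed, the sample means cluster around \(\mu\) with a spread close to \(\sigma/\sqrt{n}\).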

Histogram showing distribution of 10-mile finish times.

The Cherry Blossom run data satisfy the normality check, since there are 100 (\(\geq 30\)) observations, and no particularly extreme outliers
The observations are independent, because they come from a simple random sample of finishers

T-distribution

In order to estimate the SE using the formula, we need to estimate the population standard deviation \(\sigma\)
The best estimate is the sample standard deviation \(s\) \[SE = \frac{\sigma}{\sqrt{n}}\approx\frac{s}{\sqrt{n}}\]
The test statistic for assessing a single mean is \(T\) \[T=\frac{\bar{x}-null\,value}{s/\sqrt{n}}\]
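As a quick check with the Cherry Blossom summary statistics (\(s = 17.93\), \(n = 100\)), the formula gives \[SE \approx \frac{s}{\sqrt{n}} = \frac{17.93}{\sqrt{100}} = 1.793\] minutes, which is close to the bootstrap estimate of 1.78 minutes.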

T-Distribution

Mathematical Model for \(T\)

The \(T\) statistic (\(T\) score) will have a \(t\)-distribution with \(df=n-1\) degrees of freedom if the following conditions are met:

Independent observations
Normality: a large sample with no extreme outliers, or a small sample with no clear outliers (use the same rule of thumb)

The tails of the \(t\)-distribution are thicker than those of the normal distribution due to uncertainty in the SE estimate
This is especially true for smaller samples

Comparison of normal distribution and \(t\)-distributions with different degrees of freedom (IMS2 Figure 19.8).

One Sample T-Interval

If the conditions are met, we can use the \(t\)-distribution to calculate a confidence interval, called a one sample t-interval
The interval is \[\bar{x}\pm t^{\ast}_{df}\times \frac{s}{\sqrt{n}}\]
The value of \(t^{\ast}_{df}\) depends on the confidence level and degrees of freedom

For the Cherry Blossom run finish times, \(df=100-1=99\)
For a 95% CI, \(t^{\ast}_{99}\) is the value such that 95% of the area under the \(t\)-distribution with 99 degrees of freedom is between \(-t^{\ast}_{99}\) and \(t^{\ast}_{99}\)
Using the Randomize module in Jamovi we find that \(t^{\ast}_{99}=1.98\)
Thus the 95% confidence interval is \[99.02\pm 1.98\times \frac{17.93}{\sqrt{100}}\]
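For readers without Jamovi at hand, the critical value and interval can be approximated in plain Python. This is a sketch under stated assumptions: the helper names (`t_pdf`, `central_area`, `t_star`) are hypothetical, the area is computed numerically with Simpson's rule rather than by any Jamovi routine, and the inputs are the rounded summary statistics from the EDA table.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(t, df, steps=2000):
    """Area under the t density between -t and t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # double the area on [0, t] by symmetry

def t_star(level, df):
    """Bisection search for the t* whose central area equals `level`."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Rounded summary statistics from the EDA table
n, xbar, s = 100, 99.02, 17.93
se = s / math.sqrt(n)
crit = t_star(0.95, n - 1)
lower, upper = xbar - crit * se, xbar + crit * se
print(f"t* = {crit:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the inputs are rounded, the endpoints may differ from an interval computed on the raw data in the last decimal place.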

Comparison of Confidence Intervals

Type	95% CI
One sample \(t\)-interval	(95.46, 102.56)
Bootstrap percentile	(95.53, 102.55)

One Sample T-Test

If the conditions are met, we can use the \(t\)-distribution to conduct a hypothesis test
Recall that we want to determine whether the average finish time is different from what it was in 2006 (93.29 min)
- \(H_0: \mu = 93.29\)
- \(H_A: \mu \neq 93.29\)

The \(T\)-statistic is \[\begin{array}{lcr}T &=& \frac{\bar{x}-null\,value}{s/\sqrt{n}}\\ &=& \frac{99.02-93.29}{17.93/\sqrt{100}}\\ &=& \frac{99.02-93.29}{1.793}\\ &=& 3.20 \end{array}\]

To calculate the p-value, find the areas under the \(t\)-distribution below and above the observed value of \(T\) (\(\leq3.2\) and \(\geq3.2\)). The two-sided p-value is twice the smaller area.
Since the \(t\)-distribution is symmetric, the p-value can also be calculated as total area that is \(\leq-3.2\) or \(\geq3.2\)
We can use the Randomize module in Jamovi to calculate this area

The \(t\)-distribution with 99 degrees of freedom. The observed \(T\)-statistic is 3.2. The p-value is the total area to the left of -3.2 or to the right of 3.2 (small red areas).

The resulting p-value is 0.00185
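The tail-area calculation can also be approximated without Jamovi. The following is a rough sketch, assuming standard-library Python and numerical integration with Simpson's rule; the helper names `t_pdf` and `two_sided_p` are hypothetical, and the inputs are the rounded summary statistics, so the result should agree with the reported p-value only approximately.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_obs, df, steps=2000):
    """Total area below -|t_obs| and above |t_obs| (Simpson's rule, even steps)."""
    t = abs(t_obs)
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # area between -t and t, by symmetry
    return 1 - central

# Rounded summary statistics and the null value (2006 mean)
n, xbar, s, null_value = 100, 99.02, 17.93, 93.29
t_stat = (xbar - null_value) / (s / math.sqrt(n))
p_value = two_sided_p(t_stat, n - 1)
print(f"T = {t_stat:.2f}, two-sided p-value = {p_value:.5f}")
```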

Conclusion

We are able to reject the null hypothesis at the \(\alpha = 0.05\) significance level. We conclude that the mean finishing time in 2017 is different from 93.29 minutes (the mean time in 2006).

Note that this result is consistent with the 95% confidence intervals (for example, the one sample t-interval was (95.46, 102.56))
93.29 minutes is not considered a plausible value for the mean based on the CI
We almost always get consistent results between a two-sided hypothesis test with significance level \(\alpha\) and a confidence interval with confidence level \((1-\alpha)\times 100 \%\)
E.g., if we reject the null hypothesis with \(\alpha=0.01\), the null value will not be included as a plausible value in the 99% CI.