60% of US voters support Candidate X for president (parameter: p = 0.6)
We repeatedly drew random samples of 30 voters from the theoretical population (1,000 times) and calculated the proportion of supporters in each sample (statistic: \(\hat{p}\))
What does the sampling distribution look like?
Sampling distribution. Proportions for 1,000 samples of 30 from a population with 60% (dashed vertical line) support for Candidate X.
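The simulation described above can be sketched in base R. This is a minimal sketch, not necessarily how the figure was produced: it assumes each sample of 30 can be modeled as a binomial draw with p = 0.6, and the seed and the variable names (`p_hat`, `reps`) are illustrative choices.

```r
# Assumed setup: population proportion, sample size, number of repeated samples
set.seed(123)   # arbitrary seed, used only for reproducibility
p <- 0.6
n <- 30
reps <- 1000

# Each binomial draw counts the supporters in one sample of n voters;
# dividing by n turns the count into the sample proportion p-hat
p_hat <- rbinom(reps, size = n, prob = p) / n

mean(p_hat)     # should land close to p
sd(p_hat)       # should land close to sqrt(p * (1 - p) / n)

# Histogram of the simulated proportions with a dashed line at p
hist(p_hat, main = "Sampling distribution of p-hat", xlab = "Sample proportion")
abline(v = p, lty = 2)
```

The spread of `p_hat` is the standard error of the sample proportion, which is what the bootstrap later tries to approximate from a single sample.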
Results of the poll
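# Resample the original poll with replacement 1,000 times and
# compute the proportion of "yes" votes in each resample
one_poll_boot <- one_poll |>
  specify(response = vote, success = "yes") |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "prop")
glimpse(one_poll_boot)
Rows: 1,000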
Columns: 2
$ replicate <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
$ stat <dbl> 0.5666667, 0.6333333, 0.7000000, 0.6666667, 0.6333333, 0.800…
Distribution of bootstrapped proportions.
A 99% CI lies between the 0.5th and 99.5th percentiles of the bootstrap distribution
The 99% CI is wider than the 95% CI
It needs to be wider for us to be more confident that it contains the value of the parameter
Both intervals are centered near 0.7, the sample proportion of the original sample
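The percentile intervals can be computed directly from the bootstrap proportions. The sketch below uses base R only and a hypothetical stand-in for `one_poll`: a vector `vote` with 21 "yes" responses out of 30, chosen to match the sample proportion of 0.7 stated above. The seed and the names `vote`, `boot_props`, `ci_95`, and `ci_99` are assumptions, not part of the original analysis.

```r
set.seed(456)   # arbitrary seed, used only for reproducibility

# Hypothetical stand-in for the original poll: 21 "yes" and 9 "no" (proportion 0.7)
vote <- rep(c("yes", "no"), times = c(21, 9))

# Bootstrap: resample the 30 responses with replacement 1,000 times
# and record the proportion of "yes" in each resample
boot_props <- replicate(1000, mean(sample(vote, replace = TRUE) == "yes"))

# Percentile intervals: the middle 95% and the middle 99% of the bootstrap distribution
ci_95 <- quantile(boot_props, probs = c(0.025, 0.975))
ci_99 <- quantile(boot_props, probs = c(0.005, 0.995))

ci_95
ci_99
```

With the infer pipeline shown earlier, the same kind of interval should be obtainable by piping `one_poll_boot` into `get_confidence_interval(level = 0.99, type = "percentile")`. Because the 99% interval reaches further into both tails, it can never be narrower than the 95% interval computed from the same bootstrap distribution.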
Illustration of sampling distribution and bootstrap distribution. From IMS1 Tutorial 4.4.