Math 115 Final Exam Practice Questions

Final Exam: Topics 1-18

Topics Covered:

Topic 1: Hello Data - Cases, variables, types of variables (categorical vs. numerical; nominal vs. ordinal; discrete vs. continuous)
Topic 2: Exploring Data - Frequency tables, bar plots, histograms, shape descriptions, measures of center (mean, median), measures of spread (SD, IQR, range), effect of outliers, boxplots
Topic 3: Introduction to Hypothesis Testing (One Proportion) - Population parameter ($p$) vs. sample statistic ($\hat{p}$), null and alternative hypotheses, simulation/randomization, null distribution, p-value, significance level ($\alpha$), decision rules
Topic 4: Introduction to Confidence Intervals (One Proportion) - Bootstrap resampling, bootstrap distribution, percentile method, CI interpretation, confidence level
Topic 5: Scope of Inference: Generalization - Random sampling vs. convenience sampling, generalization to population
Topic 6: Inference with Mathematical Models (One Proportion) - Normal distribution, Z-scores, 68-95-99.7 rule, standard error, success-failure condition, model-based inference
Topic 7: Two-Sided Tests and Decision Errors (One Proportion) - Two-sided hypotheses ($H_A: p \neq p_0$), two-sided p-values, Type I and Type II errors, significance level, power, CI and hypothesis test connection
Topic 8: Associations and Causation - Association vs. independence, contingency tables, conditional proportions, explanatory and response variables, observational studies vs. experiments, confounding variables, random assignment vs. random sampling
Topic 9: Inference for Two Proportions (Randomization) - Permutation tests, difference in proportions ($\hat{p}_1 - \hat{p}_2$), null distribution via shuffling, bootstrap CI for difference in proportions
Topic 10: Inference for Two Proportions (Normal Distribution) - Pooled proportion, SE formulas (pooled vs. separate), success-failure conditions, Z-test, confidence intervals for difference in proportions
Topic 11: Inference for One Mean - t-distribution, degrees of freedom (df = n - 1), SE = s/√n, T-statistic, t-test, confidence intervals for means
Topic 12: Inference for Two Independent Means - Pooled standard deviation, SE for difference in means, df = n₁ + n₂ - 2, two-sample t-test, confidence intervals
Topic 13: Inference for Paired Means - Identifying paired data, computing differences, analyzing differences as one sample, df = n_pairs - 1, paired t-test
Topic 14: Inference for Contingency Tables - Chi-square test, expected counts formula, chi-square statistic $X^2$, df = (r-1)(c-1), chi-square distribution
Topic 15: Inference for Multiple Means (ANOVA) - F-test, MSG (Mean Square Groups), MSE (Mean Square Error), F-statistic, F-distribution, ANOVA table interpretation
Topic 16: Multiple Comparisons - Familywise error rate (FWE), Bonferroni correction, pairwise comparisons, adjusted p-values
Topic 17: Simple Linear Regression - Scatterplots, slope and intercept interpretation, least squares line, residuals, correlation (r), coefficient of determination (R²), extrapolation, influential points
Topic 18: Inference for Regression - T-test for slope, hypotheses ($H_0: \beta_1 = 0$), standard error of slope, confidence interval for slope, df = n - 2

Tests of Significance

Here is a summary of the data types needed to run each test. Some problems on the final exam will ask you to identify an appropriate test of significance for a given research question.

Single proportion: One categorical variable with two categories (binary).
Single mean: One quantitative variable.
Matched pairs test: Dependent samples, subtract to obtain one quantitative variable for the difference.
Comparing means: Response is quantitative, explanatory is categorical with two independent categories.
Comparing proportions: Both response and explanatory are categorical with two categories.
Comparing Multiple Proportions/Chi-square test for independence (association): Response and explanatory are categorical. Either one can have more than two categories.
Comparing Multiple Means/ANOVA: Response is quantitative and explanatory is categorical with more than two categories. If the results of the test are significant, then there is a follow-up (post-hoc) analysis.
Correlation and Regression: Response and explanatory are quantitative.

Practice Problems

1. Variable types

A veterinary clinic collects data on pets brought in for annual checkups. For each variable, classify it as categorical or numerical. If categorical, specify nominal or ordinal. If numerical, specify discrete or continuous. Circle one for each.

(a) Species (dog, cat, bird, reptile)

Categorical, Nominal
Categorical, Ordinal
Numerical, Discrete
Numerical, Continuous

(b) Weight measured in pounds

Categorical, Nominal
Categorical, Ordinal
Numerical, Discrete
Numerical, Continuous

(c) Number of previous visits to the clinic

Categorical, Nominal
Categorical, Ordinal
Numerical, Discrete
Numerical, Continuous

(d) Pain level assessed by the vet (none, mild, moderate, severe)

Categorical, Nominal
Categorical, Ordinal
Numerical, Discrete
Numerical, Continuous

2. Meeting hours

The following data shows the number of hours per week that 16 employees at a small company spend in meetings:

2, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 12, 14, 28

(a) Calculate the mean number of hours. (Round to one decimal place.)

Answer: _________

(b) Calculate the median number of hours.

Answer: _________

(c) Which measure of center is larger? What does this suggest about the shape of the distribution?

(d) If the value 28 were removed, which measure of center would change more: the mean or the median? Explain briefly.

3. Boxplot

The boxplot below shows the distribution of daily tips (in dollars) earned by 40 servers at a restaurant during one week.

(a) Estimate the median daily tips from the boxplot.

Answer: _________

(b) Estimate the IQR (interquartile range).

Answer: _________

(c) Are there any outliers visible in this boxplot? If so, are they on the low end, the high end, or both?

4. Study design

For each study described below, determine: (1) whether it is an experiment or an observational study, (2) whether results can be generalized to a broader population, and (3) whether a causal conclusion can be drawn.

(a) A tech company randomly selects 200 employees from its full roster and randomly assigns half to use standing desks and half to use traditional desks for three months, then measures productivity scores.

Experiment or Observational study? Generalize? Yes / No Causal? Yes / No

(b) A university surveys all incoming freshmen about their social media habits and records their first-semester GPA from university records.

Experiment or Observational study? Generalize? Yes / No Causal? Yes / No

(c) A fitness app recruits volunteer users and randomly assigns some to receive daily motivational notifications and others to receive no notifications, then compares weekly exercise minutes.

Experiment or Observational study? Generalize? Yes / No Causal? Yes / No

5. Confounding Variables

A study finds that neighborhoods with more ice cream trucks have higher rates of heat-related hospital visits. A researcher concludes that ice cream trucks cause heat-related illness.

(a) Identify a likely confounding variable in this study.

(b) Explain how this confounding variable provides an alternative explanation for the observed association.

6. One-proportion simulation test

A city claims that 60% of residents support a new public transit expansion. A local journalist suspects the true proportion is lower than 60%. She surveys a random sample of 150 residents and finds that 80 support the expansion.

She conducts a simulation: assuming $H_0: p = 0.60$, she generates 1,000 simulated sample proportions. The histogram below shows the null distribution. The red dashed line marks the observed sample proportion.

(a) State the null and alternative hypotheses using proper notation.

(b) What is the observed sample proportion $\hat{p}$? (Round to three decimals.)

Answer: _________

(c) Based on the histogram, estimate the p-value for this one-sided test. Circle one.

Less than 0.01
Between 0.01 and 0.05
Between 0.05 and 0.10
Greater than 0.10

(d) At $\alpha = 0.05$, what is the conclusion? Circle one.

Reject $H_0$. There is convincing evidence that less than 60% of residents support the expansion.
Fail to reject $H_0$. There is not enough evidence that less than 60% of residents support the expansion.

7. Bootstrap confidence interval

A researcher surveys 45 randomly selected college students and finds that 31 prefer digital textbooks over print. She uses bootstrap resampling to construct a confidence interval for the true proportion of students who prefer digital textbooks.

The bootstrap distribution of 1,000 resampled proportions is shown below:

Based on the bootstrap distribution, estimate the bounds of a 95% confidence interval for the true proportion of students who prefer digital textbooks.

(a) Lower bound (estimate to two decimals):

Answer: _________

(b) Upper bound (estimate to two decimals):

Answer: _________

8. One-proportion Z-test

A national survey reports that 45% of adults exercise at least 3 times per week. A health researcher believes the proportion among young adults (ages 18–25) is different from 45%. She surveys a random sample of $n = 200$ young adults and finds that 104 exercise at least 3 times per week.

Assume conditions for inference using a normal distribution are met.

(a) State the null and alternative hypotheses using proper notation.

(b) Compute the sample proportion $\hat{p}$. (Round to two decimals.)

Answer: _________

(c) Compute the standard error under $H_0$. (Round to four decimals.)

Answer: _________

(d) Compute the Z-score. (Round to two decimals.)

Answer: _________

(e) For this two-sided test, suppose the Z-score was 1.85. The p-value would be approximately: Circle one.

Less than 0.01
Between 0.01 and 0.05
Between 0.05 and 0.10
Greater than 0.10

(f) If the p-value is 0.032, what is the correct conclusion at $\alpha = 0.05$? Circle one.

Reject $H_0$. There is convincing evidence that the proportion of young adults who exercise at least 3 times per week differs from 45%.
Fail to reject $H_0$. There is not enough evidence that the proportion differs from 45%.
Accept $H_0$. The proportion of young adults who exercise at least 3 times per week is exactly 45%.

9. Errors in hypothesis testing

A pharmaceutical company is testing a new allergy medication. Let $p$ be the proportion of patients who experience symptom relief with the new medication. They test $H_0: p = 0.50$ vs. $H_a: p > 0.50$ at $\alpha = 0.05$.

(a) What is a Type I error in this context? Circle one.

Conclude the medication provides relief to more than 50% of patients when it actually does not.
Conclude the medication does not provide relief to more than 50% of patients when it actually does.
Fail to detect a side effect of the medication.

(b) What is a Type II error in this context? Circle one.

Conclude the medication provides relief to more than 50% of patients when it actually does not.
Conclude the medication does not provide relief to more than 50% of patients when it actually does.
Accept the null hypothesis as definitely true.

10. CI and hypothesis test

A researcher tests $H_0: \mu = 100$ vs. $H_a: \mu \neq 100$ at $\alpha = 0.05$ and constructs a 95% confidence interval of $(97.2, 103.8)$.

(a) Based on the confidence interval, should the researcher reject or fail to reject $H_0$? Circle one.

Reject $H_0$
Fail to reject $H_0$

(b) Explain your reasoning.

11. Two-way table proportions

A sports league surveys 300 fans about their preferred way to watch games. The results are broken down by age group:

Age Group	In-Person	TV Broadcast	Streaming	Row Total
18–34	25	40	55	120
35–54	35	50	25	110
55+	30	30	10	70
Column Total	90	120	90	300

(a) Among fans aged 18–34, what proportion prefer streaming? (Round to three decimals.)

Answer: _________

(b) Among fans aged 55+, what proportion prefer streaming? (Round to three decimals.)

Answer: _________

(c) Based on parts (a) and (b), does there appear to be an association between age group and viewing preference? Explain briefly.

12. Two-proportion Z-test

A school district wants to compare pass rates on a standardized reading test between two curricula. Random samples of students from schools using each curriculum yield:

Curriculum A: $n_1 = 180$, 126 students passed ($\hat{p}_1 = 0.70$)
Curriculum B: $n_2 = 160$, 96 students passed ($\hat{p}_2 = 0.60$)

Test whether there is a difference in pass rates at $\alpha = 0.05$.

Assume conditions for inference using a normal distribution are met.

(a) State the null and alternative hypotheses.

(b) Compute the pooled proportion $\hat{p}_{pool}$. (Round to four decimals.)

Answer: _________

(c) Compute the standard error for the difference in proportions. (Round to four decimals.)

Answer: _________

(d) Compute the Z-score. (Round to two decimals.)

Answer: _________

(e) If the two-sided p-value is 0.046, what is the correct conclusion at $\alpha = 0.05$? Circle one.

Reject $H_0$. There is convincing evidence of a difference in pass rates between the two curricula.
Fail to reject $H_0$. There is not enough evidence of a difference in pass rates.

13. One-sample t-test

A coffee chain claims that the average caffeine content in their medium latte is 150 mg. A consumer group suspects it may be different from 150 mg and tests a random sample of $n = 36$ lattes.

Summary statistics: $\bar{x} = 157.4$ mg, $s = 22.8$ mg

Assume conditions for inference using a t-distribution are met.

(a) State the null and alternative hypotheses.

(b) Compute the standard error. (Round to two decimals.)

Answer: _________

(c) Compute the T-statistic. (Round to two decimals.)

Answer: _________

(d) The degrees of freedom for this test is:

Answer: _________

(e) Suppose the two-sided p-value is 0.058. What is the correct conclusion at $\alpha = 0.05$? Circle one.

Reject $H_0$. There is convincing evidence that the average caffeine content differs from 150 mg.
Fail to reject $H_0$. There is not enough evidence that the average caffeine content differs from 150 mg.

14. Two-sample t confidence interval

A company tests whether a new ergonomic keyboard reduces typing errors. They randomly assign 50 employees: 25 to the new keyboard and 25 to the standard keyboard. After one month, the number of errors per day is recorded.

Summary statistics:

New keyboard: $n_1 = 25$, $\bar{x}_1 = 4.2$, $s_1 = 2.1$
Standard keyboard: $n_2 = 25$, $\bar{x}_2 = 5.8$, $s_2 = 2.5$

Construct a 95% confidence interval for the difference in population means ($\mu_1 - \mu_2$).

Assume conditions for inference using a t-distribution are met and population variances are equal.

(a) Compute the pooled standard deviation $s_p$. (Round to two decimals.)

Answer: _________

(b) Compute the standard error for the difference in means. (Round to three decimals.)

Answer: _________

(c) The degrees of freedom for this interval is:

Answer: _________

(d) Suppose the correct critical value $t^*$ for a 95% CI is approximately 2.011. Compute the margin of error. (Round to two decimals.)

Answer: _________

(e) Compute the lower and upper limits of the 95% CI.

Lower: _________ Upper: _________

15. Paired t confidence interval

A running coach measures the 5K race times (in minutes) of 20 runners before and after an 8-week training program. For each runner, the difference is computed as: $d$ = After $-$ Before.

Summary statistics for the differences:

$n = 20$
$\bar{d} = -1.8$ minutes (negative indicates improvement)
$s_d = 2.4$ minutes

Construct a 90% confidence interval for the true mean difference in race time ($\mu_d$).

Assume conditions for inference using a t-distribution are met.

(a) Compute the standard error $SE_d$. (Round to three decimals.)

Answer: _________

(b) The degrees of freedom is:

Answer: _________

(c) Suppose the correct critical value $t^*$ for a 90% CI is approximately 1.729. Compute the margin of error. (Round to two decimals.)

Answer: _________

(d) Compute the lower and upper limits of the 90% CI.

Lower: _________ Upper: _________

(e) Based on your CI, is there convincing evidence at the 10% significance level that the training program changes race times? Explain.

16. Paired or independent

For each scenario, determine whether the data should be analyzed using a paired or independent samples approach. Circle one for each.

(a) A school measures students’ math scores before and after a tutoring program.

Paired
Independent

(b) A researcher compares the average salary of nurses in Michigan versus Ohio using separate random samples from each state.

Paired
Independent

(c) A taste test records each participant’s rating for Brand A and Brand B coffee.

Paired
Independent

(d) A study compares the reaction times of 30 professional gamers and 30 non-gamers.

Paired
Independent

17. Chi-squared test of independence

A public health researcher surveys 360 adults about their preferred source of health information, broken down by education level:

Education	Doctor/Clinic	Internet	Social Media	Row Total
High School	40	30	50	120
Bachelor’s	55	35	30	120
Graduate	65	30	25	120
Column Total	160	95	105	360

The researcher wants to test whether there is an association between education level and preferred health information source.

(a) State the null and alternative hypotheses.

(b) Calculate the expected count for the “High School / Doctor” cell. (Round to two decimals.)

Answer: _________

(c) Calculate the expected count for the “Graduate / Social Media” cell. (Round to two decimals.)

Answer: _________

(d) Using your answer from part (b), calculate that cell’s contribution to the chi-squared statistic. (Round to three decimals.)

Answer: _________

(e) What are the degrees of freedom for this chi-squared test?

Answer: _________

(f) The chi-squared test statistic is $\chi^2 = 21.45$ and the p-value is 0.0003. At $\alpha = 0.05$, what is the conclusion? Circle one.

Reject $H_0$. There is convincing evidence of an association between education level and preferred health information source.
Fail to reject $H_0$. There is not enough evidence of an association.

18. Chi-squared true/false

Determine if each statement about the chi-squared test is true or false. Circle one for each.

(a) The chi-squared test can only be used for $2 \times 2$ contingency tables.

True
False

(b) If all expected counts are at least 5, the condition for the chi-squared test is satisfied.

True
False

(c) A large chi-squared statistic provides evidence against independence.

True
False

(d) The chi-squared distribution can take negative values.

True
False

19. ANOVA

A food scientist compares the crunchiness rating (on a 1–100 scale) of potato chips fried at three different temperatures. Each temperature is tested with 20 batches of chips, for a total of 60 batches. The ANOVA table below summarizes the results.

Source	df	Sum of Squares	Mean Square	F-statistic	p-value
Temperature	2	1240.8	620.4		0.0021
Residuals	57	5472.0	96.0

(a) Verify the calculation of Mean Square for Groups (MSG). Show your work.

(b) Compute the F-statistic. (Round to two decimals.)

Answer: _________

(c) State the null and alternative hypotheses. (Let $\mu_1$, $\mu_2$, and $\mu_3$ represent the mean crunchiness rating for each temperature.)

(d) At $\alpha = 0.05$, what is the appropriate decision? Circle one.

Reject $H_0$
Fail to reject $H_0$

(e) Does a significant ANOVA result tell us which specific groups differ from each other?

20. ANOVA true/false

Determine if each statement about ANOVA is true or false. Circle one for each.

(a) A significant ANOVA result means all group means are different from each other.

True
False

(b) The F-statistic is always non-negative.

True
False

(c) If the F-statistic is close to 1, this suggests the between-group variability is similar to the within-group variability.

True
False

(d) One of the conditions for ANOVA is that the populations should have roughly equal variances.

True
False

21. Bonferroni adjustment

After finding a significant ANOVA result comparing customer satisfaction scores across 4 restaurant locations (A, B, C, D), a manager performs pairwise t-tests. The unadjusted p-values are:

Comparison	Unadjusted p-value
A vs. B	0.234
A vs. C	0.003
A vs. D	0.087
B vs. C	0.018
B vs. D	0.412
C vs. D	0.006

She uses the Bonferroni method with $\alpha = 0.05$.

(a) How many pairwise comparisons are there?

Answer: _________

(b) What is the Bonferroni-adjusted significance level $\alpha^*$ for each individual test? (Round to four decimals.)

Answer: _________

(c) Using the Bonferroni method, for which comparisons should the manager reject the null hypothesis? Circle all that apply.

(i) A vs. B (ii) A vs. C (iii) A vs. D (iv) B vs. C (v) B vs. D (vi) C vs. D

(d) True or false: If ANOVA finds a significant result, that guarantees all pairwise comparisons will also be significant after Bonferroni adjustment.

True
False

22. Regression interpretation

A real estate agent studies the relationship between home size (in hundreds of square feet) and monthly electricity bill (in dollars) for 60 homes. The least squares regression line is:

\[\widehat{bill} = 28.5 + 8.4 \times size\]

where bill is in dollars and size is in hundreds of square feet. The coefficient of determination is $R^2 = 0.68$.

(a) Interpret the slope in context.

(b) Interpret the intercept in context. Does this interpretation make practical sense?

(c) Predict the monthly electricity bill for a home that is 1,800 square feet (i.e., $size = 18$).

Answer: _________

(d) Interpret $R^2 = 0.68$ in context.

23. Prediction and residual

Using the regression model from Problem 22, suppose one home is 1,500 square feet and has an actual monthly electricity bill of $165.

(a) Calculate the predicted electricity bill for this home.

Answer: _________

(b) Calculate the residual for this home.

Answer: _________

(c) Did the model overpredict or underpredict this home’s electricity bill?

Overpredict
Underpredict

(d) On a residual plot, would this point appear above or below the horizontal line at 0?

Above
Below

24. Correlation coefficients

Match each description to its most likely correlation coefficient $r$ from the list: 0.89, 0.32, -0.75, -0.05

(a) Points tightly clustered around a line sloping downward: $r =$ _________

(b) Points with almost no apparent linear pattern: $r =$ _________

(c) Points closely following a line sloping upward: $r =$ _________

(d) Points loosely scattered around a line sloping upward: $r =$ _________

25. Regression inference

A researcher studies the relationship between daily commute time (in minutes) and job satisfaction score (on a 0–100 scale) for $n = 48$ randomly selected workers. The regression output is:

Term	Estimate	Std. Error	T statistic	P-value
Intercept	81.062	2.948	27.49	< 0.001
Commute	-0.444	0.070	-6.33	< 0.001

(a) Write the equation of the least squares regression line.

(b) State the null and alternative hypotheses for testing whether there is a linear relationship between commute time and job satisfaction.

(c) Verify the T-statistic for the slope using the estimate and standard error. Show your work.

(d) What are the degrees of freedom for this test?

Answer: _________

(e) At $\alpha = 0.05$, what is the conclusion? Circle one.

Reject $H_0$. There is convincing evidence of a linear relationship between commute time and job satisfaction.
Fail to reject $H_0$. There is not enough evidence of a linear relationship.

26. Confidence interval for slope

For a regression with $n = 38$ observations, the slope estimate is $b_1 = -1.24$ with standard error $SE = 0.42$.

(a) What are the degrees of freedom?

Answer: _________

(b) Using $t^*_{36} = 2.028$, calculate the 95% confidence interval for the slope $\beta_1$.

Lower: _________ Upper: _________

(c) Interpret this confidence interval in context (assume the regression predicts test score from hours of video games played per day).

(d) Based on this confidence interval, would you reject $H_0: \beta_1 = 0$ at $\alpha = 0.05$? Explain.

27. Regression conditions

For each description of a residual plot, identify which condition for regression inference is most clearly violated. Circle one for each.

(a) The residuals show a clear curved (U-shaped) pattern.

Linearity
Independence
Normality of residuals
Constant variance

(b) The spread of the residuals increases as the predicted values increase (a “fan” shape).

Linearity
Independence
Normality of residuals
Constant variance

(c) A histogram of the residuals shows strong right skew.

Linearity
Independence
Normality of residuals
Constant variance

28. Procedure selection

For each scenario, identify the most appropriate inference procedure. Circle one for each.

(a) Testing whether there is an association between political party (Democrat, Republican, Independent) and opinion on a policy (Support, Oppose).

Chi-squared test
One-way ANOVA
Two-proportion Z-test
Simple linear regression

(b) Comparing mean blood pressure levels among patients on four different medications.

Chi-squared test
One-way ANOVA
Paired t-test
Two-sample t-test

(c) Testing whether hours of sunlight per day predicts plant growth (in cm).

Chi-squared test
One-way ANOVA
Two-sample t-test
Simple linear regression

(d) Estimating the difference in the proportion of men and women who own a pet.

Chi-squared test
One-way ANOVA
Two-proportion Z confidence interval
Two-sample t confidence interval

(e) Testing whether a new fertilizer changes the mean yield of tomato plants compared to the current fertilizer, where each plot is split in half and one half gets the new fertilizer and the other gets the current fertilizer.

Two-proportion Z-test
Paired t-test
Two-sample t-test
One-way ANOVA

(f) Testing whether the true mean customer wait time at a restaurant differs from the advertised 15 minutes.

One-proportion Z-test
One-sample t-test
Paired t-test
Chi-squared test

29. P-values in a two-sample t-test

For each change described below (while keeping everything else the same), what happens to the p-value in a two-sample t-test?

(a) The observed difference in sample means ($\bar{x}_1 - \bar{x}_2$) increases.

Smaller (stronger evidence against $H_0$)
Larger (weaker evidence against $H_0$)
It stays the same

(b) Both sample standard deviations ($s_1$ and $s_2$) increase.

Smaller (stronger evidence against $H_0$)
Larger (weaker evidence against $H_0$)
It stays the same

(c) Both sample sizes ($n_1$ and $n_2$) increase.

Smaller (stronger evidence against $H_0$)
Larger (weaker evidence against $H_0$)
It stays the same

30. Confidence interval width

For each change described below (while keeping everything else the same), what happens to the width of a confidence interval for a population mean $\mu$?

(a) The sample size ($n$) increases.

Wider
Narrower
It stays the same

(b) The sample standard deviation ($s$) increases.

Wider
Narrower
It stays the same

(c) The confidence level is increased (e.g., from 90% to 95%).

Wider
Narrower
It stays the same

31. Success-failure condition

A researcher wants to use a two-proportion Z-test to compare complication rates for two surgical methods. Method A: $n_1 = 80$, 6 complications. Method B: $n_2 = 100$, 11 complications.

Is the success-failure condition met for using the normal model?

32. Interpreting p-values

For each scenario, select the best interpretation of the result at $\alpha = 0.05$.

(a) A company tests whether a new training program improves employee performance compared to the old program. The two-sided p-value is 0.02.

There is convincing evidence that mean performance differs between the two programs.
The new program definitely improves performance.
There is a 2% chance that the programs are equally effective.
The new program improves performance by 2%.

(b) A researcher tests whether a new diet reduces cholesterol compared to a standard diet. The two-sided p-value is 0.35.

There is not enough evidence to conclude that mean cholesterol differs between the two diets.
The diets produce exactly the same cholesterol levels.
There is a 35% probability that the new diet works.
The new diet definitely does not reduce cholesterol.

Answer Key

1. Variable types

Categorical, nominal. Species are names with no natural order.
Numerical, continuous. Weight is measured and can take decimal values.
Numerical, discrete. Number of visits is a count.
Categorical, ordinal. Pain levels are categories with a natural order.

2. Meeting hours

8.4 hours. The mean is $134/16 = 8.375$, rounded to one decimal.
7 hours. The median is the average of the 8th and 9th values: $(7+7)/2 = 7$.
The mean is larger. This suggests the distribution is right-skewed because the large value 28 pulls the mean upward.
The mean changes more. Removing 28 changes the mean from 8.4 to about 7.1, while the median remains 7.
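
The arithmetic in this answer can be checked with a short script. This is a minimal sketch using only Python's standard library; the variable names are chosen here for illustration and are not part of the course materials.

```python
from statistics import mean, median

# Weekly meeting hours for the 16 employees in Problem 2.
hours = [2, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 12, 14, 28]

# The same data with the largest value (28) removed.
trimmed = [h for h in hours if h != 28]

mean_all, median_all = mean(hours), median(hours)
mean_trim, median_trim = mean(trimmed), median(trimmed)

print(f"all 16 values: mean = {mean_all:.3f}, median = {median_all}")
print(f"without 28:    mean = {mean_trim:.3f}, median = {median_trim}")
```

Comparing the two printed lines shows how much each measure of center moves when the single large value is dropped.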

3. Boxplot

About 51.5 dollars. This is the middle line of the box.
About 20.75 dollars. The IQR is $Q_3 - Q_1$.
Yes, outliers appear on both the low and high ends. Outliers are points beyond the whiskers.

4. Study design

Experiment; generalize: yes (to employees of that company); causal: yes. Employees were randomly selected from the full roster and randomly assigned to a desk type.
Observational study; generalize: yes to incoming freshmen at that university; causal: no. There is no treatment assignment.
Experiment; generalize: no; causal: yes. Random assignment supports causation, but volunteers limit generalization.

5. Confounding variables

Temperature / hot weather is a likely confounding variable.
Hotter days can increase both the number of ice cream trucks and heat-related hospital visits, so temperature provides an alternative explanation for the association.

6. One-proportion simulation test

$H_0: p = 0.60$; $H_a: p < 0.60$.
0.533. The observed proportion is $\hat{p} = 80/150 \approx 0.533$.
Between 0.05 and 0.10. The observed value is somewhat low, but not extremely far into the left tail.
Fail to reject $H_0$. There is not enough evidence at $\alpha = 0.05$ that less than 60% support the expansion.
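
For readers who want to see how a null distribution like the one in this problem can be generated, here is a minimal sketch in Python. The function name `simulate_null_p_value` and the seed are assumptions made for illustration; the journalist's actual simulation tool is not specified, and a simulated p-value changes slightly from run to run.

```python
import random

def simulate_null_p_value(n, p0, observed_count, reps, seed=115):
    """Estimate a left-tailed p-value by simulating samples under H0: p = p0."""
    rng = random.Random(seed)  # the fixed seed is an arbitrary choice for repeatability
    at_or_below = 0
    for _ in range(reps):
        # One simulated sample: count the "supporters" among n residents when p = p0.
        count = sum(rng.random() < p0 for _ in range(n))
        # Left tail: simulated results at least as low as the observed count.
        if count <= observed_count:
            at_or_below += 1
    return at_or_below / reps

# Values from Problem 6: n = 150, H0: p = 0.60, 80 supporters, 1,000 simulations.
p_value = simulate_null_p_value(n=150, p0=0.60, observed_count=80, reps=1000)
print(f"simulated one-sided p-value: {p_value:.3f}")
```

The estimate is the fraction of simulated samples that fall at or below the observed count, which is the area to the left of the red dashed line in the histogram.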

7. Bootstrap confidence interval

About 0.56. This is the lower 2.5th percentile of the bootstrap distribution.
About 0.82. This is the upper 97.5th percentile of the bootstrap distribution.
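
The percentile method behind these two estimates can be sketched in Python as follows. The function name `bootstrap_percentile_ci` and the seed are illustrative assumptions, not part of the problem, and the endpoints shift slightly with each new set of resamples.

```python
import random

def bootstrap_percentile_ci(successes, n, reps=1000, level=0.95, seed=115):
    """Percentile bootstrap CI for one proportion (an illustrative sketch)."""
    rng = random.Random(seed)  # arbitrary seed so the example is repeatable
    # Rebuild the original sample as 1s (prefers digital) and 0s (prefers print).
    sample = [1] * successes + [0] * (n - successes)
    # Resample n values with replacement, reps times, recording each proportion.
    props = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(reps))
    tail = (1 - level) / 2
    lower = props[round(tail * reps)]            # about the 2.5th percentile
    upper = props[round((1 - tail) * reps) - 1]  # about the 97.5th percentile
    return lower, upper

# Values from Problem 7: 31 of 45 students prefer digital textbooks.
lower, upper = bootstrap_percentile_ci(successes=31, n=45)
print(f"95% bootstrap CI: ({lower:.2f}, {upper:.2f})")
```

Reading the same two percentiles off the plotted bootstrap distribution is what parts (a) and (b) ask for.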

8. One-proportion Z-test

$H_0: p = 0.45$; $H_a: p \neq 0.45$.
0.52. $\hat{p} = 104/200 = 0.52$.
0.0352. $SE = \sqrt{0.45(1-0.45)/200} \approx 0.0352$.
1.99. $Z = (0.52-0.45)/0.0352 \approx 1.99$.
Between 0.05 and 0.10. A two-sided test with $Z = 1.85$ gives a p-value of about 0.064.
Reject $H_0$. Since $0.032 < 0.05$, there is convincing evidence the proportion differs from 45%.
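
The calculations in parts (b) through (e) can be reproduced with the short Python sketch below, which uses `statistics.NormalDist` for normal-curve areas. Variable names are illustrative. Note that parts (e) and (f) use hypothetical values (a Z-score of 1.85 and a p-value of 0.032), so they are not expected to match the p-value computed from the data.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.45        # null value from H0
n = 200          # sample size
successes = 104  # young adults who exercise at least 3 times per week

p_hat = successes / n
se_null = sqrt(p0 * (1 - p0) / n)  # the SE uses p0 (not p_hat) because H0 is assumed true
z = (p_hat - p0) / se_null

# Two-sided p-value: area in both tails beyond |z| under the standard normal curve.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# Part (e) asks about a hypothetical Z-score of 1.85.
p_for_1_85 = 2 * (1 - NormalDist().cdf(1.85))

print(f"p_hat = {p_hat:.2f}, SE = {se_null:.4f}, Z = {z:.2f}")
print(f"two-sided p-value for Z = 1.85: {p_for_1_85:.3f}")
```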

9. Errors in hypothesis testing

(i). A Type I error is rejecting a true null: concluding more than 50% get relief when that is not true.
(ii). A Type II error is failing to reject a false null: not concluding more than 50% get relief when that is true.

10. CI and hypothesis test

Fail to reject $H_0$.
The hypothesized value 100 is inside the 95% CI $(97.2, 103.8)$, so it is a plausible value at the 5% level.

11. Two-way table proportions

0.458. Among 18–34 fans, $55/120 \approx 0.458$.
0.143. Among 55+ fans, $10/70 \approx 0.143$.
Yes. Streaming preference is much higher for 18–34 fans than for 55+ fans, so age group and viewing preference appear associated.

12. Two-proportion Z-test

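$H_0: p_A - p_B = 0$; $H_a: p_A - p_B \neq 0$.
0.6529. $\hat{p}_{pool} = (126+96)/(180+160) = 222/340 \approx 0.6529$.
0.0517. Use the pooled SE formula: $SE = \sqrt{\hat{p}_{pool}(1-\hat{p}_{pool})(1/n_1 + 1/n_2)}$.
1.93. $Z = (0.70-0.60)/0.0517 \approx 1.93$.
Reject $H_0$. Since $0.046 < 0.05$, there is convincing evidence of a difference in pass rates.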
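
The pooled calculation in parts (b) through (d) can be checked with this minimal Python sketch (variable names are illustrative):

```python
from math import sqrt

# Counts from Problem 12.
x1, n1 = 126, 180  # Curriculum A: students who passed, sample size
x2, n2 = 96, 160   # Curriculum B: students who passed, sample size

p1_hat, p2_hat = x1 / n1, x2 / n2

# Under H0 the two pass rates are equal, so both samples are pooled
# to estimate the single common proportion.
p_pool = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se_pooled

print(f"p_pool = {p_pool:.4f}, SE = {se_pooled:.4f}, Z = {z:.2f}")
```

The pooled proportion appears in the SE only because this is a hypothesis test; a confidence interval for the difference would use the separate sample proportions instead.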

13. One-sample t-test

$H_0: \mu = 150$; $H_a: \mu \neq 150$.
3.80. $SE = 22.8/\sqrt{36} = 3.80$.
1.95. $T = (157.4-150)/3.80 \approx 1.95$.
35. Degrees of freedom are $n - 1 = 36 - 1 = 35$.
Fail to reject $H_0$. Since $0.058 > 0.05$, there is not enough evidence that the mean differs from 150 mg.
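
A minimal Python sketch of parts (b) through (d) is shown below. Python's standard library has no t-distribution, so this sketch stops at the T-statistic and degrees of freedom and does not compute the p-value. Variable names are illustrative.

```python
from math import sqrt

mu0 = 150.0    # claimed mean caffeine content (mg) under H0
n = 36         # number of lattes sampled
x_bar = 157.4  # sample mean (mg)
s = 22.8       # sample standard deviation (mg)

se = s / sqrt(n)             # standard error of the sample mean
t_stat = (x_bar - mu0) / se  # how many SEs the sample mean lies from 150
df = n - 1                   # degrees of freedom for a one-sample t-test

print(f"SE = {se:.2f}, T = {t_stat:.2f}, df = {df}")
```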

14. Two-sample t confidence interval

2.31. This is the pooled standard deviation.
0.653. $SE=s_p\sqrt{1/25+1/25}=2.31\sqrt{2/25}\approx 0.653$.
48. Degrees of freedom are $25+25-2=48$.
1.31. $ME=2.011(0.653)\approx 1.31$.
Lower: -2.91; Upper: -0.29. The point estimate is (4.2-5.8=-1.6).
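
The interval can be rebuilt step by step in Python. This sketch takes the pooled standard deviation (2.31), the group sizes (25 and 25), the critical value (2.011), and the two sample means (4.2 and 5.8) from the answer; the variable names are illustrative assumptions.

```python
from math import sqrt

s_p = 2.31            # pooled standard deviation (from the answer)
n1, n2 = 25, 25       # group sizes
t_star = 2.011        # t critical value for 95% confidence with df = 48 (from the answer)
diff = 4.2 - 5.8      # point estimate: difference in sample means

se = s_p * sqrt(1 / n1 + 1 / n2)   # pooled SE of the difference in means
me = t_star * se                   # margin of error
lower, upper = diff - me, diff + me

print(f"SE={se:.3f}  ME={me:.2f}  CI=({lower:.2f}, {upper:.2f})")
```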

15. Paired t confidence interval

0.537. $SE_d=2.4/\sqrt{20}\approx 0.537$.
19. Degrees of freedom are $20-1=19$.
0.93. $ME=1.729(0.537)\approx 0.93$.
Lower: -2.73; Upper: -0.87.
Yes. The 90% CI does not contain 0, so there is evidence at the 10% level that the program changes race times; because the interval is negative, it suggests improvement.

16. Paired or independent

Paired. Same students before and after.
Independent. Separate samples from two states.
Paired. Each participant rates both brands.
Independent. Gamers and non-gamers are separate groups.

17. Chi-squared test of independence

$H_0$: education level and preferred source are independent. $H_a$: they are associated.
53.33. Expected count $=120(160)/360\approx 53.33$.
35.00. Expected count $=120(105)/360=35.00$.
3.333. Contribution $=(40-53.33)^2/53.33\approx 3.333$.
4. Degrees of freedom are $(3-1)(3-1)=4$.
Reject (H_0). Since 0.0003 < 0.05, there is convincing evidence of an association.
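
The expected counts and the single-cell contribution follow the usual rule (row total times column total, divided by the grand total). The Python sketch below uses the totals that appear in the answer; which of 120 and 160 (or 105) is the row total is not shown here, and the product is the same either way. The function names are hypothetical.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count under independence."""
    return row_total * col_total / grand_total

def chi_sq_contribution(observed, expected):
    """One cell's contribution to the chi-squared statistic."""
    return (observed - expected) ** 2 / expected

e1 = expected_count(120, 160, 360)
e2 = expected_count(120, 105, 360)
c1 = chi_sq_contribution(40, e1)   # observed count 40 comes from the answer

print(f"E1={e1:.2f}  E2={e2:.2f}  contribution={c1:.3f}")
```

The full chi-squared statistic is the sum of these contributions over all nine cells of the 3-by-3 table.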

18. Chi-squared true/false

False. Chi-squared tests can be used for larger tables.
True. Expected counts of at least 5 satisfy the usual condition.
True. A large chi-squared statistic indicates observed counts are far from expected counts under independence.
False. Chi-squared values are never negative.

19. ANOVA

MSG = 1240.8 / 2 = 620.4. Mean square equals sum of squares divided by df.
6.46. (F=620.4/96.0).
$H_0: \mu_1=\mu_2=\mu_3$. $H_a$: at least one mean differs.
Reject (H_0). Since 0.0021 < 0.05, there is evidence that not all means are equal.
No. ANOVA indicates at least one difference, but follow-up comparisons are needed to identify which groups differ.
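
The mean square and F-statistic are simple ratios, sketched here in Python with the values from the answer (group sum of squares 1240.8 on 2 degrees of freedom, and error mean square 96.0). The variable names are illustrative assumptions.

```python
ss_groups = 1240.8   # between-group sum of squares (from the answer)
df_groups = 2        # number of groups minus 1
ms_error = 96.0      # within-group (error) mean square (from the answer)

ms_groups = ss_groups / df_groups   # MSG: between-group mean square
f_stat = ms_groups / ms_error       # F compares between-group to within-group variability

print(f"MSG={ms_groups:.1f}  F={f_stat:.2f}")
```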

20. ANOVA true/false

False. A significant result means at least one mean differs, not necessarily all.
True. The F-statistic is a ratio of variances and is non-negative.
True. $F\approx 1$ means between-group and within-group variability are similar.
True. Roughly equal variances is one ANOVA condition.

21. Bonferroni adjustment

6. There are six pairwise comparisons.
0.0083. $\alpha^*=0.05/6\approx 0.0083$.
A vs. C and C vs. D. Their p-values, 0.003 and 0.006, are below 0.0083.
False. A significant ANOVA does not guarantee all adjusted pairwise comparisons are significant.
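
The Bonferroni steps can be sketched in Python. The group labels are an assumption: six pairwise comparisons imply four groups, taken here to be A through D. Only the two p-values quoted in the answer are included, since the others are not shown.

```python
from math import comb

groups = ["A", "B", "C", "D"]          # assumed labels; four groups give six pairs
num_comparisons = comb(len(groups), 2) # number of pairwise comparisons
alpha_star = 0.05 / num_comparisons    # Bonferroni-adjusted significance level

# Only the p-values quoted in the answer; the remaining four are not shown there.
reported = {"A vs. C": 0.003, "C vs. D": 0.006}
significant = [pair for pair, p in reported.items() if p < alpha_star]

print(num_comparisons, round(alpha_star, 4), significant)
```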

22. Regression interpretation

Slope: For each additional 100 square feet of home size, the predicted monthly electricity bill increases by $8.40, on average.
Intercept: A home with 0 square feet of floor space has a predicted monthly bill of $28.50. This does not make practical sense because a 0-square-foot home is not realistic.
$179.70. (28.5+8.4(18)=179.7).
68% of the variability in monthly electricity bills is explained by the linear relationship with home size.

23. Prediction and residual

$154.50. (28.5+8.4(15)=154.5).
$10.50. Residual (=165-154.5).
Underpredict. The actual bill is higher than predicted.
Above. Positive residuals appear above 0.
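
The predictions and the residual come straight from the fitted line. The Python sketch below assumes the intercept 28.5 and slope 8.4 used in the answers, with home size measured in hundreds of square feet; the function name `predicted_bill` is hypothetical.

```python
def predicted_bill(size_hundreds_sqft, intercept=28.5, slope=8.4):
    """Predicted monthly bill (dollars) for a home size given in hundreds of square feet."""
    return intercept + slope * size_hundreds_sqft

pred_18 = predicted_bill(18)     # 1,800 square feet
pred_15 = predicted_bill(15)     # 1,500 square feet
residual = 165 - pred_15         # residual = observed - predicted

print(f"{pred_18:.2f}  {pred_15:.2f}  residual={residual:.2f}")
```

A positive residual means the observed bill sits above the regression line, so the model underpredicts for that home.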

24. Correlation coefficients

-0.75. Strong negative linear relationship.
-0.05. Almost no linear relationship.
0.89. Strong positive linear relationship.
0.32. Weak positive linear relationship.

25. Regression inference

$\hat{y} = 81.062 - 0.444\,(\text{commute})$.
$H_0: \beta_1=0$; $H_a: \beta_1\neq 0$.
(t = -0.444 / 0.07 = -6.33). The t-statistic is estimate divided by standard error.
46. Degrees of freedom are (48-2).
Reject $H_0$. The p-value for the slope is $9.12\times 10^{-8}$, which is below 0.05, so there is evidence of a linear relationship.

26. Confidence interval for slope

36. Degrees of freedom are (38-2).
Lower: -2.09; Upper: -0.39. The interval is $-1.24\pm 2.028(0.42)$.
We are 95% confident that each additional hour of video games per day is associated with a decrease of between 0.39 and 2.09 test-score points, on average.
Yes, reject $H_0: \beta_1=0$. The entire interval is below 0, so 0 is not a plausible slope value at the 5% level.
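
The slope interval can be checked with a short Python sketch. The slope estimate (-1.24) and its standard error (0.42) come from the answer; the critical value 2.028 is an assumption here, taken as the t critical value for 95% confidence with 36 degrees of freedom.

```python
b1 = -1.24        # estimated slope (from the answer)
se_b1 = 0.42      # standard error of the slope (from the answer)
t_star = 2.028    # assumed t critical value for 95% confidence, df = 36

me = t_star * se_b1               # margin of error
lower, upper = b1 - me, b1 + me

# If 0 lies outside the interval, H0: slope = 0 is rejected at the 5% level.
zero_is_plausible = lower <= 0 <= upper

print(f"CI=({lower:.2f}, {upper:.2f})  zero plausible: {zero_is_plausible}")
```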

27. Regression conditions

Linearity. A U-shaped residual pattern suggests the relationship is not linear.
Constant variance. A fan shape means the residual spread changes.
Normality of residuals. A strongly skewed residual histogram violates normality.

28. Procedure selection

Chi-squared test. Two categorical variables.
One-way ANOVA. Compare means across four groups.
Simple linear regression. Quantitative predictor and quantitative response.
Two-proportion Z confidence interval. Estimate a difference in two proportions.
Paired t-test. Each plot receives both fertilizers in matched halves.
One-sample t-test. Compare one mean wait time to 15 minutes.

29. P-values in a two-sample t-test

Smaller. A larger observed difference gives stronger evidence against (H_0).
Larger. Larger standard deviations increase the SE and weaken evidence.
Smaller. Larger sample sizes reduce the SE and strengthen evidence, all else equal.

30. Confidence interval width

Narrower. Larger (n) reduces the standard error.
Wider. Larger (s) increases the standard error.
Wider. A higher confidence level uses a larger critical value.

31. Success-failure condition

No. For Method A, there are only 6 complications, which is fewer than 10. The normal approximation condition is not met, even though the non-complication counts are large.

32. Interpreting p-values

(i). Since 0.02 < 0.05, there is convincing evidence that mean performance differs between programs.
(i). Since 0.35 > 0.05, there is not enough evidence to conclude the mean cholesterol differs between diets.