Math 115 Test 3 Practice Exercises

1. A company wants to know whether office layout affects employee productivity. The company randomly assigns 60 employees: 30 to an open-plan office and 30 to private offices. After three months, productivity is measured on a 0–100 scale. The results are summarized below.

                              Open-Plan   Private
Sample size (\(n\))           30          30
Sample mean (\(\bar{x}\))     68.5        74.2
Sample SD (\(s\))             9.0         8.5

Assume conditions for inference with a t-distribution are met and that the populations have equal variances.


(a) Which pair of hypotheses is appropriate for testing whether there is a difference in mean productivity? Circle one.

  1. \(H_0: \mu_1 - \mu_2 = 0\) vs. \(H_A: \mu_1 - \mu_2 \neq 0\)

  2. \(H_0: \mu_1 - \mu_2 = 0\) vs. \(H_A: \mu_1 - \mu_2 < 0\)

  3. \(H_0: \bar{x}_1 - \bar{x}_2 = 0\) vs. \(H_A: \bar{x}_1 - \bar{x}_2 \neq 0\)

  4. \(H_0: \bar{x}_1 = \bar{x}_2\) vs. \(H_A: \bar{x}_1 < \bar{x}_2\)

(b) Calculate the pooled standard deviation (\(s_p\)). (Round to three decimal places.)

Answer: _________

(c) Calculate the standard error. (Round to three decimal places.)

Answer: _________

(d) What are the degrees of freedom?

Answer: _________

(e) Calculate the T-statistic. (Round to two decimal places.)

Answer: _________

(f) Use Jamovi to find the p-value. (Round to three decimal places.)

Answer: _________

(g) At \(\alpha = 0.05\), what is the appropriate decision? Circle one.

  1. Reject \(H_0\); there is convincing evidence that office layout causes a difference in mean productivity.

  2. Fail to reject \(H_0\); there is not convincing evidence that office layout causes a difference in mean productivity.

  3. Accept \(H_0\); there is convincing evidence that office layout has no effect on productivity.


2. A university wants to estimate the difference in weekly study hours between STEM and humanities students. Random samples of 40 STEM students and 40 humanities students are surveyed. The results are summarized below.

                              STEM    Humanities
Sample size (\(n\))           40      40
Sample mean (\(\bar{x}\))     18.4    14.8
Sample SD (\(s\))             6.2     5.8

Construct a 95% confidence interval for the difference in population means (\(\mu_{\text{STEM}} - \mu_{\text{Humanities}}\)). Assume conditions for inference with a t-distribution are met and that the populations have equal variances.


(a) Calculate the pooled standard deviation (\(s_p\)). (Round to three decimal places.)

Answer: _________

(b) Calculate the standard error. (Round to three decimal places.)

Answer: _________

(c) What are the degrees of freedom?

Answer: _________

(d) What is \(t^*\) for a 95% confidence interval? Use Jamovi. (Round to three decimal places.)

Answer: _________

(e) Calculate the lower bound of the 95% confidence interval. (Round to two decimal places.)

Answer: _________

(f) Calculate the upper bound of the 95% confidence interval. (Round to two decimal places.)

Answer: _________


3. For each scenario, determine whether the data should be analyzed using a paired or independent samples approach. Circle one for each.

(a) 20 runners each run a timed mile at sea level and then at high altitude, and the two times are compared.

Paired / Independent

(b) A company compares the salaries of 50 marketing employees to 45 engineering employees.

Paired / Independent

(c) A doctor measures the blood pressure of 35 patients before and after they begin a new medication.

Paired / Independent

(d) A professor compares final exam scores from two different sections of the same course taught in different semesters.

Paired / Independent


4. An airline tests a fatigue management training program for pilots. 25 pilots have their reaction time measured before and after the training. The differences are computed as \(d\) = After \(-\) Before.

\(n\)    \(\bar{x}_d\)    \(s_d\)
25       -8.4 ms          15 ms

Construct a 95% confidence interval for the true mean difference in reaction time (\(\mu_d\)). Assume conditions for inference with a t-distribution are met.


(a) Calculate the standard error. (Round to one decimal place.)

Answer: _________

(b) What are the degrees of freedom?

Answer: _________

(c) What is \(t^*\) for a 95% confidence interval? Use Jamovi. (Round to three decimal places.)

Answer: _________

(d) Calculate the lower bound of the 95% confidence interval. (Round to two decimal places.)

Answer: _________

(e) Calculate the upper bound of the 95% confidence interval. (Round to two decimal places.)

Answer: _________

(f) Based on this confidence interval, is there evidence that the training changed pilots’ reaction time? Circle one.

  1. Yes — the interval does not contain 0, so there is evidence the training changed reaction time.

  2. No — the interval contains 0, so there is not sufficient evidence of a change.

  3. Yes — the sample mean difference (\(\bar{x}_d\) = -8.4) is not 0, so there is evidence of a change.

  4. No — a confidence interval only estimates the mean difference and cannot be used to determine if there was a change.


5. For each study design below, suppose the hypothesis test found a statistically significant result. Select the strongest valid conclusion. Circle one for each.

(a) A pharmaceutical company randomly selects patients from hospital records across the country and randomly assigns them to receive either a new drug or a placebo.

  1. There is an association between the treatment and the outcome, and this result can be generalized to the broader population.

  2. The treatment causes a difference in the outcome, and this result can be generalized to the broader population.

  3. The treatment causes a difference in the outcome, but only for the participants in this study.

  4. There is an association between the treatment and the outcome, but only for the participants in this study.

(b) A university randomly selects students from enrollment records and surveys them about their sleep habits and GPA. (No random assignment to groups.)

  1. There is an association between sleep habits and GPA, and this result can be generalized to the broader student population.

  2. Sleep habits cause a difference in GPA, and this result can be generalized to the broader student population.

  3. Sleep habits cause a difference in GPA, but only for the participants in this study.

  4. There is an association between sleep habits and GPA, but only for the participants in this study.

(c) A fitness instructor recruits volunteers from her gym and randomly assigns half to a new workout program and half to the standard program.

  1. There is an association between the workout program and the outcome, and this result can be generalized to the broader population.

  2. The workout program causes a difference in the outcome, and this result can be generalized to the broader population.

  3. The workout program causes a difference in the outcome, but only for the participants in this study.

  4. There is an association between the workout program and the outcome, but only for the participants in this study.


6. For each scenario, identify the condition for inference that is most clearly violated. Circle one for each.

(a) A researcher conducts a paired t-test on the differences from \(n = 10\) paired observations. A histogram of the differences shows strong left skew.

  1. Independence

  2. Normality (nearly normal condition)

  3. Equal variance

  4. Randomization

(b) A researcher conducts a pooled two-sample t-test comparing two groups. One group has a standard deviation of \(s_1 = 3.2\) and the other has \(s_2 = 11.8\).

  1. Independence

  2. Normality (nearly normal condition)

  3. Equal variance

  4. Randomization


7. Answer the following questions about how changing one factor (while keeping everything else the same) affects the results of inference procedures. Circle one for each.

(a) In a one-sample t-test, if the sample size (\(n\)) increases (while the sample mean and standard deviation stay the same), what happens to the p-value?

  1. Smaller (stronger evidence against \(H_0\))
  2. Larger (weaker evidence against \(H_0\))
  3. It stays the same

(b) In a two-sample t confidence interval, if the confidence level is increased (e.g., from 90% to 99%), what happens to the width of the interval?

  1. Wider
  2. Narrower
  3. It stays the same

(c) In a paired t-test, if the standard deviation of the differences (\(s_d\)) decreases, what happens to the p-value?

  1. Smaller (stronger evidence against \(H_0\))
  2. Larger (weaker evidence against \(H_0\))
  3. It stays the same

(d) In a two-proportion Z-test, if both sample sizes are doubled (while the sample proportions stay the same), what happens to the standard error?

  1. Increases
  2. Decreases
  3. It stays the same

8. Match each scenario to the most appropriate inference procedure. Circle one for each.

(a) A researcher randomly assigns 80 participants to Diet A or Diet B and compares their mean weight loss after 12 weeks.

  1. One-sample t-test
  2. Paired t-test
  3. Two-sample t-test
  4. Two-proportion Z-test

(b) A polling firm wants to estimate the difference in the proportion of men and women who support a policy proposal.

  1. One-sample t confidence interval
  2. Paired t confidence interval
  3. Two-sample t confidence interval
  4. Two-proportion confidence interval

(c) A city planner wants to test whether the average commute time in her city differs from the national average of 25 minutes.

  1. One-sample t-test
  2. Paired t-test
  3. Two-sample t-test
  4. Two-proportion Z-test

(d) A doctor measures each patient’s cholesterol before and after prescribing a new medication and wants to estimate the mean change.

  1. One-sample t confidence interval
  2. Paired t confidence interval
  3. Two-sample t confidence interval
  4. Two-proportion confidence interval

9. A public health researcher surveys 360 adults about their preferred source of health information, broken down by education level:

Education       Doctor/Clinic   Internet   Social Media   Row Total
High School     40              30         50             120
Bachelor’s      55              35         30             120
Graduate        65              30         25             120
Column Total    160             95         105            360

The researcher wants to test whether there is an association between education level and preferred health information source.

(a) State the null and alternative hypotheses.



(b) Calculate the expected count for the “High School / Doctor” cell. (Round to two decimals.)

Answer: _________

(c) Calculate the expected count for the “Graduate / Social Media” cell. (Round to two decimals.)

Answer: _________

(d) Using your answer from part (b), calculate that cell’s contribution to the chi-squared statistic. (Round to three decimals.)

Answer: _________

(e) What are the degrees of freedom for this chi-squared test?

Answer: _________

(f) The chi-squared test statistic is \(\chi^2 = 21.45\) and the p-value is 0.0003. At \(\alpha = 0.05\), what is the conclusion? Circle one.

  1. Reject \(H_0\). There is convincing evidence of an association between education level and preferred health information source.
  2. Fail to reject \(H_0\). There is not enough evidence of an association.

10. Determine if each statement about the chi-squared test is true or false. Circle one for each.

(a) The chi-squared test can only be used for \(2 \times 2\) contingency tables.

  1. True
  2. False

(b) If all expected counts are at least 5, the condition for the chi-squared test is satisfied.

  1. True
  2. False

(c) A large chi-squared statistic provides evidence against independence.

  1. True
  2. False

(d) The chi-squared distribution can take negative values.

  1. True
  2. False

11. A food scientist compares the crunchiness rating (on a 1–100 scale) of potato chips fried at three different temperatures. Each temperature is tested with 20 batches of chips, for a total of 60 batches. The ANOVA table below summarizes the results.

Source        df   Sum of Squares   Mean Square   F-statistic   p-value
Temperature   2    1240.8           620.4                       0.0021
Residuals     57   5472.0           96.0

(a) Verify the calculation of Mean Square for Groups (MSG). Show your work.



(b) Compute the F-statistic. (Round to two decimals.)

Answer: _________

(c) State the null and alternative hypotheses. (Let \(\mu_1\), \(\mu_2\), and \(\mu_3\) represent the mean crunchiness rating for each temperature.)



(d) At \(\alpha = 0.05\), what is the appropriate decision? Circle one.

  1. Reject \(H_0\)
  2. Fail to reject \(H_0\)

(e) Does a significant ANOVA result tell us which specific groups differ from each other?

  1. Yes
  2. No

12. Determine if each statement about ANOVA is true or false. Circle one for each.

(a) A significant ANOVA result means all group means are different from each other.

  1. True
  2. False

(b) The F-statistic is always non-negative.

  1. True
  2. False

(c) If the F-statistic is close to 1, this suggests the between-group variability is similar to the within-group variability.

  1. True
  2. False

(d) One of the conditions for ANOVA is that the populations should have roughly equal variances.

  1. True
  2. False

13. After finding a significant ANOVA result comparing customer satisfaction scores across 4 restaurant locations (A, B, C, D), a manager performs pairwise t-tests. The unadjusted p-values are:

Comparison   Unadjusted p-value
A vs. B      0.234
A vs. C      0.003
A vs. D      0.087
B vs. C      0.018
B vs. D      0.412
C vs. D      0.006

She uses the Bonferroni method with \(\alpha = 0.05\).

(a) How many pairwise comparisons are there?

Answer: _________

(b) What is the Bonferroni-adjusted significance level \(\alpha^*\) for each individual test? (Round to four decimals.)

Answer: _________

(c) Using the Bonferroni method, for which comparisons should the manager reject the null hypothesis? Circle all that apply.

  (i) A vs. B   (ii) A vs. C   (iii) A vs. D   (iv) B vs. C   (v) B vs. D   (vi) C vs. D

(d) True or false: If ANOVA finds a significant result, that guarantees all pairwise comparisons will also be significant after Bonferroni adjustment.

  1. True
  2. False

14. Researchers conducted a study to compare mean 30-day recovery scores for patients assigned to one of four post-discharge care programs: Standard Care, Telehealth Follow-up, Nurse Coaching, and Intensive Rehab. Higher recovery scores indicate better health status after discharge.

A one-way ANOVA was conducted in jamovi. The output is shown below.

Cases          Sum of Squares   df    Mean Square   F      p
Care Program   1284.6           3     428.2         9.84   < 0.001
Residuals      6789.4           156   43.52

(a) How many pairwise comparisons are needed to compare all four care programs?

Answer: _________

(b) Which of the following gives the correct hypotheses for the ANOVA test?

(i) \(H_0\): All sample mean recovery scores are equal;
\(H_A\): All sample mean recovery scores are different

(ii) \(H_0\): All population mean recovery scores are equal;
\(H_A\): At least one population mean recovery score differs

(iii) \(H_0\): At least one population mean recovery score differs;
\(H_A\): All population mean recovery scores are equal

(iv) \(H_0\): The population standard deviations are equal;
\(H_A\): At least one population standard deviation differs

Answer: _________

(c) Based on the ANOVA output, is there evidence that mean recovery score differs across care programs at \(\alpha=0.05\)?

(i) Yes

(ii) No

Answer: _________

Because the ANOVA is statistically significant, the researchers perform a Tukey follow-up procedure. The jamovi output is shown below.

Comparison                                 Mean Difference   SE     df    t       \(p_{\text{Tukey}}\)
Standard Care vs Telehealth Follow-up      -1.10             1.48   156   -0.74   0.879
Standard Care vs Nurse Coaching            -5.90             1.48   156   -3.99   < 0.001
Standard Care vs Intensive Rehab           -8.20             1.48   156   -5.54   < 0.001
Telehealth Follow-up vs Nurse Coaching     -4.80             1.48   156   -3.24   0.007
Telehealth Follow-up vs Intensive Rehab    -7.10             1.48   156   -4.80   < 0.001
Nurse Coaching vs Intensive Rehab          -2.30             1.48   156   -1.55   0.410

(d) List all pairs that are significantly different according to Tukey’s method.

Answer: _________




Answer Key

1

(a) Correct answer: (i)

(b) \(s_p = \sqrt{\frac{(30-1)(9.0)^2 + (30-1)(8.5)^2}{30+30-2}} = 8.754\)

(c) \(SE = s_p\sqrt{\frac{1}{30}+\frac{1}{30}} = 2.260\)

(d) \(df = 30+30-2 = 58\)

(e) \(t = \frac{68.5-74.2}{2.260} \approx -2.52\)

(f) p-value \(\approx\) 0.014

(g) Correct answer: (i)

Since p = 0.014 < 0.05, we reject \(H_0\). Because employees were randomly assigned to the two office layouts, this is an experiment, so the causal wording is valid: there is convincing evidence that office layout causes a difference in mean productivity.
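The arithmetic in parts (b) through (e) can be checked with a short script. This is a minimal sketch in Python using only the standard library; the helper name `pooled_t_summary` is hypothetical and not part of the course materials, and the p-value in part (f) still has to come from Jamovi.

```python
import math

def pooled_t_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled two-sample t quantities from summary statistics.

    Assumes equal population variances, as stated in the problem.
    """
    df = n1 + n2 - 2
    # Pooled SD: square root of the df-weighted average of the sample variances
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    # Standard error of the difference in sample means
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    # Test statistic for H0: mu1 - mu2 = 0
    t = (mean1 - mean2) / se
    return sp, se, df, t

# Problem 1: group 1 = open-plan, group 2 = private
sp, se, df, t = pooled_t_summary(30, 68.5, 9.0, 30, 74.2, 8.5)
print(round(sp, 3), round(se, 3), df, round(t, 2))
```

The rounded values should agree with the key's answers for parts (b) through (e).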


2

(a) 6.003

(b) 1.342

(c) 78

(d) 1.991

(e) Lower bound:

\((3.6) - 1.991(1.342) \approx 0.93\)

(f) Upper bound:

\((3.6) + 1.991(1.342) \approx 6.27\)
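As a check on the interval, the following sketch recomputes the bounds from the summary statistics. It assumes the critical value \(t^* = 1.991\) reported above (from Jamovi, \(df = 78\)) rather than computing it, and the variable names are illustrative only.

```python
import math

# Summary statistics from Problem 2
n1, mean1, sd1 = 40, 18.4, 6.2   # STEM
n2, mean2, sd2 = 40, 14.8, 5.8   # Humanities
t_star = 1.991                   # assumed: taken from the key, not computed here

df = n1 + n2 - 2
# Pooled SD and standard error (equal variances assumed, as in the problem)
sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
se = sp * math.sqrt(1 / n1 + 1 / n2)

# Confidence interval: point estimate plus or minus margin of error
margin = t_star * se
lower = (mean1 - mean2) - margin
upper = (mean1 - mean2) + margin
print(round(lower, 2), round(upper, 2))
```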


3

(a) Paired

(b) Independent

(c) Paired

(d) Independent


4

(a) \(SE = \frac{15.0}{\sqrt{25}} = 3.0\)

(b) 24

(c) 2.064

(d) Lower bound:

\(-8.4 - 2.064(3.0) \approx -14.59\)

(e) Upper bound:

\(-8.4 + 2.064(3.0) \approx -2.21\)

(f) Correct answer: (i)

Because the interval does not contain 0, there is evidence that the training changed reaction time.
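The paired interval and the check for 0 can be reproduced with a few lines of Python. This is a sketch under the assumption that \(t^* = 2.064\) is taken from the key (Jamovi, \(df = 24\)); the function name `paired_t_interval` is a hypothetical helper.

```python
import math

def paired_t_interval(n, mean_diff, sd_diff, t_star):
    """Confidence interval for the mean of paired differences."""
    se = sd_diff / math.sqrt(n)      # standard error of the mean difference
    margin = t_star * se             # margin of error
    return se, mean_diff - margin, mean_diff + margin

# Problem 4: d = After - Before, in milliseconds
se, lower, upper = paired_t_interval(25, -8.4, 15.0, 2.064)

# The interval gives evidence of a change only if it excludes 0
contains_zero = lower <= 0 <= upper
print(se, round(lower, 2), round(upper, 2), contains_zero)
```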


5

(a) (ii)

Random sampling together with random assignment supports both causation and generalization.

(b) (i)

Random sampling without random assignment supports generalization, but not causation.

(c) (iii)

Random assignment without random sampling supports causation, but not generalization to a broader population.


6

(a) (ii) Normality (nearly normal condition)

With a small sample of paired differences and strong skew, the nearly normal condition is the most obvious concern.

(b) (iii) Equal variance

The sample standard deviations are very different, so the equal-variance condition is the main issue.


7

(a) (i) Smaller

Larger sample size reduces the standard error, which makes the test statistic larger in magnitude and the p-value smaller.

(b) (i) Wider

Higher confidence requires a larger critical value, and therefore a larger margin of error.

(c) (i) Smaller

Smaller variability in the paired differences gives a smaller standard error, a test statistic that is larger in magnitude, and therefore a smaller p-value.

(d) (ii) Decreases

Larger sample sizes reduce the standard error in a two-proportion Z-test.


8

(a) (iii) Two-sample t-test

(b) (iv) Two-proportion confidence interval

(c) (i) One-sample t-test

(d) (ii) Paired t confidence interval


9

(a) \(H_0:\ \text{Education level and preferred health information source are independent.}\)

\(H_A:\ \text{Education level and preferred health information source are associated.}\)

(b) Expected count for High School / Doctor:

\(E = \frac{(120)(160)}{360} = 53.33\)

(c) Expected count for Graduate / Social Media:

\(E = \frac{(120)(105)}{360} = 35.00\)

(d) Contribution for High School / Doctor:

\(\frac{(40-53.33)^2}{53.33} \approx 3.333\)

(e) \(df = (3-1)(3-1) = 4\)

(f) Correct answer: (i)

Since p = 0.0003 < 0.05, reject \(H_0\). There is convincing evidence of an association between education level and preferred health information source.
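The expected counts, the single-cell contribution, and the degrees of freedom follow directly from the row and column totals. The sketch below is one way to check them in Python; the helper names `expected` and `contribution` are hypothetical. It keeps the unrounded expected count when computing the contribution, which is how the 3.333 above is obtained.

```python
# Observed counts from the Problem 9 table (rows: education level)
observed = {
    "High School": [40, 30, 50],
    "Bachelor's": [55, 35, 30],
    "Graduate": [65, 30, 25],
}
columns = ["Doctor/Clinic", "Internet", "Social Media"]

row_totals = {row: sum(counts) for row, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values())
              for j in range(len(columns))]
grand_total = sum(row_totals.values())

def expected(row, col):
    """Expected count under independence: row total * column total / grand total."""
    j = columns.index(col)
    return row_totals[row] * col_totals[j] / grand_total

def contribution(row, col):
    """One cell's contribution to chi-squared: (observed - expected)^2 / expected."""
    j = columns.index(col)
    e = expected(row, col)
    return (observed[row][j] - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(columns) - 1)

print(round(expected("High School", "Doctor/Clinic"), 2))
print(round(expected("Graduate", "Social Media"), 2))
print(round(contribution("High School", "Doctor/Clinic"), 3))
print(df)
```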


10

(a) False

(b) True

(c) True

(d) False


11

(a) \(MSG = \frac{SSG}{df_G} = \frac{1240.8}{2} = 620.4\)

(b) \(F = \frac{MSG}{MSE} = \frac{620.4}{96.0} = 6.46\)

(c) \(H_0:\ \mu_1 = \mu_2 = \mu_3\)

\(H_A:\ \text{At least one mean crunchiness rating differs.}\)

(d) Correct answer: (i) Reject \(H_0\)

Since p = 0.0021 < 0.05, the ANOVA result is statistically significant.

(e) Correct answer: (ii) No

A significant ANOVA tells us that at least one mean differs, but not which specific groups differ.
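The mean squares and F-statistic in parts (a) and (b) can be rebuilt from the sums of squares and the study design. This is a small sketch with illustrative variable names; it does not compute the p-value, which is read from the ANOVA table.

```python
# Problem 11: 3 temperatures, 20 batches each
num_groups = 3
total_n = 60

# Sums of squares from the ANOVA table
ss_groups = 1240.8
ss_resid = 5472.0

# Degrees of freedom: k - 1 between groups, N - k within groups
df_groups = num_groups - 1
df_resid = total_n - num_groups

# Mean squares are sums of squares divided by their degrees of freedom
msg = ss_groups / df_groups
mse = ss_resid / df_resid

# F compares between-group variability to within-group variability
f_stat = msg / mse
print(df_groups, df_resid, round(msg, 1), round(mse, 1), round(f_stat, 2))
```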


12

(a) False

A significant ANOVA does not mean all group means are different; it only means at least one differs.

(b) True

(c) True

(d) True


13

(a) There are 6 pairwise comparisons.

(b) \(\alpha^* = \frac{0.05}{6} = 0.0083\)

(c) Reject \(H_0\) for comparisons with p-value < 0.0083:

  • (ii) A vs. C
  • (vi) C vs. D

(d) Correct answer: (ii) False

A significant ANOVA does not guarantee that all pairwise comparisons will be significant after a multiple-comparison adjustment.
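The Bonferroni steps in parts (a) through (c) amount to counting the pairs, dividing \(\alpha\) by that count, and comparing each unadjusted p-value to the result. A minimal Python sketch, with variable names chosen here for illustration:

```python
from math import comb

# Unadjusted p-values from the Problem 13 table
p_values = {
    "A vs. B": 0.234,
    "A vs. C": 0.003,
    "A vs. D": 0.087,
    "B vs. C": 0.018,
    "B vs. D": 0.412,
    "C vs. D": 0.006,
}

alpha = 0.05
num_groups = 4

# Number of pairwise comparisons: k choose 2 = k(k - 1) / 2
num_comparisons = comb(num_groups, 2)

# Bonferroni-adjusted significance level for each individual test
alpha_star = alpha / num_comparisons

# Reject H0 only where the unadjusted p-value falls below alpha_star
rejected = [pair for pair, p in p_values.items() if p < alpha_star]
print(num_comparisons, round(alpha_star, 4), rejected)
```

Note that B vs. C (p = 0.018) would be significant at the unadjusted 0.05 level but is not rejected after the adjustment.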


14

(a) \(\frac{4(4-1)}{2} = 6\)

(b) Correct answer: (ii)

\(H_0:\ \text{All population mean recovery scores are equal.}\)

\(H_A:\ \text{At least one population mean recovery score differs.}\)

(c) Correct answer: (i) Yes

Since the ANOVA p-value is < 0.001, which is below \(\alpha = 0.05\), there is evidence that mean recovery score differs across care programs.

(d) Significantly different pairs according to Tukey’s method (Tukey-adjusted p-value below 0.05):

  • Standard Care vs Nurse Coaching
  • Standard Care vs Intensive Rehab
  • Telehealth Follow-up vs Nurse Coaching
  • Telehealth Follow-up vs Intensive Rehab

The two pairs that are not significantly different are:

  • Standard Care vs Telehealth Follow-up
  • Nurse Coaching vs Intensive Rehab