Hypothesis Testing with Randomization

IMS2 Ch. 11
Math 115

Yurk

Flavor Preferences

  • Research question: Do people on the East Coast have a higher preference for cola than people on the West Coast?
  • soda dataset
  • 2 variables
    • location: East or West
    • drink preference: Orange or Cola
  • 60 individuals (34 from East, 26 from West)

Results (EDA)

location   Cola   Orange   total
East         28        6      34
West         19        7      26
total        47       13      60

Standardized barplot showing proportions of drink preferences

Difference in proportions

  • Success: drink = Cola
  • Statistic of interest: difference in proportions \[\hat{p}_E-\hat{p}_W\]
  • Observed difference: \[\frac{28}{34}-\frac{19}{26}=0.09276\]
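The arithmetic for the observed difference can be checked with a few lines of Python. This is only a sketch using the counts from the two-way table; the variable names are illustrative and not part of the course materials.

```python
# Counts taken from the two-way table (Cola is the "success")
cola_east, n_east = 28, 34
cola_west, n_west = 19, 26

p_hat_east = cola_east / n_east   # sample proportion for East
p_hat_west = cola_west / n_west   # sample proportion for West
observed_diff = p_hat_east - p_hat_west

print(round(p_hat_east, 3), round(p_hat_west, 3), round(observed_diff, 5))
# prints: 0.824 0.731 0.09276
```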

Hypothesis Test

  • From the sample it appears that there is a stronger preference for cola on the East Coast
  • Alternatively, there may be no real difference in preference in the population, and a difference this large would not be surprising in samples of this size
  • A hypothesis test states these two possibilities formally as hypotheses then weighs them against each other using the results from the sample as evidence

Hypotheses

  • The null hypothesis, denoted \(H_0\), represents a skeptical perspective or a claim of no difference
  • The alternative hypothesis, denoted \(H_A\), represents an alternative claim of difference.
  • As statisticians, we usually establish hypotheses before viewing the data in order to avoid bias

In words:

    \(H_0:\) Location has no effect on preference for cola over orange soda.
    \(H_A:\) There is a higher preference for cola over orange soda on the East Coast than on the West Coast.

In symbols:

    \(H_0: p_E - p_W = 0\)
    \(H_A: p_E - p_W > 0\)

Null Distribution

  • We test the null hypothesis by comparing the observed value of the statistic to a null distribution
  • If the null hypothesis is true and we select different samples of the same size from the population, we would expect the value of the statistic to vary between samples
  • The null distribution is the distribution that describes those values
  • It is an example of a sampling distribution (distribution of a statistic)

Simulating the Null Hypothesis

  • To test our cola preference hypothesis, we need to compare our observed difference (0.093) to what we’d expect if there really were no regional difference
  • We can simulate “no difference” by mixing up the cola preferences and randomly reassigning them to East Coast and West Coast
  • Each of these shuffles is called a random permutation

Shuffling Cards to Simulate the Null Hypothesis

Original Samples

East Coast (34 people)

Cola: 28
Orange: 6
Proportion Cola: 0.824

West Coast (26 people)

Cola: 19
Orange: 7
Proportion Cola: 0.731
Difference in Proportions: 0.824 - 0.731 = 0.093 ← Original Data

Distribution of Differences

Red circle shows original difference (0.093). Blue circles show permutation differences.

Dot plot of 100 differences in randomized proportions (null distribution), showing observed difference as dashed vertical line.

Random Permutations with a Computer

  • We can use computers to create thousands of random permutations
  • With a computer, we don’t actually shuffle cards. Instead we randomly permute the values of the response variable
  • Each permutation gives us one possible difference under the assumption that region doesn’t matter
  • The distribution of these differences is the null distribution
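One way the computer shuffle might look in Python is sketched below. It rebuilds the 60 rows from the two-way table, permutes the drink column while leaving the location column fixed, and records the difference in proportions each time. The function names, the number of repetitions, and the seed are all assumptions made for illustration.

```python
import random

def diff_in_proportions(locations, drinks):
    """Proportion preferring Cola in the East minus the proportion in the West."""
    east = [d for loc, d in zip(locations, drinks) if loc == "East"]
    west = [d for loc, d in zip(locations, drinks) if loc == "West"]
    return east.count("Cola") / len(east) - west.count("Cola") / len(west)

# Rebuild the soda data row by row from the two-way table
locations = ["East"] * 34 + ["West"] * 26
drinks = ["Cola"] * 28 + ["Orange"] * 6 + ["Cola"] * 19 + ["Orange"] * 7

def null_distribution(locations, drinks, reps, seed=None):
    """Permute the response (drinks) reps times; return the simulated differences."""
    rng = random.Random(seed)
    shuffled = list(drinks)          # copy so the original column is untouched
    diffs = []
    for _ in range(reps):
        rng.shuffle(shuffled)        # one random permutation of the response
        diffs.append(diff_in_proportions(locations, shuffled))
    return diffs

# 1000 repetitions and seed 115 are arbitrary choices for this sketch
null_diffs = null_distribution(locations, drinks, reps=1000, seed=115)
```

Because every shuffle keeps 47 Cola and 13 Orange responses in total, only the assignment to regions changes, which is what "region doesn't matter" means here.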

Here is the original soda data with 5 random permutations of the response variable.

p-Value

  • To test the null hypothesis (\(p_E-p_W = 0\)) we consider how probable it would be to get a difference in proportions that is at least as large as the observed difference if \(H_0\) is true
  • This probability is called a p-value
  • We use the null distribution to calculate the p-value

Of the 100 randomized differences in proportions, 28 are greater than or equal to the observed value (0.09276). So we estimate the p-value to be 28/100 = 0.28.
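The same count-and-divide estimate can be sketched in Python with many more shuffles. The seed and the 10,000 repetitions are assumptions, and because the shuffles are random the estimate will differ somewhat from the 0.28 obtained with the 100 permutations on the slide.

```python
import random

rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility

# 1 = Cola, 0 = Orange: 47 colas and 13 oranges among the 60 people
prefs = [1] * 47 + [0] * 13
observed = 28 / 34 - 19 / 26

reps = 10_000
count_at_least_as_large = 0
for _ in range(reps):
    rng.shuffle(prefs)
    east, west = prefs[:34], prefs[34:]      # first 34 play the role of East
    diff = sum(east) / 34 - sum(west) / 26
    # tiny tolerance so floating-point rounding cannot exclude exact ties
    if diff >= observed - 1e-12:
        count_at_least_as_large += 1

p_value = count_at_least_as_large / reps
print(f"estimated p-value: {p_value:.3f}")
```

Only differences at least as large as the observed one are counted because the alternative hypothesis is one-sided (\(p_E - p_W > 0\)).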

Dot plot of 100 differences in randomized proportions (null distribution), showing observed difference as dashed vertical line.

Significance Level

  • Before we conduct a study, we define a significance level, denoted \(\alpha\)
  • We decide that in order to reject the null hypothesis as false, the p-value must be less than \(\alpha\)
  • The significance level is the standard of evidence we will use to judge the null hypothesis
  • We presume the null hypothesis is true, but we are willing to reject it if the evidence against it is strong enough (the p-value is less than \(\alpha\))
  • Typical values for \(\alpha\) are 0.05 and 0.01
  • Sometimes other values are used
  • Unless otherwise noted, we will always use \(\alpha = 0.05\)

Conclusion

  • In the soda example, the observed difference in proportions (\(\hat{p}_E-\hat{p}_W = 0.09276\)) does not allow us to reject the null hypothesis (p = 0.28) at the \(\alpha = 0.05\) significance level.
  • The difference in the proportions is not statistically significant
  • This means that it is plausible that there is no difference in the proportions of people who prefer cola to orange soda between the East and West Coast.
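The decision rule behind this conclusion is a single comparison. The helper below is a hypothetical illustration (the name `decide` is not from the course), applied to the p-value of 0.28 and the default \(\alpha = 0.05\).

```python
def decide(p_value, alpha=0.05):
    """Reject H0 only when the p-value is strictly less than alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.28))  # prints: fail to reject H0
```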