Inference for Contingency Tables

Chi-square Test

Math 115

Extending Our Framework

Previously, we compared two groups with a binary outcome (2×2 table).

  • Used difference in proportions: \(\hat{p}_1 - \hat{p}_2\)
  • Used Z-statistic for hypothesis testing

But what if we have:

  • More than 2 groups? (e.g., Dem, Ind, Rep)
  • Non-binary outcomes? (e.g., Spending: Too Little, About Right, Too Much)

We need a new approach!

Two Ways to Generalize

Generalization 1: More groups

  • Explanatory variable has 3+ categories
  • Example: Democrat, Independent, Republican

Generalization 2: Non-binary response

  • Response variable has 3+ categories
  • Example (Government Spending): “Too Little”, “About Right”, “Too Much”

With either generalization, we can’t use a single difference in proportions!

The Hypothesis Framework

Same concept as before, different statistic:

  • \(H_0\): No association between the variables (independence)
  • \(H_A\): There is an association (variables are not independent)

New statistic: Chi-square, \(\chi^2\), measures deviation from independence.

Government Spending

Research question: Is political party associated with opinions on military spending?

  • Explanatory: Party (Dem, Ind, Rep) — 3 categories
  • Response: Opinion on military spending (“Too Little”, “About Right”, “Too Much”) — 3 categories
  • Data: General Social Survey, 2016 (n = 149)

This creates a 3×3 contingency table.

Military Spending

  • Hypotheses stated in terms of an association

    • \(H_0\): There is no association between opinions on government spending on national defense and political affiliations.
    • \(H_A\): There is an association between opinions on government spending on national defense and political affiliations.
  • Hypotheses stated in terms of differences

    • \(H_0\): There is no difference in opinions on government spending on national defense between people with different political affiliations.
    • \(H_A\): There is some difference in opinions on government spending on national defense between people with different political affiliations.

The Contingency Table

| Party | Too Little | About Right | Too Much | Total |
| --- | --- | --- | --- | --- |
| Dem | 17 | 14 | 12 | 43 |
| Ind | 20 | 28 | 24 | 72 |
| Rep | 24 | 8 | 2 | 34 |
| Total | 61 | 50 | 38 | 149 |

Beyond simple \(2\times 2\) tables, we can’t compute just one difference in proportions.

We need to compare all observed counts to what we’d expect under \(H_0\).

What Would We Expect Under \(H_0\)?

If \(H_0\) is true (no association), each party should have the same distribution of opinions.

Expected count for any cell:

\[\text{Expected} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}\]

This distributes the overall opinion proportions equally across all parties.

Expected Counts

Example: Democrats who say “Too Little”

\[\text{Expected} = \frac{43 \times 61}{149} = 17.6\]

| Party | Too Little | About Right | Too Much |
| --- | --- | --- | --- |
| Dem | 17 (17.6) | 14 (14.4) | 12 (11.0) |
| Ind | 20 (29.5) | 28 (24.2) | 24 (18.4) |
| Rep | 24 (13.9) | 8 (11.4) | 2 (8.7) |

Observed (Expected) — How different are they?
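The expected counts in the table can be reproduced with a few lines of R. This is a minimal sketch, not necessarily how the slides were produced; the object names `observed` and `expected` are illustrative choices.

```r
# Observed counts from the contingency table (rows = party, columns = opinion)
observed <- matrix(
  c(17, 14, 12,
    20, 28, 24,
    24,  8,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    party   = c("Dem", "Ind", "Rep"),
    opinion = c("Too Little", "About Right", "Too Much")
  )
)

# Expected count = (row total x column total) / grand total, for every cell
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
round(expected, 1)

# Large-sample condition: is every expected count at least 5?
all(expected >= 5)
```

Here `outer()` forms every row total × column total product at once, so the single formula above is applied to all nine cells.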

The Chi-Square Statistic

Measures total deviation of observed from expected:

\[\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}\]

Why this formula?

  • Squaring: Makes all contributions positive
  • Dividing by Expected: Standardizes (a difference of 5 matters more when Expected is 10 than when Expected is 100)
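
To make the standardization concrete, compare the contribution of the same difference of 5 in the two cases mentioned above:

\[\frac{5^2}{10} = 2.5 \qquad \text{versus} \qquad \frac{5^2}{100} = 0.25\]

The same raw difference contributes ten times as much when the expected count is 10 as when it is 100.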

Computing \(\chi^2\)

For each cell, compute \(\frac{(O - E)^2}{E}\):

| Party | Too Little | About Right | Too Much |
| --- | --- | --- | --- |
| Dem | \(\frac{(17 - 17.6)^2}{17.6} = 0.02\) | \(\frac{(14 - 14.4)^2}{14.4} = 0.01\) | \(\frac{(12 - 11)^2}{11} = 0.10\) |
| Ind | \(\frac{(20 - 29.5)^2}{29.5} = 3.05\) | \(\frac{(28 - 24.2)^2}{24.2} = 0.61\) | \(\frac{(24 - 18.4)^2}{18.4} = 1.73\) |
| Rep | \(\frac{(24 - 13.9)^2}{13.9} = 7.30\) | \(\frac{(8 - 11.4)^2}{11.4} = 1.02\) | \(\frac{(2 - 8.7)^2}{8.7} = 5.13\) |

\[\chi^2 = 0.02 + 0.01 + 0.10 + 3.05 + 0.61 + 1.73 + 7.30 + 1.02 + 5.13 = 18.97\]
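The cell-by-cell calculation can be checked in R. The sketch below uses unrounded expected counts, so individual contributions may differ slightly from hand calculations based on the rounded values shown in the table; the variable names are illustrative.

```r
# Observed counts (rows: Dem, Ind, Rep; columns: Too Little, About Right, Too Much)
observed <- matrix(c(17, 14, 12,
                     20, 28, 24,
                     24,  8,  2), nrow = 3, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Contribution of each cell: (O - E)^2 / E
contributions <- (observed - expected)^2 / expected
round(contributions, 2)

# The chi-square statistic is the sum of all nine contributions
chi_sq <- sum(contributions)
round(chi_sq, 2)
```

The rounded total should agree with the 18.97 computed above. The largest contributions come from the Republican row, which is where observed and expected counts differ most.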

What Does \(\chi^2\) Tell Us?

  • \(\chi^2 = 0\) means observed = expected exactly (perfect independence)
  • Larger \(\chi^2\) means greater deviation from independence
  • Larger \(\chi^2\) → stronger evidence against \(H_0\)

But how large is “large enough” to reject \(H_0\)?

We need a reference distribution → chi-square distribution

Randomization Test for Independence

  • We can randomly permute one of the variables (here, the party labels) to simulate the null hypothesis being true: shuffling breaks any link between party and opinion
  • For each permuted sample, we calculate value of the \(\chi^2\) statistic
  • Let’s construct a null distribution for the military spending question

Here is the original GSS data with 5 random permutations.

# A tibble: 149 × 8
      id natarms    party randPerm1 randPerm2 randPerm3 randPerm4 randPerm5
   <int> <fct>      <chr> <fct>     <fct>     <fct>     <fct>     <fct>    
 1     1 TOO LITTLE Ind   Rep       Ind       Ind       Ind       Dem      
 2     2 TOO MUCH   Ind   Dem       Dem       Ind       Rep       Ind      
 3     3 TOO MUCH   Dem   Dem       Rep       Rep       Rep       Dem      
 4     4 TOO MUCH   Ind   Rep       Ind       Dem       Rep       Rep      
 5     5 TOO MUCH   Ind   Ind       Dem       Rep       Ind       Ind      
 6     6 TOO MUCH   Ind   Dem       Rep       Rep       Dem       Ind      
 7     7 TOO MUCH   Ind   Dem       Dem       Ind       Ind       Dem      
 8     8 TOO MUCH   Dem   Ind       Ind       Ind       Dem       Rep      
 9     9 TOO LITTLE Dem   Ind       Dem       Ind       Rep       Ind      
10    10 TOO LITTLE Ind   Dem       Dem       Ind       Rep       Ind      
# ℹ 139 more rows

Here is the dotplot of the corresponding \(\chi^2\) statistics.

Here is the resulting histogram of 1,000 simulations.

Histogram of \(\chi^2\) statistics for 1,000 random permutations. Observed value (\(18.97\)) indicated by dashed vertical line.

  • Note that the histogram is neither symmetric nor bell-shaped: it is right-skewed, and the statistic takes only non-negative values
  • The p-value is always in the right tail (the proportion of simulated \(\chi^2\) statistics as large as or larger than the observed one)

  • From the histogram, none of the 1,000 simulated values of \(\chi^2\) was as extreme as the observed value

  • So the p-value is approximately 0 (fewer than 1 in 1,000 simulations)

  • We reject the null hypothesis

  • In the context of the problem, we can conclude that there is strong evidence of an association between opinions on military spending and political party (the two variables are not independent)
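
The randomization procedure described above can be sketched in base R. This is an illustration under stated assumptions rather than the code behind the slides: the individual-level data are rebuilt from the table counts, the helper `chi_sq_stat()` is a hypothetical name, and the seed is arbitrary.

```r
set.seed(115)  # arbitrary seed, used only so the sketch is reproducible

# Rebuild one row per respondent from the table counts
counts <- matrix(c(17, 14, 12, 20, 28, 24, 24, 8, 2), nrow = 3, byrow = TRUE,
                 dimnames = list(c("Dem", "Ind", "Rep"),
                                 c("Too Little", "About Right", "Too Much")))
cells   <- as.data.frame(as.table(counts))  # columns Var1, Var2, Freq
party   <- rep(cells$Var1, times = cells$Freq)
opinion <- rep(cells$Var2, times = cells$Freq)

# Chi-square statistic for a pair of categorical vectors (hypothetical helper)
chi_sq_stat <- function(x, y) {
  obs  <- table(x, y)
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  sum((obs - expd)^2 / expd)
}

observed_stat <- chi_sq_stat(party, opinion)

# Shuffle the party labels 1,000 times, recomputing the statistic each time
null_stats <- replicate(1000, chi_sq_stat(sample(party), opinion))

# Estimated p-value: share of shuffled statistics at least as large as observed
p_value_sim <- mean(null_stats >= observed_stat)
p_value_sim
```

Shuffling only one variable keeps both sets of marginal totals fixed, so every permuted table has the same row and column totals as the original.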

Test for Independence Using a Mathematical Model

Chi-squared test for assessing independence between categorical variables

When the null hypothesis is true and the following conditions are met, \(\chi^2\) has a chi-squared distribution with \(df=(r-1)\times(c-1)\) degrees of freedom, where \(r\) is the number of rows and \(c\) is the number of columns in the two-way table (not counting totals):

  1. Independent observations
  2. Large samples: an expected count of at least 5 in each cell

Degrees of Freedom

For contingency tables:

\[df = (\text{rows} - 1) \times (\text{columns} - 1)\]

For our 3×3 table:

\[df = (3-1) \times (3-1) = 2 \times 2 = 4\]

Intuition: After fixing row and column totals, only \(df\) cells can vary freely.
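
To see this in the military spending table, suppose only the four cells in the upper-left 2×2 block (17, 14, 20, 28) are known. The row and column totals then force the remaining five cells:

\[\begin{aligned}
\text{Dem, Too Much} &= 43 - 17 - 14 = 12 \\
\text{Ind, Too Much} &= 72 - 20 - 28 = 24 \\
\text{Rep, Too Little} &= 61 - 17 - 20 = 24 \\
\text{Rep, About Right} &= 50 - 14 - 28 = 8 \\
\text{Rep, Too Much} &= 34 - 24 - 8 = 2
\end{aligned}\]

Four cells are free to vary and the rest follow, which matches \(df = 4\).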

  • The contingency table satisfies the large samples condition: every expected count is at least 5 (the smallest is 8.7)

Chi-squared distribution with \(df=4\).

The Chi-Square Distribution

  • Always right-skewed, starts at 0
  • As df increases, distribution shifts right
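
One way to see the rightward shift is to compare upper-tail cutoffs across several degrees of freedom. The sketch below uses R's `qchisq()`; the particular df values are an arbitrary illustrative choice.

```r
# Degrees of freedom to compare (arbitrary illustrative values)
df_values <- c(1, 2, 4, 8, 16)

# 95th percentile of each distribution: the cutoff for the upper 5% tail
upper_5pct <- qchisq(0.95, df = df_values)

# The mean of a chi-square distribution equals its df, so the centre moves right too
data.frame(df = df_values, mean = df_values, cutoff_95 = round(upper_5pct, 2))
```

Both the mean and the 5% cutoff grow with df, so a statistic that would be extreme for a small table can be unremarkable for a large one.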

The Null Distribution

When conditions are met, we can use the chi-square distribution as our reference:

  • Use the chi-square distribution with \(df = (r-1) \times (c-1)\)
  • This distribution describes what \(\chi^2\) values we’d expect if \(H_0\) is true

Computing the p-value:

  • P-value = area to the right of observed \(\chi^2\) under the chi-square curve
  • Larger \(\chi^2\) values are more extreme → stronger evidence against \(H_0\)

Conditions for Chi-Square Test

Independence:

  • Random sample
  • Observations independent of each other

Expected Counts (under \(H_0\)):

  • Expected count of at least 5 in each cell

Check expected counts, not observed counts!

Finding the P-value

Chi-squared distribution with \(df=4\)

P-value < 0.001
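
The right-tail area can be computed with `pchisq()`. This is a short sketch using the rounded statistic from the slides; the variable names are illustrative.

```r
chi_sq <- 18.97            # observed statistic for the military spending table
df     <- (3 - 1) * (3 - 1)  # (rows - 1) x (columns - 1)

# Area to the right of chi_sq under the chi-square curve with df = 4
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
p_value
```

Setting `lower.tail = FALSE` requests the upper-tail probability directly, which is the p-value for this test.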

Conclusion: Military Spending

Results:

  • \(\chi^2 = 18.97\)
  • df = 4
  • P-value < 0.001

Interpretation:

  • As with the randomization-based test, the p-value is very small (< 0.001) for the military spending question
  • Conclusion
    • We reject the null hypothesis
    • There is strong evidence that opinion on military spending and political affiliation are associated
    • We can generalize these results to the larger population, since the sample was representative
    • We cannot draw a cause-and-effect conclusion, since this was an observational study

Note: The chi-square test tells us there IS an association, but not the direction or which groups differ.

Second Example: Space Exploration

Same parties, same opinion categories, but asking about space exploration spending.

| Party | Too Little | About Right | Too Much | Total |
| --- | --- | --- | --- | --- |
| Dem | 8 | 22 | 13 | 43 |
| Ind | 13 | 37 | 22 | 72 |
| Rep | 9 | 17 | 8 | 34 |

\(\chi^2 = 1.33\), df = 4, P-value = 0.857
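
These values can be reproduced with R's built-in `chisq.test()`. This is a sketch that may differ from how the slides were produced; the object names `space` and `result` are illustrative.

```r
# Observed counts for the space exploration question
space <- matrix(c( 8, 22, 13,
                  13, 37, 22,
                   9, 17,  8), nrow = 3, byrow = TRUE,
                dimnames = list(party   = c("Dem", "Ind", "Rep"),
                                opinion = c("Too Little", "About Right", "Too Much")))

# Chi-square test of independence
result <- chisq.test(space)

result$statistic           # chi-square statistic
result$parameter           # degrees of freedom
result$p.value             # right-tail p-value
round(result$expected, 1)  # expected counts, to check the "at least 5" condition
```

Here the observed counts sit close to the expected counts in every cell, which is why the statistic is small and the p-value large.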

Comparing the Two Results

| Issue | \(\chi^2\) | P-value | Conclusion |
| --- | --- | --- | --- |
| Military | 18.97 | < 0.001 | Strong evidence of association |
| Space | 1.33 | 0.857 | No evidence of association |

Same groups, different topics, very different conclusions!

  • Strong evidence that political party is associated with military spending views
  • No evidence that political party is associated with space exploration views

Goodness-of-Fit Test (Optional)

  • The Chi-square goodness-of-fit test checks if observed categorical data fit an expected distribution.

  • Formula: \[\chi^2 = \sum \frac{(Obs - Exp)^2}{Exp}\]

  • Used for genetics, health studies, or marketing data.

  • Example: Do observed blood type frequencies in a sample match a known population distribution?

Blood Type

  • Assume that the expected probabilities of various blood types in the general population are:
    A = 0.40, B = 0.11, AB = 0.04, O = 0.45

  • Suppose we have a random sample of 350 people with the following observed blood types:

| Blood Type | Observed Count |
| --- | --- |
| A | 155 |
| B | 40 |
| AB | 15 |
| O | 140 |
| Total | 350 |

  • Research Question: Is there significant evidence that the distribution of blood types in the sample is different from the population’s distribution?
  • We need to calculate the \(\chi^2\)-statistic for this data set.
  • The expected counts are calculated as “Sample Size” \(\times\) “Assumed Probability”

| Blood Type | Observed Count | Expected Count |
| --- | --- | --- |
| A | 155 | 350 × 0.40 = 140 |
| B | 40 | 350 × 0.11 = 38.5 |
| AB | 15 | 350 × 0.04 = 14 |
| O | 140 | 350 × 0.45 = 157.5 |

  • If the validity conditions are met (all expected counts \(\ge\) 5), then under \(H_0\) the test statistic follows a chi-square distribution with \(d-1\) degrees of freedom, where \(d\) is the number of categories of the variable. (In our example \(d = 4\), so \(df = 3\).)

Chi-square Calculation and p-value

The value of the \(\chi^2\) statistic is:\[\chi^2=\frac{(155-140)^2}{140}+\frac{(40-38.5)^2}{38.5}+\frac{(15-14)^2}{14}+\frac{(140-157.5)^2}{157.5}=3.6815\]

# Observed counts
observed <- c(A = 155, B = 40, AB = 15, O = 140)

# Assumed population proportions under H0
expected_prop <- c(A = 0.40, B = 0.11, AB = 0.04, O = 0.45)

# Expected counts = sample size x assumed proportion
total <- sum(observed)
expected <- total * expected_prop

# Chi-square statistic: sum of (O - E)^2 / E over the categories
chi_sq <- sum((observed - expected)^2 / expected)
chi_sq
#> [1] 3.681457

# Degrees of freedom = (categories - 1)
df <- length(observed) - 1
df
#> [1] 3

# p-value: area to the right of chi_sq under the chi-square curve with df = 3
p_val <- pchisq(chi_sq, df, lower.tail = FALSE)
round(p_val, 3)
#> [1] 0.298

Chi-Square distribution and p-value

| Chi_Square | DF | P_Value |
| --- | --- | --- |
| 3.6815 | 3 | 0.298 |
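
As a cross-check, R's built-in `chisq.test()` performs the same goodness-of-fit test when given a vector of counts and the hypothesized proportions through its `p` argument. This is a minimal sketch; the object name `gof` is illustrative.

```r
# Observed counts and assumed population proportions
observed      <- c(A = 155, B = 40, AB = 15, O = 140)
expected_prop <- c(A = 0.40, B = 0.11, AB = 0.04, O = 0.45)

# Goodness-of-fit test: a count vector plus hypothesized proportions in `p`
gof <- chisq.test(observed, p = expected_prop)

gof$statistic  # chi-square statistic
gof$parameter  # degrees of freedom (categories - 1)
gof$p.value    # right-tail p-value
gof$expected   # expected counts, to check the "at least 5" condition
```

The statistic, degrees of freedom, and p-value should match the manual calculation above.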

Interpretation

  • If p-value < 0.05 → the observed distribution differs significantly from the expected one.

  • If p-value ≥ 0.05 → we don’t have significant evidence that the observed distribution differs from the expected one.

  • Here, p-value = 0.298 > 0.05, so we don’t have significant evidence that the blood type distribution differs from the assumed proportions.

Summary

Chi-square test extends two-proportion test to larger tables:

| Component | Formula/Value |
| --- | --- |
| Expected count | \(\frac{\text{row total} \times \text{column total}}{\text{grand total}}\) |
| Chi-square statistic | \(\chi^2 = \sum \frac{(O - E)^2}{E}\) |
| Degrees of freedom | \(df = (r-1) \times (c-1)\) |
| P-value | Right tail of chi-square distribution |

Conditions: Independence + Expected counts ≥ 5 in each cell

Connection to Topic 10

| | Two Proportions | Chi-Square Test |
| --- | --- | --- |
| Table size | 2×2 only | Any r×c |
| Test statistic | Z | \(\chi^2\) |
| Null distribution | Normal | Chi-square |
| What we test | \(p_1 - p_2 = 0\) | Independence |
| Direction | Can be one-sided | Always two-sided |

Chi-square generalizes the two-proportion test to larger tables!

(For 2×2 tables, both methods give equivalent results)
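
The equivalence for 2×2 tables can be checked numerically: the square of the two-proportion Z-statistic equals \(\chi^2\). The sketch below assumes the pooled-standard-error form of the Z-statistic and no continuity correction, and it uses made-up counts that are not from the GSS data.

```r
# Hypothetical 2x2 table (made-up counts, for illustration only)
tab <- matrix(c(30, 20,
                18, 32), nrow = 2, byrow = TRUE,
              dimnames = list(group   = c("Group 1", "Group 2"),
                              outcome = c("Yes", "No")))

# Two-proportion Z-statistic with a pooled standard error
n      <- rowSums(tab)
p_hat  <- tab[, "Yes"] / n
p_pool <- sum(tab[, "Yes"]) / sum(n)
z <- (p_hat[1] - p_hat[2]) / sqrt(p_pool * (1 - p_pool) * (1 / n[1] + 1 / n[2]))

# Chi-square statistic without the continuity correction
chi_sq <- chisq.test(tab, correct = FALSE)$statistic

# The two values should agree
c(z_squared = unname(z)^2, chi_square = unname(chi_sq))
```

Note that `chisq.test()` applies a continuity correction to 2×2 tables by default, so `correct = FALSE` is needed for the match to be exact.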
