Chi-square Test

Test the association between two categorical variables in a 2×2 contingency table, with odds ratio and relative risk.

Overview

The Chi-square (χ²) test of independence assesses whether two categorical variables are statistically associated. For a 2×2 table, this is equivalent to asking whether the proportion of an outcome differs between two groups (exposed vs. unexposed, treatment vs. control).

In addition to the test statistic and p-value, Statulator computes the Odds Ratio (OR) and Relative Risk (RR) with 95% confidence intervals, giving you measures of effect size alongside statistical significance.

The 2×2 Table Layout
             Outcome +   Outcome −
Exposed          a           b
Unexposed        c           d

The diagonal cells (a, d) count exposure-outcome agreement; the off-diagonal cells (b, c) count disagreement. The chi-square test compares the observed counts with the counts expected if exposure and outcome were independent.

Worked Example

Scenario: Smoking and Lung Disease

A case-control study investigated the relationship between smoking and a respiratory condition. The data:

              Disease   No Disease
Smoker           45         55
Non-smoker       25         75

Using Statulator:

  1. Open the Chi-square Test calculator.

  2. Enter the four cell values: a=45, b=55, c=25, d=75.

  3. The calculator reports χ², p-value, OR, RR, and their CIs.

Hand calculation:
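\[ \text{OR} = \frac{45 \times 75}{55 \times 25} = \frac{3375}{1375} = 2.45 \] \[ \text{RR} = \frac{45/100}{25/100} = \frac{0.45}{0.25} = 1.80 \] \[ \chi^2 = \frac{N(ad - bc)^2}{R_1 R_2 C_1 C_2} = \frac{200 \times (3375 - 1375)^2}{100 \times 100 \times 70 \times 130} = 8.79 \quad \text{(Pearson, default)} \]

Here \( N = a+b+c+d \) is the total count, \( R_1 = a+b \) and \( R_2 = c+d \) are the row totals, and \( C_1 = a+c \) and \( C_2 = b+d \) are the column totals.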

If you tick the Yates option in the calculator, the same formula is computed with a continuity correction: replace the numerator with \( N(|ad-bc| - N/2)^2 \).
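The hand calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library, not Statulator's own code; the function names are illustrative. The p-value relies on the fact that a chi-square variate with 1 degree of freedom is a squared standard normal, so its upper tail equals erfc(√(χ²/2)).

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for a 2x2 table via the shortcut formula."""
    n = a + b + c + d
    r1, r2 = a + b, c + d            # row totals
    c1, c2 = a + c, b + d            # column totals
    diff = abs(a * d - b * c)
    if yates:
        diff -= n / 2                # continuity correction, as in the text
    return n * diff ** 2 / (r1 * r2 * c1 * c2)

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

a, b, c, d = 45, 55, 25, 75
odds_ratio = (a * d) / (b * c)
relative_risk = (a / (a + b)) / (c / (c + d))
pearson = chi_square_2x2(a, b, c, d)
corrected = chi_square_2x2(a, b, c, d, yates=True)

print(f"OR = {odds_ratio:.2f}, RR = {relative_risk:.2f}")
print(f"Pearson chi2 = {pearson:.2f}, p = {p_value_df1(pearson):.4f}")
print(f"Yates chi2   = {corrected:.2f}, p = {p_value_df1(corrected):.4f}")
```

The corrected statistic is smaller than the uncorrected one, which is what makes the Yates version the more conservative of the two.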

Interpretation Guide

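An OR of 2.45 means smokers have 2.45 times the odds of disease compared to non-smokers. An RR of 1.80 means the risk of disease is 80% higher in smokers; note that the RR is only interpretable when subjects are sampled by exposure, so in a true case-control design report the OR. A p-value < 0.05 indicates a statistically significant association; for these data p ≈ 0.003.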

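OR vs. RR: Use RR in cohort studies and RCTs, where subjects are followed from exposure to outcome and the risk in each group can be estimated directly. Use OR in case-control studies, where subjects are sampled by outcome status and group risks cannot be estimated. When the outcome is rare (< 10%), the OR approximates the RR.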
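The rare-outcome approximation is easy to see numerically. The sketch below compares the worked example, where the outcome is common, with a hypothetical table (counts invented purely for illustration) in which the outcome occurs in well under 10% of each group.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table with cells a, b (exposed) and c, d (unexposed)."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Relative risk: risk in the exposed row divided by risk in the unexposed row."""
    return (a / (a + b)) / (c / (c + d))

# Worked example: outcome is common (45% and 25% of each group).
common = (45, 55, 25, 75)
# Hypothetical rare-outcome table: risks of 0.5% and 0.2%.
rare = (5, 995, 2, 998)

for label, table in (("common", common), ("rare", rare)):
    print(f"{label}: OR = {odds_ratio(*table):.3f}, RR = {relative_risk(*table):.3f}")
```

With the common outcome the two measures differ noticeably; with the rare outcome they nearly coincide.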

Yates correction: An optional continuity correction recommended when expected cell counts are small. It makes the test slightly more conservative. Statulator defaults to Pearson's chi-squared (uncorrected); tick Pearson's chi-squared with Yates' continuity correction in the calculator to apply it.

Formula

Chi-squared Statistic
\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \quad \text{(Pearson, default)} \] \[ \chi^2_{\text{Yates}} = \sum \frac{(|O_i - E_i| - 0.5)^2}{E_i} \quad \text{(with Yates' continuity correction)} \]

Expected frequency: \( E_i = \dfrac{\text{Row total} \times \text{Column total}}{N} \). Degrees of freedom = 1.
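As an illustration of the cell-by-cell definition, the following sketch computes the expected counts for the worked example's table and then sums the Pearson and Yates terms. It is an assumption-labelled example in plain Python, with illustrative function names, and it should agree with the shortcut formula used in the hand calculation.

```python
def expected_counts(observed):
    """Expected counts under independence; `observed` is a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed, yates=False):
    """Sum of (|O - E| - shift)^2 / E over all cells; shift is 0.5 for Yates."""
    expected = expected_counts(observed)
    shift = 0.5 if yates else 0.0
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (abs(o - e) - shift) ** 2 / e
    return total

table = [[45, 55], [25, 75]]
expected = expected_counts(table)
print("Expected:", expected)
print(f"Pearson chi2 = {chi_square(table):.2f}")
print(f"Yates chi2   = {chi_square(table, yates=True):.2f}")
```

For a 2×2 table every cell has the same absolute deviation |O − E| (here 10), which is why the sum collapses to the single-fraction shortcut.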

Odds Ratio & 95% CI
\[ \text{OR} = \frac{a \cdot d}{b \cdot c} \quad;\quad \text{SE}(\ln\text{OR}) = \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}} \] \[ 95\%\;\text{CI} = \exp\!\left(\ln\text{OR} \pm 1.96 \times \text{SE}\right) \]
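
The interval is built on the log scale and then exponentiated, which is why it is asymmetric around the OR. A minimal sketch of the formula above, applied to the worked example (a=45, b=55, c=25, d=75); the function name is illustrative, and the code assumes all four cells are non-zero, since a zero cell makes the OR or its standard error undefined.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a confidence interval computed on the log scale."""
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

or_est, or_lower, or_upper = odds_ratio_ci(45, 55, 25, 75)
print(f"OR = {or_est:.2f} (95% CI: {or_lower:.2f}, {or_upper:.2f})")
```
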
Relative Risk & 95% CI
\[ \text{RR} = \frac{a/(a+b)}{c/(c+d)} \quad;\quad \text{SE}(\ln\text{RR}) = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}} \] \[ 95\%\;\text{CI} = \exp\!\left(\ln\text{RR} \pm 1.96 \times \text{SE}\right) \]
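
The same log-scale construction applies to the RR, with only the standard error changing. The sketch below mirrors the formula above for the worked example; the function name is illustrative, and it assumes a and c are non-zero.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk with a confidence interval computed on the log scale."""
    estimate = (a / (a + b)) / (c / (c + d))
    # SE of ln(RR): 1/a - 1/(a+b) + 1/c - 1/(c+d) under the square root
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

rr_est, rr_lower, rr_upper = relative_risk_ci(45, 55, 25, 75)
print(f"RR = {rr_est:.2f} (95% CI: {rr_lower:.2f}, {rr_upper:.2f})")
```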

Assumptions & Requirements

Textbook Examples

Medicine

A study investigates whether smoking status (smoker/non-smoker) is associated with the development of chronic bronchitis (yes/no) in 400 adults.

Data: Smoker+Bronchitis = 45, Smoker+No = 105, Non-smoker+Bronchitis = 20, Non-smoker+No = 230.
Result: χ² = 33.34, df = 1, p < 0.001; OR = 4.93 (95% CI: 2.77, 8.76).
Interpretation: Smokers have nearly 5 times the odds of developing bronchitis compared to non-smokers.

Education

A school tests whether completing a homework programme (yes/no) is associated with passing the final exam (pass/fail) in 300 students.

Data: Completed+Pass = 120, Completed+Fail = 30, Not completed+Pass = 80, Not completed+Fail = 70.
Result: χ² = 24.00, df = 1, p < 0.001; RR = 1.50 (95% CI: 1.27, 1.78).
Interpretation: Students who completed the programme were 50% more likely to pass the exam.

Social Science

A criminology study examines whether neighbourhood type (urban/rural) is associated with property crime victimisation (yes/no) among 600 households.

Data: Urban+Victim = 90, Urban+No = 210, Rural+Victim = 40, Rural+No = 260.
Result: χ² = 24.55, df = 1, p < 0.001; OR = 2.79 (95% CI: 1.84, 4.22).
Interpretation: Urban households had about 2.8 times the odds of experiencing property crime.

Agriculture

A veterinary survey tests whether vaccination status is associated with disease occurrence in 500 cattle.

Data: Vaccinated+Disease = 15, Vaccinated+Healthy = 235, Unvaccinated+Disease = 40, Unvaccinated+Healthy = 210.
Result: χ² = 12.77, df = 1, p < 0.001; OR = 0.34 (95% CI: 0.18, 0.62).
Interpretation: Vaccinated cattle had about one-third the odds of disease, consistent with a protective effect of the vaccine.
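
The four textbook results can be reproduced from their cell counts with the formulas in the Formula section. This is a sketch for checking the arithmetic, not the calculator's implementation; the helper names are illustrative, and the chi-square values are the uncorrected Pearson statistic.

```python
import math

def pearson_chi2(a, b, c, d):
    """Uncorrected chi-square for a 2x2 table (shortcut formula)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def summarize(a, b, c, d, measure):
    """Return (chi2, estimate, lower, upper) for measure 'OR' or 'RR'."""
    if measure == "OR":
        est = (a * d) / (b * c)
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    else:  # "RR"
        est = (a / (a + b)) / (c / (c + d))
        se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(est) - 1.96 * se)
    upper = math.exp(math.log(est) + 1.96 * se)
    return pearson_chi2(a, b, c, d), est, lower, upper

# Cell counts (a, b, c, d) and the effect measure reported for each example.
examples = {
    "Medicine": ((45, 105, 20, 230), "OR"),
    "Education": ((120, 30, 80, 70), "RR"),
    "Social Science": ((90, 210, 40, 260), "OR"),
    "Agriculture": ((15, 235, 40, 210), "OR"),
}

results = {name: summarize(*cells, measure)
           for name, (cells, measure) in examples.items()}

for name, (chi2, est, lower, upper) in results.items():
    measure = examples[name][1]
    print(f"{name}: chi2 = {chi2:.2f}, {measure} = {est:.2f} "
          f"(95% CI: {lower:.2f}, {upper:.2f})")
```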

References

  1. Agresti, A. (2013). Categorical Data Analysis (3rd ed.). Wiley.
  2. Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical Methods for Rates and Proportions (3rd ed.). Wiley.
  3. Bland, M. (2015). An Introduction to Medical Statistics (4th ed.). Oxford University Press.
  4. Yates, F. (1934). Contingency tables involving small numbers and the χ² test. Supplement to the Journal of the Royal Statistical Society, 1(2), 217–235.