Test whether observed frequencies across categories match a set of expected (theoretical) frequencies.
The goodness-of-fit test compares observed counts in k categories to the counts expected under a specified hypothesis. Common applications include testing whether a die is fair, whether disease cases are equally distributed across seasons, or whether a sample follows a theoretical distribution.
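For reference, the comparison described above is usually written as the Pearson chi-square statistic. The symbols \( O_i \), \( E_i \), \( N \), and \( p_{0i} \) follow the notation used on this page; the rule df = k − 1 assumes that no parameters of the expected distribution are estimated from the data.

\[
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = N\,p_{0i}, \qquad \text{df} = k - 1
\]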
Statulator also provides post-hoc z-tests for each category, with optional Bonferroni correction, to identify which specific categories deviate from expectation.
An emergency department recorded injuries across four seasons: Spring = 82, Summer = 112, Autumn = 78, Winter = 128. If injuries were equally distributed, we would expect 100 per season (400/4). Test at α = 0.05.
1 Open the Chi-square Goodness-of-fit Test.
2 Enter the observed and expected counts for each category.
3 The result: χ² = 17.36, df = 3, p = 0.0006, which is significant at α = 0.05. The post-hoc tests then show which seasons differ from expectation; Winter (128) and Autumn (78) lie furthest from the expected 100.
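The steps above can also be checked by hand. The following is a minimal sketch in plain Python, not Statulator's own implementation: the helper names `chi2_sf` and `goodness_of_fit` are hypothetical, and the p-value uses a closed-form recurrence that is valid only for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # Q(1, x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # Q(1/2, x/2)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df, chi2_sf(chi2, df)


# Seasonal injury counts from the example: Spring, Summer, Autumn, Winter.
observed = [82, 112, 78, 128]
# Equal distribution: total count divided by the number of categories.
expected = [sum(observed) / len(observed)] * len(observed)

chi2, df, p = goodness_of_fit(observed, expected)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p:.4f}")
```

Running the sketch prints the statistic, degrees of freedom, and p-value computed directly from the counts in the example.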
A significant overall χ² tells you the observed distribution deviates from the expected distribution but does not tell you which categories differ. The post-hoc z-tests (with Bonferroni correction) identify the specific categories that contribute most to the deviation.
Yates correction: An optional continuity correction that subtracts 0.5 from each |O − E| before squaring, which makes the test more conservative. It is most useful when some expected counts are small.
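In the post-hoc z-tests, \( \hat{p}_i = O_i/N \) is the observed proportion and \( p_{0i} = E_i/N \) is the expected proportion for category i. With Bonferroni correction, compare each p-value to \( \alpha/k \), where k is the number of categories.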
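One way to sketch the per-category tests is as one-sample z-tests on proportions. The standard error \( \sqrt{p_{0i}(1 - p_{0i})/N} \) and the two-sided p-value are assumptions here, since the exact formula Statulator uses is not reproduced on this page; the function name `posthoc_z_tests` is hypothetical.

```python
import math


def posthoc_z_tests(labels, observed, expected, alpha=0.05, bonferroni=True):
    """Per-category z-tests of observed vs. expected proportions.

    Returns a list of (label, z, two-sided p-value, flagged) tuples.
    """
    n = sum(observed)
    k = len(observed)
    # Bonferroni: compare each p-value to alpha / k instead of alpha.
    threshold = alpha / k if bonferroni else alpha
    results = []
    for label, o, e in zip(labels, observed, expected):
        p_hat = o / n                          # observed proportion
        p0 = e / n                             # expected proportion
        se = math.sqrt(p0 * (1.0 - p0) / n)    # assumed standard error under H0
        z = (p_hat - p0) / se
        p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        results.append((label, z, p_value, p_value < threshold))
    return results


# Seasonal injury counts from the emergency department example.
labels = ["Spring", "Summer", "Autumn", "Winter"]
observed = [82, 112, 78, 128]
expected = [100, 100, 100, 100]

rows = posthoc_z_tests(labels, observed, expected)
for label, z, p_value, flagged in rows:
    print(f"{label:<7} z = {z:+.2f}  p = {p_value:.4f}  flagged = {flagged}")
```

Passing `bonferroni=False` compares each p-value to α itself, which flags more categories at the cost of a higher family-wise error rate.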
A sociologist tests whether births are equally distributed across the four quarters of the year (n = 400 births).
Data: Q1 = 110, Q2 = 95, Q3 = 88, Q4 = 107. Expected: 100 each.
Result: χ² = 3.18, df = 3, p = 0.36.
Interpretation: No significant departure from a uniform distribution; births appear evenly spread across quarters.
A genetics lab checks whether observed blood-type frequencies in 500 donors match the expected population distribution (O: 44%, A: 42%, B: 10%, AB: 4%).
Data: O = 235, A = 195, B = 50, AB = 20. Expected: 220, 210, 50, 20.
Result: χ² = 2.09, df = 3, p = 0.55.
Interpretation: The observed frequencies are consistent with the expected blood-type distribution (p = 0.55).
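When the hypothesis is stated as percentages rather than counts, the expected counts are the proportions multiplied by the sample size. The sketch below illustrates this with the blood-type data; the helper name `expected_counts` is hypothetical, and the constant 7.815 is the standard tabulated χ² critical value for df = 3 at α = 0.05, used here in place of an exact p-value.

```python
import math


def expected_counts(proportions, n):
    """Convert hypothesised proportions into expected counts for n observations."""
    if not math.isclose(sum(proportions), 1.0, abs_tol=1e-9):
        raise ValueError("proportions must sum to 1")
    return [p * n for p in proportions]


# Blood-type data from the example above.
observed = {"O": 235, "A": 195, "B": 50, "AB": 20}
proportions = {"O": 0.44, "A": 0.42, "B": 0.10, "AB": 0.04}

n = sum(observed.values())
expected = dict(zip(proportions, expected_counts(list(proportions.values()), n)))

# Pearson statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((observed[t] - expected[t]) ** 2 / expected[t] for t in observed)

CRITICAL_05_DF3 = 7.815  # tabulated chi-square critical value, df = 3, alpha = 0.05
print("expected:", {t: round(e, 1) for t, e in expected.items()})
print(f"chi2 = {chi2:.2f}, reject H0 at 0.05: {chi2 > CRITICAL_05_DF3}")
```

Categories whose observed count equals the expected count (B and AB here) contribute nothing to the statistic.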
A quality engineer tests whether defects are equally likely across five production shifts (n = 250 defects).
Data: Shift A = 62, B = 43, C = 55, D = 48, E = 42. Expected: 50 each.
Result: χ² = 5.72, df = 4, p = 0.22.
Interpretation: There is no statistically significant difference in defect counts across shifts.
A registrar tests whether student enrolments follow the university's target distribution across four faculties: Arts 30%, Science 25%, Engineering 25%, Business 20%.
Data (n = 600): Arts = 200, Science = 140, Engineering = 135, Business = 125. Expected: 180, 150, 150, 120.
Result: χ² = 4.60, df = 3, p = 0.20.
Interpretation: Enrolments do not significantly deviate from the target distribution (p = 0.20).