Help: Sample Size for Comparing Paired Proportions

Overview

Paired proportions arise when the same subjects are classified on a binary outcome under two conditions. Common scenarios include comparing diagnostic tests on the same patients, assessing a binary outcome before and after an intervention, or comparing how often two raters give a positive rating to the same subjects.

The analysis focuses on discordant pairs: subjects whose classification differs between the two conditions. In a 2×2 table of paired outcomes, the discordant proportions are denoted b (positive→negative) and c (negative→positive). The sample size calculation is based on McNemar’s test, which compares b with c.

Statulator offers two input methods: the marginal approach (specify the two marginal proportions and the correlation between them) or the discordant approach (specify the discordant proportions directly).

The 2×2 Paired Table

	Time 2: +	Time 2: −	Total
Time 1: +	a	b	p₀
Time 1: −	c	d	1 − p₀
Total	p₁	1 − p₁	1

b and c are the discordant proportions; a and d are concordant. Only the discordant pairs carry information about the difference between conditions.
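
If raw paired data are available (for example from a pilot study), the four cell proportions can be tabulated directly. The following is a minimal sketch, not part of Statulator; the function name `paired_table` and the ten-subject data set are hypothetical and serve only to show how a, b, c and d map onto the table above.

```python
from collections import Counter

def paired_table(time1, time2):
    """Return cell proportions (a, b, c, d) for two equal-length 0/1 sequences."""
    if len(time1) != len(time2) or not time1:
        raise ValueError("need two non-empty sequences of equal length")
    counts = Counter(zip(time1, time2))
    n = len(time1)
    a = counts[(1, 1)] / n  # positive at both times (concordant)
    b = counts[(1, 0)] / n  # positive -> negative (discordant)
    c = counts[(0, 1)] / n  # negative -> positive (discordant)
    d = counts[(0, 0)] / n  # negative at both times (concordant)
    return a, b, c, d

# Hypothetical data: 10 subjects, 1 = positive, 0 = negative.
t1 = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
t2 = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
a, b, c, d = paired_table(t1, t2)
print(f"a={a:.2f} b={b:.2f} c={c:.2f} d={d:.2f}")  # a=0.40 b=0.10 c=0.30 d=0.20
print(f"p0={a + b:.2f} p1={a + c:.2f}")            # p0=0.50 p1=0.70
```

The marginals follow from the cells as p₀ = a + b and p₁ = a + c, so p₁ − p₀ = c − b.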

Worked Example

Scenario: Comparing Two Diagnostic Tests

A radiologist wants to compare the sensitivity of two imaging techniques (CT vs. MRI) applied to the same patients for detecting liver lesions. Based on published data, CT detects lesions in p₀ = 0.75 of cases and MRI in p₁ = 0.85. The correlation between paired results is estimated at ρ = 0.60. The study requires 80% power at α = 0.05 (two-sided).

Using Statulator step-by-step (Marginal Method):

1. Open the Sample Size Calculator for Comparing Paired Proportions.

2. Select the Marginal input method.

3. Set α to 0.05 and Power to 0.80.

4. Enter Proportion at Time 1 (p₀) as 0.75 and Proportion at Time 2 (p₁) as 0.85.

5. Enter the Correlation (ρ) as 0.60.

6. The calculator computes the discordant proportions and displays the required number of pairs.

Deriving discordant proportions:

From the marginal inputs:

\[ q_0 = 1 - p_0 = 0.25, \quad q_1 = 1 - p_1 = 0.15 \]
\[ b = p_0 q_1 - \rho \sqrt{p_0 q_0 p_1 q_1} = 0.75 \times 0.15 - 0.60 \times \sqrt{0.75 \times 0.25 \times 0.85 \times 0.15} \]
\[ = 0.1125 - 0.60 \times \sqrt{0.0239} = 0.1125 - 0.60 \times 0.1546 = 0.1125 - 0.0928 = 0.0197 \]
\[ c = b + (p_1 - p_0) = 0.0197 + 0.10 = 0.1197 \]

Sample size calculation:

\[ p_{\text{sum}} = b + c = 0.0197 + 0.1197 = 0.1394 \]
\[ p_{\text{diff}} = c - b = 0.1197 - 0.0197 = 0.10 \]
\[ n = \left(\frac{z_{\alpha/2}\sqrt{p_{\text{sum}}} + z_{\beta}\sqrt{p_{\text{sum}} - p_{\text{diff}}^2}}{p_{\text{diff}}}\right)^{2} \]
\[ = \left(\frac{1.96 \times \sqrt{0.1394} + 0.842 \times \sqrt{0.1394 - 0.01}}{0.10}\right)^{2} = \left(\frac{1.96 \times 0.3734 + 0.842 \times 0.3598}{0.10}\right)^{2} \]
\[ = \left(\frac{0.7319 + 0.3029}{0.10}\right)^{2} = (10.348)^{2} \approx 107.1 \]

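Rounding up: n = 108 pairs. This is the uncorrected value; adding the continuity correction 1/|c − b| = 1/0.10 = 10 gives 107.1 + 10 = 117.1, which rounds up to 118 pairs.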
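
The hand calculation above can be reproduced in a few lines. This is a minimal sketch using only the Python standard library, not Statulator's own code; the function names are illustrative, and the result is the uncorrected n.

```python
from math import ceil, sqrt
from statistics import NormalDist

def discordant_from_marginals(p0, p1, rho):
    """Derive discordant proportions b and c from marginals and correlation."""
    b = p0 * (1 - p1) - rho * sqrt(p0 * (1 - p0) * p1 * (1 - p1))
    c = b + (p1 - p0)
    return b, c

def pairs_needed(b, c, alpha=0.05, power=0.80):
    """Uncorrected number of pairs for a two-sided test, before rounding."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_sum, p_diff = b + c, abs(c - b)
    return ((z_alpha * sqrt(p_sum) + z_beta * sqrt(p_sum - p_diff**2)) / p_diff) ** 2

b, c = discordant_from_marginals(p0=0.75, p1=0.85, rho=0.60)
n = pairs_needed(b, c)
print(round(b, 4), round(c, 4), round(n, 1), ceil(n))  # 0.0197 0.1197 107.1 108
```

Using unrounded z-values and intermediate results changes n only in the second decimal place here, so the rounded-up answer is the same.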

Discordant Method

If you already know the discordant proportions (e.g., from a pilot study where b = 0.05 and c = 0.15), select the Discordant input method and enter these values directly. This bypasses the need for marginal proportions and correlation.

Interpretation Guide

Output	Interpretation
Number of Pairs	The minimum number of subjects, each assessed under both conditions. Every subject provides one paired observation.
Discordant Proportions (b, c)	b = proportion who are positive at Time 1 but negative at Time 2; c = proportion who are negative at Time 1 but positive at Time 2. The difference p₁ − p₀ = c − b.
Continuity Correction	Adds 1/|c − b| to the uncorrected sample size. Recommended when discordant proportions are small.

Practical tip: The correlation (ρ) between paired measurements has a large impact on the discordant proportions and hence on sample size. For fixed marginal proportions, a higher correlation means fewer discordant pairs and a smaller variance of the paired difference, so fewer pairs are needed; a lower correlation increases n. Overestimating ρ therefore leaves the study underpowered, so verify your correlation estimate carefully from prior data.
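
To gauge how much ρ matters for a given pair of marginal proportions, n can be recomputed over a range of plausible correlations. The sketch below does this for the worked example's marginals (p₀ = 0.75, p₁ = 0.85); the helper name `pairs_from_marginals` and the grid of ρ values are illustrative assumptions, and the uncorrected formula is used.

```python
from math import ceil, sqrt
from statistics import NormalDist

def pairs_from_marginals(p0, p1, rho, alpha=0.05, power=0.80):
    """Uncorrected number of pairs from marginal proportions and correlation."""
    b = p0 * (1 - p1) - rho * sqrt(p0 * (1 - p0) * p1 * (1 - p1))
    c = b + (p1 - p0)
    if b < 0 or c < 0:
        raise ValueError("rho is not compatible with these marginals")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_sum, p_diff = b + c, abs(c - b)
    return ((z_alpha * sqrt(p_sum) + z_beta * sqrt(p_sum - p_diff**2)) / p_diff) ** 2

rhos = [0.0, 0.2, 0.4, 0.6, 0.7]
sizes = [pairs_from_marginals(0.75, 0.85, rho) for rho in rhos]
for rho, n in zip(rhos, sizes):
    print(f"rho={rho:.1f}  n={ceil(n)}")
```

Running this over the range of ρ supported by prior data shows how far n could move if the correlation estimate is off.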

Formula

McNemar’s Test Sample Size (Connor, 1987)

\[ n = \left(\frac{z_{\alpha/2}\sqrt{p_{\text{sum}}} + z_{\beta}\sqrt{p_{\text{sum}} - p_{\text{diff}}^{2}}}{p_{\text{diff}}}\right)^{2} \]

where:

\( p_{\text{sum}} = b + c \) (total discordant proportion)
\( p_{\text{diff}} = c - b \) (difference in discordant proportions; its sign does not affect n, because it enters the formula only through \( p_{\text{diff}}^{2} \))
\( b \) = proportion discordant positive→negative
\( c \) = proportion discordant negative→positive
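
The same formula can be solved for \( z_{\beta} \) to estimate the power achieved by a given number of pairs, which is useful when n is fixed by budget or recruitment. The sketch below is an algebraic rearrangement of the formula above, not a separate method from the cited references; the function name is illustrative, and b = 0.05, c = 0.15 are the pilot values mentioned under Discordant Method.

```python
from math import sqrt
from statistics import NormalDist

def achieved_power(n, b, c, alpha=0.05):
    """Approximate two-sided power for n pairs (uncorrected formula, solved for z_beta)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_sum, p_diff = b + c, abs(c - b)
    z_beta = (p_diff * sqrt(n) - z_alpha * sqrt(p_sum)) / sqrt(p_sum - p_diff**2)
    return NormalDist().cdf(z_beta)

for n in (100, 155, 200):
    print(n, round(achieved_power(n, b=0.05, c=0.15), 3))
```

With these inputs, 155 pairs is the smallest n that reaches 80% power under the uncorrected formula.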

Deriving b and c from Marginal Proportions

\[ b = p_0(1 - p_1) - \rho\sqrt{p_0(1-p_0) \cdot p_1(1-p_1)} \]
\[ c = b + (p_1 - p_0) \]

where \( p_0 \) and \( p_1 \) are the marginal proportions at Time 1 and Time 2, and \( \rho \) is the correlation between paired binary outcomes.
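
Not every value of ρ is compatible with a given pair of marginals: if ρ is too large, b becomes negative, and if it is too negative, d does. A quick feasibility check is to derive all four cells and confirm that none is below zero. The sketch below assumes the table layout shown earlier (a + b = p₀, a + c = p₁); the function names are illustrative.

```python
from math import sqrt

def cells_from_marginals(p0, p1, rho):
    """Return (a, b, c, d) implied by marginals p0, p1 and correlation rho."""
    b = p0 * (1 - p1) - rho * sqrt(p0 * (1 - p0) * p1 * (1 - p1))
    c = b + (p1 - p0)
    a = p0 - b          # row total: a + b = p0
    d = 1 - a - b - c   # all four cells sum to 1
    return a, b, c, d

def is_feasible(p0, p1, rho):
    """True when every cell proportion is non-negative."""
    return all(cell >= 0 for cell in cells_from_marginals(p0, p1, rho))

for rho in (-0.5, 0.6, 0.8):
    print(rho, is_feasible(0.75, 0.85, rho))
```

For p₀ = 0.75 and p₁ = 0.85, ρ = 0.60 is feasible, while ρ = 0.80 (negative b) and ρ = −0.5 (negative d) are not.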

Continuity Correction

\[ n_c = n + \frac{1}{|c - b|} \]
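
As a check, the corrected formula can be applied to the four sets of inputs listed under Textbook Examples. This is a minimal standard-library sketch, not Statulator's implementation; the function name is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def pairs_corrected(b, c, alpha=0.05, power=0.80):
    """Pairs needed with the continuity correction n_c = n + 1/|c - b|."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_sum, p_diff = b + c, abs(c - b)
    n = ((z_alpha * sqrt(p_sum) + z_beta * sqrt(p_sum - p_diff**2)) / p_diff) ** 2
    return n + 1 / p_diff

# (b, c, power) taken from the Textbook Examples section.
examples = [
    (0.15, 0.05, 0.80),
    (0.20, 0.08, 0.90),
    (0.12, 0.06, 0.80),
    (0.25, 0.10, 0.80),
]
results = [ceil(pairs_corrected(b, c, power=pw)) for b, c, pw in examples]
print(results)  # [165, 209, 407, 127]
```

The correction grows as |c − b| shrinks: it adds about 7 pairs when the difference is 0.15 and about 17 when it is 0.06.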

Cluster Sampling Adjustment

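\[ n_{\text{cluster}} = n \times [1 + (m - 1) \cdot \rho_{\text{ICC}}] \]

where \( m \) is the average number of subjects per cluster and \( \rho_{\text{ICC}} \) is the intraclass correlation coefficient. The bracketed term is the design effect.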
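
A short sketch of the adjustment, reading m as the average cluster size and ρ_ICC as the intraclass correlation coefficient (the usual design-effect interpretation). The values m = 10 and ICC = 0.05 are hypothetical, and the function name is illustrative.

```python
from math import ceil

def cluster_adjusted_pairs(n, m, icc):
    """Inflate n by the design effect 1 + (m - 1) * icc."""
    if m < 1:
        raise ValueError("cluster size m must be at least 1")
    design_effect = 1 + (m - 1) * icc
    return n * design_effect

# 108 pairs from the worked example, hypothetical clusters of 10 with ICC = 0.05.
print(ceil(cluster_adjusted_pairs(108, m=10, icc=0.05)))  # 157
```

With m = 1 or ICC = 0 the design effect is 1 and n is unchanged.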

Assumptions & Requirements

Paired binary data: Each subject is measured twice under two conditions, yielding a 2×2 table of paired outcomes.
Known discordant proportions: Either specify b and c directly, or provide marginal proportions with correlation to derive them. The accuracy of the sample size depends heavily on these estimates.
Normal approximation: The formula uses a normal approximation to the distribution of discordant pairs. This is adequate when the total number of discordant pairs (\( n \times (b + c) \)) is at least 20–30.
Distinct discordant proportions: The formula requires \( b \ne c \); if b = c, there is no difference to detect and the sample size is infinite.
No order effects: The result under one condition should not depend on whether it was administered first or second.
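
The normal-approximation condition can be checked once n is known by computing the expected number of discordant pairs, n × (b + c). The sketch below uses 20, the lower end of the range given above, as an assumed default threshold; the function names are illustrative.

```python
def expected_discordant_pairs(n, b, c):
    """Expected number of discordant pairs among n pairs."""
    return n * (b + c)

def approximation_ok(n, b, c, minimum=20):
    """True when the expected discordant count reaches the chosen minimum."""
    return expected_discordant_pairs(n, b, c) >= minimum

# First textbook example: n = 165, b = 0.15, c = 0.05.
print(round(expected_discordant_pairs(165, 0.15, 0.05), 1), approximation_ok(165, 0.15, 0.05))
# Worked example: n = 108, b = 0.0197, c = 0.1197.
print(round(expected_discordant_pairs(108, 0.0197, 0.1197), 1), approximation_ok(108, 0.0197, 0.1197))
```

The first case expects about 33 discordant pairs; the worked example expects only about 15, so an exact or corrected method may be worth considering there.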

Textbook Examples

Medicine

A crossover study compares two diagnostic tests for detecting a cardiac biomarker on the same patients.

Inputs: Discordant proportions b = 0.15, c = 0.05, α = 0.05 (two-sided), power = 80%, continuity correction applied (default).
Result: n = 165 pairs.
Interpretation: Testing 165 patients with both diagnostic methods provides 80% power to detect the difference in sensitivity.

Education

A before-after study tests whether a workshop changes teachers' attitudes toward inclusive education.

Inputs: Discordant proportions b = 0.20, c = 0.08, α = 0.05 (two-sided), power = 90%, continuity correction applied (default).
Result: n = 209 pairs.
Interpretation: Surveying 209 teachers before and after the workshop gives 90% power to detect the shift in attitudes.

Social Science

A panel survey measures whether a media campaign changes public opinion on climate policy (same respondents, two time points).

Inputs: Discordant proportions b = 0.12, c = 0.06, α = 0.05 (two-sided), power = 80%, continuity correction applied (default).
Result: n = 407 pairs.
Interpretation: Re-interviewing 407 respondents gives 80% power to detect the net shift of 6 percentage points in opinion.

Medicine

A dermatology trial applies two topical creams to matched lesion sites on the same patients to compare healing rates.

Inputs: Discordant proportions b = 0.25, c = 0.10, α = 0.05 (two-sided), power = 80%, continuity correction applied (default).
Result: n = 127 pairs.
Interpretation: Enrolling 127 patients (each serving as their own control) provides 80% power for the paired comparison.

References

Connor, R. J. (1987). Sample size for testing differences in proportions for the paired-sample design. Biometrics, 43(1), 207–211.
Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical Methods for Rates and Proportions (3rd ed.). John Wiley & Sons. Chapter 9: McNemar’s test.
Lachin, J. M. (1992). Power and sample size evaluation for the McNemar test with application to matched case-control studies. Statistics in Medicine, 11(9), 1239–1251.
Machin, D., Campbell, M. J., Tan, S. B., & Tan, S. H. (2009). Sample Size Tables for Clinical Studies (3rd ed.). Wiley-Blackwell.
Chow, S.-C., Shao, J., Wang, H., & Lokhnygina, Y. (2018). Sample Size Calculations in Clinical Research (3rd ed.). Chapman & Hall/CRC.