Help: Sample Size for Estimating a Single Mean

A guide to calculating the minimum sample size needed to estimate a population mean with a desired level of precision.

Overview

When planning a study that aims to estimate the average value of a quantity in a population (e.g., mean blood pressure, average income, mean reaction time), you need to determine how many observations are required so that your estimate is close enough to the true population mean.

This calculator computes the minimum sample size needed to construct a confidence interval for a single mean with a specified level of precision (margin of error). The calculation depends on three key inputs: the desired confidence level, the expected variability in the population (standard deviation), and the acceptable precision.

A larger sample size is required when you want higher confidence, when the population is more variable, or when you need a narrower margin of error.

Worked Example

Scenario: Estimating Average Systolic Blood Pressure

A researcher wants to estimate the mean systolic blood pressure (SBP) of adults aged 40–60 in a community. Previous studies suggest the standard deviation of SBP in this age group is approximately σ = 18 mmHg. The researcher wants the estimate to be within ±3 mmHg of the true mean with 95% confidence.

Using Statulator step-by-step:

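1. Open the Sample Size Calculator for Estimating a Single Mean.

2. Set Confidence Level to 95% (the default).

3. Enter the Standard Deviation (σ) as 18.

4. Enter the Precision (margin of error) as 3.

5. The calculator instantly shows the required sample size. With Adjust for t-distribution deselected, the result is n = 139, which matches the hand calculation; with the default settings the reported value is slightly larger, for the reason explained after the hand calculation.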

Hand calculation verification:

Using the formula with $ z_{0.025} = 1.96 $, $ \sigma = 18 $, and $ d = 3 $:

\[ n = \frac{z_{\alpha/2}^{2} \cdot \sigma^{2}}{d^{2}} = \frac{(1.96)^{2} \times (18)^{2}}{(3)^{2}} = \frac{3.8416 \times 324}{9} = \frac{1244.68}{9} \approx 138.3 \]

Rounding up: n = 139

Why does Statulator report a slightly larger sample size? By default, Statulator uses the t-distribution rather than the z-distribution because the population standard deviation is rarely known exactly. The t-distribution is more accurate for this situation but yields a slightly larger sample size than the manual z-based calculation shown above. To reproduce the manual z-based number, open the Adjust panel and deselect Adjust for t-distribution. Keeping the t-distribution adjustment on (the default) is recommended.

Optional Adjustments

Finite population correction: If the target population is small (e.g., N = 500), click Adjustments and enter the population size. The corrected sample size will be smaller. For our example with N = 500:

\[ n_{\text{adj}} = \frac{n}{1 + \frac{n - 1}{N}} = \frac{139}{1 + \frac{138}{500}} = \frac{139}{1.276} \approx 109 \]

Clustering: If the study uses cluster sampling (e.g., households), you can adjust using an intra-cluster correlation coefficient (ICC) or a design effect (DEFF).

Response rate: If you expect not all sampled subjects to respond (e.g., 60% response rate in a survey), open Adjust and tick Adjust for Response Rate. Statulator will inflate the required sample size to compensate. For our example with an anticipated response rate of 0.60:

\[ n_{\text{adj}} = \frac{n}{r} = \frac{139}{0.60} \approx 232 \]

Note: with the default t-distribution adjustment also enabled, Statulator starts from the slightly larger t-based sample size (≈ 142) rather than the z-based 139 used in the formula above, so the calculator will report 236. Deselect Adjust for t-distribution in the Adjust panel to reproduce the 232 in the formula exactly.

Interpretation Guide

The calculator produces several outputs that help you understand your sample size requirement:

Output	Interpretation
Required Sample Size (n)	The minimum number of observations you need to collect. This is always rounded up to the next whole number because you cannot collect a fraction of an observation.
Live Interpretation	A plain-language sentence that summarises the result. For example: “A sample of 139 subjects is needed to estimate the mean within ±3 units with 95% confidence, assuming a standard deviation of 18.”
Visualisation tab	Shows how sample size changes as the standard deviation varies, for three different precision levels. Use this to understand sensitivity: if your σ estimate is uncertain, you can see the range of sample sizes you might need.
Tabulate tab	Generates a table of sample sizes across a grid of standard deviation and precision values, useful for protocol development and grant applications.

Practical tip: The standard deviation (σ) is often the most uncertain input. If prior data are limited, consider using the upper end of the plausible σ values so that the study still reaches the target precision if variability turns out to be higher than expected. You can also use the Visualisation tab to assess how sensitive the sample size is to changes in σ.
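The sensitivity check described in the tip can also be sketched outside the calculator. The Python snippet below is a minimal illustration, assuming the z-based formula with no adjustments. The function names `sample_size_mean` and `tabulate` and the grid values are hypothetical choices for this example, not part of Statulator.

```python
import math
from statistics import NormalDist


def sample_size_mean(sd, precision, confidence=0.95):
    """z-based sample size for a single mean, rounded up (no adjustments)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil((z * sd / precision) ** 2)


def tabulate(sds, precisions, confidence=0.95):
    """Return {sd: {precision: n}} over a grid of plausible inputs."""
    return {
        sd: {d: sample_size_mean(sd, d, confidence) for d in precisions}
        for sd in sds
    }


# Illustrative grid around the worked example (sigma = 18, d = 3).
sds = [15, 18, 21]
precisions = [2, 3, 4]
table = tabulate(sds, precisions)

print("SD".ljust(6) + "".join(f"d={d}".rjust(8) for d in precisions))
for sd in sds:
    print(str(sd).ljust(6) + "".join(str(table[sd][d]).rjust(8) for d in precisions))
```

For σ = 18 and d = 3 the grid reproduces the 139 from the worked example; reading along a column shows how quickly the requirement grows as σ increases.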

Formula

Base Formula

\[ n = \frac{z_{\alpha/2}^{2} \cdot \sigma^{2}}{d^{2}} \]

where:

$ n $ = required sample size
$ z_{\alpha/2} $ = critical value of the standard normal distribution for a two-sided confidence level of $ (1 - \alpha) $. For 95% confidence, $ z_{0.025} = 1.96 $.
$ \sigma $ = expected population standard deviation
$ d $ = desired precision (half-width of the confidence interval, i.e., margin of error)

The sample size is always rounded up to the next whole number ($ \lceil n \rceil $).
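The base formula can be sketched in a few lines of Python using only the standard library. The function name `sample_size_mean` is an illustrative assumption rather than Statulator's own code, and the sketch uses the exact normal quantile instead of the rounded value 1.96.

```python
import math
from statistics import NormalDist


def sample_size_mean(sd, precision, confidence=0.95):
    """Base (z-based) sample size for estimating a single mean."""
    if sd <= 0 or precision <= 0:
        raise ValueError("sd and precision must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    n_raw = (z * sd / precision) ** 2        # z^2 * sigma^2 / d^2
    return math.ceil(n_raw)                  # always round up


# Worked example inputs (sigma = 18, d = 3) at three confidence levels.
for level in (0.90, 0.95, 0.99):
    print(level, sample_size_mean(sd=18, precision=3, confidence=level))
```

With σ = 18 and d = 3 this gives 98, 139 and 239 for 90%, 95% and 99% confidence, which shows how the requirement grows with the confidence level.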

t-Distribution Adjustment

The base formula uses the normal distribution. When sample sizes are small, the t-distribution provides more accurate critical values but depends on degrees of freedom $ (\nu = n - 1) $, which in turn depends on n. Statulator uses an iterative approach:

\[ n_{t} = \frac{t_{\alpha/2,\, n-1}^{2} \cdot \sigma^{2}}{d^{2}} \]

Starting from the normal-based $ n $, the calculator substitutes $ t_{\alpha/2,\, n-1} $ and recomputes until the sample size converges.
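The iteration can be sketched as follows. This is a rough illustration under stated assumptions, not Statulator's implementation: the helper `t_critical` uses a two-term series approximation to the t quantile (adequate for moderate to large degrees of freedom), and the stopping rule is a simple fixed-point check. Statulator's exact procedure is not described here, so results may differ slightly from the calculator.

```python
import math
from statistics import NormalDist


def t_critical(confidence, df):
    """Approximate two-sided t critical value via a series expansion in 1/df.

    Assumption: good for moderate to large df; for small df use an exact
    quantile function instead (for example scipy.stats.t.ppf).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def sample_size_mean_t(sd, precision, confidence=0.95, max_iter=50):
    """Iterate n = ceil(t_{alpha/2, n-1}^2 * sd^2 / d^2) until n stops changing."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = max(2, math.ceil((z * sd / precision) ** 2))  # normal-based start
    prev = n
    for _ in range(max_iter):
        t = t_critical(confidence, n - 1)
        n_next = max(2, math.ceil((t * sd / precision) ** 2))
        if n_next == n:
            return n
        prev, n = n, n_next
    return max(prev, n)  # no convergence: return the more conservative value


n_t = sample_size_mean_t(sd=18, precision=3)
print(n_t)
```

The t-based result is never smaller than the z-based one, because the t critical value exceeds the normal critical value for every finite number of degrees of freedom.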

Finite Population Correction (FPC)

When the population size $ N $ is known and finite, the required sample size is reduced:

\[ n_{\text{adj}} = \frac{n}{1 + \dfrac{n - 1}{N}} \]

where $ N $ is the total population size. This correction is meaningful when $ n $ is more than about 5% of $ N $.
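A minimal Python sketch of the correction is shown here. The function name `apply_fpc` is hypothetical, and rounding the corrected value up follows the rounding convention stated for the base formula.

```python
import math


def apply_fpc(n, population_size):
    """Finite population correction: n / (1 + (n - 1) / N), rounded up."""
    if n < 1 or population_size < 1:
        raise ValueError("n and population_size must be at least 1")
    return math.ceil(n / (1 + (n - 1) / population_size))


print(apply_fpc(139, 500))      # worked example: 109
print(apply_fpc(139, 100_000))  # large population: correction is negligible
```

When N is very large relative to n, the denominator is close to 1 and the corrected sample size equals the uncorrected one.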

Cluster Sampling Adjustment

When observations are sampled in clusters (e.g., schools, clinics), the effective sample size is reduced. The adjustment uses either a design effect (DEFF) or the intra-cluster correlation coefficient (ICC):

Using DEFF:

\[ n_{\text{cluster}} = n \times \text{DEFF} \]

Using ICC ($ \rho $) and average cluster size ($ m $):

\[ \text{DEFF} = 1 + (m - 1) \cdot \rho \]

\[ n_{\text{cluster}} = n \times [1 + (m - 1) \cdot \rho] \]
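Both routes to the cluster adjustment can be sketched in Python. The function names and the example values (average cluster size 20, ICC 0.05) are illustrative assumptions, not figures from Statulator.

```python
import math


def design_effect(avg_cluster_size, icc):
    """DEFF = 1 + (m - 1) * rho."""
    if avg_cluster_size < 1:
        raise ValueError("avg_cluster_size must be at least 1")
    if not 0 <= icc <= 1:
        raise ValueError("icc must be between 0 and 1")
    return 1 + (avg_cluster_size - 1) * icc


def apply_design_effect(n, deff):
    """Inflate n by the design effect and round up."""
    # round() guards against floating-point noise such as 110.00000000000001
    return math.ceil(round(n * deff, 9))


deff = design_effect(avg_cluster_size=20, icc=0.05)  # hypothetical inputs
print(deff)
print(apply_design_effect(139, deff))
```

With an ICC of 0 the design effect is 1 and the sample size is unchanged, which matches simple random sampling.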

Response Rate Adjustment

When non-response is expected, the required sample size is inflated by dividing by the anticipated response rate $ r $ (a value between 0 and 1):

\[ n_{\text{adj}} = \frac{n}{r} \]

For example, if 139 respondents are needed and the anticipated response rate is 0.60, the study should approach $ \lceil 139 / 0.60 \rceil = 232 $ subjects.
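The same adjustment as a short Python sketch; the function name `inflate_for_response_rate` is a hypothetical choice for this illustration.

```python
import math


def inflate_for_response_rate(n, response_rate):
    """Number of subjects to approach so that about n respond: ceil(n / r)."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(n / response_rate, 9))


print(inflate_for_response_rate(139, 0.60))  # 232
```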

Assumptions & Requirements

Simple random sampling: The formula assumes that observations are drawn independently from the target population. If cluster sampling is used, apply the clustering adjustment.
Known or estimated σ: The population standard deviation must be specified. This is typically estimated from pilot studies, previous research, or expert opinion. If σ is unknown, a range of values should be explored using the Visualisation or Tabulate tabs.
Normal distribution assumption: The formula is derived from normal-theory confidence intervals. For large samples (a common rule of thumb is $ n \geq 30 $), the Central Limit Theorem makes the sample mean approximately normal for most underlying distributions with finite variance, although strongly skewed or heavy-tailed data may need larger samples. For smaller samples, the data should be approximately normally distributed.
Continuous outcome: The variable of interest should be measured on a continuous (interval or ratio) scale.
Absolute precision: The precision $ d $ is specified in the same units as the variable being measured. It represents the maximum acceptable difference between the sample mean and the true population mean.

Textbook Examples

Medicine

A nutritionist estimates the mean daily sodium intake (mg) of adults in a coastal city.

Inputs: Expected SD = 480 mg, confidence level = 95%, desired precision (margin of error) = 50 mg.
Result: n = 355 (z-based formula: (1.96 × 480 / 50)² ≈ 354.04, rounded up).
Interpretation: Measuring sodium intake in 355 adults will estimate the population mean within ±50 mg with 95% confidence.

Engineering

A civil engineer estimates the mean compressive strength (MPa) of concrete cylinders from a batch.

Inputs: Expected SD = 3.5 MPa, confidence level = 99%, desired precision = 1 MPa, population = 500.
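Result: n = 71 (z-based formula gives (2.576 × 3.5 / 1)² ≈ 81.3, rounded up to 82; finite-population correction gives 82 / (1 + 81/500) ≈ 70.6, rounded up).
Interpretation: Testing 71 cylinders from the batch of 500 will estimate mean strength within ±1 MPa with 99% confidence.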

Education

A school board estimates the mean reading score of Year 4 students across the district.

Inputs: Expected SD = 12 points, confidence level = 95%, desired precision = 2 points, design effect = 1.8 (schools as clusters).
Result: n = 251 (z-based n = 139; after cluster correction 139 × 1.8 = 250.2, rounded up).
Interpretation: Sampling 251 students (accounting for clustering within schools) yields a mean estimate within ±2 points with 95% confidence.

Agriculture

An agronomist estimates the mean grain yield (kg/ha) across a region's wheat farms.

Inputs: Expected SD = 350 kg/ha, confidence level = 95%, desired precision = 40 kg/ha.
Result: n = 295.
Interpretation: Measuring yield on 295 farms will estimate the regional mean within ±40 kg/ha.

Social Science

A labour economist estimates the mean hourly wage of gig-economy workers in a metropolitan area.

Inputs: Expected SD = $8.00, confidence level = 95%, desired precision = $1.00.
Result: n = 246.
Interpretation: Surveying 246 workers will estimate the mean hourly wage within ±$1.00 with 95% confidence.

References

Cochran, W. G. (1977). Sampling Techniques (3rd ed.). John Wiley & Sons. Chapter 4: estimation of sample size for means.
Daniel, W. W., & Cross, C. L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences (10th ed.). John Wiley & Sons. Section 7.4: determining sample size.
Lwanga, S. K., & Lemeshow, S. (1991). Sample Size Determination in Health Studies: A Practical Manual. World Health Organization.
Naing, L., Winn, T., & Rusli, B. N. (2006). Practical issues in calculating the sample size for prevalence studies. Archives of Orofacial Sciences, 1, 9–14.
Kish, L. (1965). Survey Sampling. John Wiley & Sons. Design effect and cluster sampling correction.
