Once we have taken a sample from our population, we obtain a point estimate (Chapter 10) of the parameter of interest, and calculate its standard error to indicate the precision of the estimate. However, to most people the standard error is not, by itself, particularly useful. It is more helpful to incorporate this measure of precision into an interval estimate for the population parameter. We do this by making use of our knowledge of the theoretical probability distribution of the sample statistic to calculate a confidence interval for the parameter. Generally the confidence interval extends either side of the estimate by some multiple of the standard error; the two values (the confidence limits) which define the interval are generally separated by a comma, a dash or the word ‘to’ and are contained in brackets.
Confidence Interval for the Mean
Using the Normal Distribution
In Chapter 10 we stated that the sample mean follows a Normal distribution if the sample size is large. Therefore we can make use of the properties of the Normal distribution when considering the sample mean. In particular, 95% of the distribution of sample means lies within 1.96 standard deviations (SD) of the population mean. We call this SD the standard error of the mean (SEM), and when we have a single sample, the 95% confidence interval (CI) for the mean is:
(Sample mean − (1.96 × SEM) to Sample mean + (1.96 × SEM))
If we were to repeat the experiment many times, the range of values determined in this way would contain the true population mean on 95% of occasions. This range is known as the 95% confidence interval for the mean. We usually interpret this confidence interval as the range of values within which we are 95% confident that the true population mean lies. Although not strictly correct (the population mean is a fixed value and therefore cannot have a probability attached to it), we will interpret the confidence interval in this way as it is conceptually easier to understand.
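As a minimal sketch, the Normal-based interval can be computed in Python. The sample values and the assumed known population standard deviation are hypothetical, chosen only to illustrate the calculation:

```python
import math
from statistics import NormalDist

# Hypothetical sample of systolic blood pressures (mmHg); illustrative values only
sample = [118, 125, 131, 122, 140, 128, 135, 121, 127, 133]
n = len(sample)
mean = sum(sample) / n

# Suppose the population standard deviation is known (an assumption made here
# purely for illustration -- see the t-distribution section when it is estimated)
sigma = 7.0
sem = sigma / math.sqrt(n)          # standard error of the mean

z = NormalDist().inv_cdf(0.975)     # 1.96, the two-tailed 5% point of the Normal
lower = mean - z * sem
upper = mean + z * sem
print(f"95% CI: ({lower:.1f} to {upper:.1f})")
```
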
Using the t-Distribution
Strictly, we should only use the Normal distribution in the calculation if we know the value of the variance, σ², in the population. Furthermore, if the sample size is small, the sample mean only follows a Normal distribution if the underlying population data are Normally distributed. Where the data are not Normally distributed, and/or we do not know the population variance but estimate it by s², the sample mean follows a t-distribution (Chapter 8). We calculate the 95% confidence interval for the mean as
(Sample mean − (t0.05 × SEM) to Sample mean + (t0.05 × SEM))

i.e. it is

(Sample mean − (t0.05 × s/√n) to Sample mean + (t0.05 × s/√n))
where t0.05 is the percentage point (percentile) of the t-distribution with (n − 1) degrees of freedom which gives a two-tailed probability (Chapter 17) of 0.05 (Appendix A2). This generally provides a slightly wider confidence interval than that using the Normal distribution to allow for the extra uncertainty that we have introduced by estimating the population standard deviation and/or because of the small sample size. When the sample size is large, the difference between the two distributions is negligible. Therefore, we always use the t-distribution when calculating a confidence interval for the mean even if the sample size is large.
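The t-based interval can be sketched as follows. The data are hypothetical, and the critical value t0.05 = 2.262 for (n − 1) = 9 degrees of freedom is taken from standard tables (Appendix A2):

```python
import math
from statistics import mean, stdev

# Hypothetical sample (illustrative values only)
sample = [5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 5.2, 4.7, 5.4, 5.0]
n = len(sample)
xbar = mean(sample)
s = stdev(sample)           # sample SD, estimating the population SD
sem = s / math.sqrt(n)      # estimated standard error of the mean

# Two-tailed 5% point of the t-distribution with n - 1 = 9 degrees of freedom,
# from standard tables: t_0.05 = 2.262 (wider than the Normal's 1.96)
t_05 = 2.262
lower = xbar - t_05 * sem
upper = xbar + t_05 * sem
print(f"95% CI: ({lower:.2f} to {upper:.2f})")
```
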
By convention we usually quote 95% confidence intervals. We could calculate other confidence intervals, e.g. a 99% confidence interval for the mean. Instead of multiplying the standard error by the tabulated value of the t-distribution corresponding to a two-tailed probability of 0.05, we multiply it by that corresponding to a two-tailed probability of 0.01. The 99% confidence interval is wider than a 95% confidence interval, to reflect our increased confidence that the range includes the true population mean.
Confidence Interval for the Proportion
The sampling distribution of a proportion follows a Binomial distribution (Chapter 8). However, if the sample size, n, is reasonably large, then the sampling distribution of the proportion is approximately Normal with mean π. We estimate π by the proportion in the sample, p = r/n (where r is the number of individuals in the sample with the characteristic of interest), and its standard error is estimated by √(p(1 − p)/n) (Chapter 10).
The 95% confidence interval for the proportion is estimated by

(p − (1.96 × √(p(1 − p)/n)) to p + (1.96 × √(p(1 − p)/n)))
If the sample size is small (usually when np or n(1 − p) is less than 5) then we have to use the Binomial distribution to calculate exact confidence intervals. Note that if p is expressed as a percentage, we replace (1 − p) by (100 − p).
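The Normal-approximation interval for a proportion can be sketched as follows; r and n are hypothetical counts chosen so that np and n(1 − p) both exceed 5:

```python
import math

# Hypothetical data: r individuals with the characteristic out of n sampled
r, n = 30, 100
p = r / n                            # sample proportion
se = math.sqrt(p * (1 - p) / n)      # estimated standard error of p

# Normal approximation is reasonable here since np = 30 and n(1-p) = 70
lower = p - 1.96 * se
upper = p + 1.96 * se
print(f"95% CI for the proportion: ({lower:.3f} to {upper:.3f})")
```
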
Interpretation of Confidence Intervals
When interpreting a confidence interval we are interested in a number of issues.
- How wide is it? A wide interval indicates that the estimate is imprecise; a narrow one indicates a precise estimate. The width of the confidence interval depends on the size of the standard error, which in turn depends on the sample size and, when considering a numerical variable, the variability of the data. Therefore, small studies on variable data give wider confidence intervals than larger studies on less variable data.
- What clinical implications can be derived from it? The upper and lower limits provide a way of assessing whether the results are clinically important (see Example).
- Does it include any values of particular interest? We can check whether a hypothesized value for the population parameter falls within the confidence interval. If so, then our results are consistent with this hypothesized value. If not, then it is unlikely (for a 95% confidence interval, the chance is at most 5%) that the parameter has this value.
Degrees of Freedom
You will come across the term ‘degrees of freedom’ in statistics. In general they can be calculated as the sample size minus the number of constraints in a particular calculation; these constraints may be the parameters that have to be estimated. As a simple illustration, consider a set of three numbers which add up to a particular total (T). Two of the numbers are ‘free’ to take any value but the remaining number is fixed by the constraint imposed by T. Therefore the numbers have two degrees of freedom. Similarly, the degrees of freedom of the sample variance, s² (Chapter 6), are the sample size minus one, because we have to calculate the sample mean (x̄), an estimate of the population mean, in order to evaluate s².
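The three-number illustration above can be written out directly; T and the two free values are arbitrary choices:

```python
from statistics import mean, variance

# Three numbers constrained to a fixed total T: two are free, the third is not
T = 12
a, b = 3, 4              # chosen freely
c = T - (a + b)          # fixed by the constraint, so there are 2 degrees of freedom

# The sample variance divides by (n - 1): estimating the mean uses up one
# degree of freedom
data = [a, b, c]
n = len(data)
xbar = mean(data)
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
print(c, s2)             # statistics.variance(data) uses the same n - 1 divisor
```
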
Bootstrapping and Jackknifing
Bootstrapping is a computer-intensive simulation process which we can use to derive a confidence interval for a parameter if we do not want to make assumptions about the sampling distribution of its estimate (e.g. the Normal distribution for the sample mean). From the original sample, we create a large number of random samples (usually at least 1000), each of the same size as the original sample, by sampling with replacement, i.e. by allowing an individual who has been selected to be ‘replaced’ so that, potentially, this individual can be included more than once in a given sample. Every sample provides an estimate of the parameter, and we use the variability of the distribution of these estimates to obtain a confidence interval for the parameter, for example, by considering relevant percentiles (e.g. the 2.5th and 97.5th percentiles to provide a 95% confidence interval).
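The resampling procedure described above can be sketched in a few lines; the original sample is hypothetical and the seed is fixed only to make the resampling reproducible:

```python
import random
from statistics import mean

random.seed(42)  # reproducible resampling

# Hypothetical original sample (illustrative values only)
sample = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.8, 4.0, 5.2]
n = len(sample)

# Draw many resamples of size n WITH replacement; record the mean of each
boot_means = sorted(
    mean(random.choices(sample, k=n)) for _ in range(1000)
)

# 95% bootstrap CI from the 2.5th and 97.5th percentiles of the estimates
lower = boot_means[24]    # 25th smallest of 1000, approx. the 2.5th percentile
upper = boot_means[974]   # 975th smallest, approx. the 97.5th percentile
print(f"Bootstrap 95% CI for the mean: ({lower:.2f} to {upper:.2f})")
```
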
Jackknifing is a similar technique to bootstrapping. However, rather than creating random samples of the original sample, we remove one observation from the original sample of size n and then compute the estimated parameter on the remaining (n − 1) observations. This process is repeated, removing each observation in turn, giving us n estimates of the parameter. As with bootstrapping, we use the variability of the estimates to obtain the confidence interval.
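A jackknife sketch on the same idea follows; the data are hypothetical, and the usual jackknife variance formula (an assumption not spelled out in the text) is used to turn the n leave-one-out estimates into a standard error:

```python
import math
from statistics import mean, stdev

# Hypothetical original sample (illustrative values only)
sample = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.8, 4.0, 5.2]
n = len(sample)

# Remove each observation in turn, giving n leave-one-out estimates of the mean
jack_means = [mean(sample[:i] + sample[i + 1:]) for i in range(n)]

# Standard jackknife variance formula: ((n-1)/n) * sum of squared deviations
# of the leave-one-out estimates from their average
jbar = mean(jack_means)
se_jack = math.sqrt((n - 1) / n * sum((m - jbar) ** 2 for m in jack_means))
lower = mean(sample) - 1.96 * se_jack
upper = mean(sample) + 1.96 * se_jack
print(f"Jackknife 95% CI for the mean: ({lower:.2f} to {upper:.2f})")
```

For the sample mean the jackknife standard error reduces exactly to s/√n, which gives a quick check on the formula.
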
Bootstrapping and jackknifing may both be used when generating and validating prognostic scores (Chapter 46).