Variation
Variation in data may be caused by biological factors (e.g. sex, age) or measurement ‘errors’ (e.g. observer variation), or it may be unexplainable random variation (see also Chapter 39). We measure the impact of variation in the data on the estimation of a population parameter by using the standard error (Chapter 10). When the measurement of a variable is subject to considerable variation, estimates relating to that variable will be imprecise, with large standard errors. Clearly, it is desirable to reduce the impact of variation as far as possible, and thereby increase the precision of our estimates. There are various ways in which we can do this, as described in this chapter.
Replication
Our estimates are more precise if we take replicates (e.g. two or three measurements of a given variable for every individual on each occasion). However, as replicate measurements are not independent, we must take care when analysing these data. A simple approach is to use the mean of each set of replicates in the analysis in place of the original measurements. Alternatively, we can use methods that specifically deal with replicated measurements (see Chapters 41 and 42).
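The simple approach of analysing replicate means can be sketched in a few lines of Python. This is a minimal illustration only: the function name `replicate_means`, the individual labels and the readings are hypothetical and do not come from the text.

```python
from collections import defaultdict
from statistics import mean


def replicate_means(measurements):
    """Collapse replicate measurements to one mean per individual.

    `measurements` is an iterable of (individual_id, value) pairs;
    an individual may appear two or more times (the replicates).
    """
    grouped = defaultdict(list)
    for individual_id, value in measurements:
        grouped[individual_id].append(value)
    # One summary value per individual, so the values analysed
    # afterwards can be treated as independent observations.
    return {ind: mean(values) for ind, values in grouped.items()}


# Hypothetical data: three individuals with two or three replicates each.
readings = [
    ("A", 120.0), ("A", 124.0),
    ("B", 135.0), ("B", 131.0), ("B", 133.0),
    ("C", 110.0), ("C", 114.0),
]
per_individual = replicate_means(readings)
print(per_individual)
```

The subsequent analysis would then use the three values in `per_individual` rather than the seven original readings, which avoids treating replicates from the same individual as if they were independent.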
Sample Size
The choice of an appropriate size for a study is a crucial aspect of study design. With an increased sample size, the standard error of an estimate will be reduced, leading to increased precision and study power (Chapter 18). Sample size calculations (Chapter 36) should be carried out before starting the study.
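As an illustration of how sample size affects precision, the sketch below uses the standard error of a sample mean, which equals the standard deviation divided by the square root of the sample size. The choice of the mean as the estimate, the function name and the standard deviation of 10 are assumptions made for the example; other estimates have different standard error formulae.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / math.sqrt(n)


# Hypothetical standard deviation of the measurements.
sd = 10.0
for n in (25, 100, 400):
    print(f"n = {n:3d}  SE = {standard_error_of_mean(sd, n):.2f}")
```

Under these assumptions, each fourfold increase in sample size halves the standard error, so gains in precision become progressively more expensive to obtain.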
In any type of study, it is important that the sample size included in the final study analysis is as close as possible to the planned sample size to ensure that the study is sufficiently powered (Chapter 18). This means that response rates should be as high as possible in cross-sectional studies and surveys. In clinical trials and cohort studies, attempts should be made to minimize any loss-to-follow-up; this will also help attenuate any biases (Chapter 34) that may be introduced if non-responders or cohort drop-outs differ in any respect from responders or those remaining in the trial or cohort.
Particular Study Designs
Modifications of simple study designs can lead to more precise estimates. Essentially, we are comparing the effect of one or more ‘treatments’ on experimental units. The experimental unit (i.e. the unit of observation in an experiment; see Chapter 12) is the ‘individual’, or the smallest group of ‘individuals’, whose response of interest is not affected by that of any other unit; examples include an individual patient, a volume of blood or a skin patch. If experimental units are assigned randomly (i.e. by chance) to treatments (Chapter 14) and there are no other refinements to the design, we have a completely randomized design. Although this design is straightforward to analyse, it is inefficient if there is substantial variation between the experimental units. In this situation, we can incorporate blocking and/or use a cross-over design to reduce the impact of this variation.
Blocking (Stratification)
It is often possible to group experimental units that share similar characteristics into a homogeneous block or stratum (e.g. the blocks may represent different age groups). The variation between units in a block is less than that between units in different blocks. The individuals within each block are randomly assigned to treatments; we compare treatments within each block rather than making an overall comparison between the individuals in different blocks. We can therefore assess the effects of treatment more precisely than if there were no blocking.
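Random assignment within blocks might be sketched as follows. This is one possible scheme under stated assumptions, not a prescribed procedure: the function name `randomize_within_blocks`, the age-group labels, the patient identifiers and the decision to deal treatments out in turn after shuffling (so that each block is as balanced as possible) are all choices made for the illustration.

```python
import random
from collections import defaultdict


def randomize_within_blocks(units, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    `units` is a list of (unit_id, block_label) pairs.
    Returns a dict mapping unit_id to its assigned treatment.
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit_id, block_label in units:
        blocks[block_label].append(unit_id)

    allocation = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random order within this block only
        # Deal treatments out in turn so each block is balanced.
        for position, unit_id in enumerate(shuffled):
            allocation[unit_id] = treatments[position % len(treatments)]
    return allocation


# Hypothetical example: eight patients in two age-group blocks.
units = [(f"P{i}", "under 50" if i < 4 else "50 and over") for i in range(8)]
allocation = randomize_within_blocks(units, ["A", "B"], seed=1)
for unit_id, block_label in units:
    print(unit_id, block_label, allocation[unit_id])
```

Because each block receives both treatments in equal numbers, the treatment comparison can be made within blocks, where the units are more alike.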
Parallel and Cross-Over Designs (Fig. 13.1)
Generally, we make comparisons between individuals in different groups. For example, most clinical trials (Chapter 14) are parallel trials, in which each patient receives one of the two (or occasionally more) treatments that are being compared, i.e. they result in between-individual comparisons.