Aim of Controls
Where to Find Controls?
Controls From a Known Group
Random-Digit Dialling
Controls From an Unknown Group
How Many Control Groups?
How Many Controls Per Case?
What to Look for in Controls
Use of control (comparison) groups is a powerful research tool. In case-control studies, controls estimate the frequency of an exposure in the population under study. Controls can be taken from known or unknown study populations. A known group consists of a defined population observed over a period, such as passengers on a cruise ship. When the study group is known, a sample of the population can be used as controls. If no population roster exists, then techniques such as random-digit dialling can be used. Sometimes, however, the study group is unknown (e.g., motor-vehicle crash victims brought to an emergency department, who may come from far away). In this situation, hospital controls, neighbourhood controls, and friend, associate, or relative controls can be used. In general, one well-selected control group is better than two or more. When the number of cases is small, the ratio of controls to cases can be raised to improve the ability to find important differences. Although no ideal control group exists, readers need to think carefully about how representative the controls are. Poor choice of controls can lead to both wrong results and possible medical harm.
When asked ‘How’s your wife?’, comedian Henny Youngman would quip, ‘Compared to what?’ Although sexist by contemporary standards, this old vaudeville line frames the question relating to the results of case-control studies: compared to what? Valid conclusions hinge on finding an appropriate comparison group. Stated alternatively, use of suboptimal control groups has undermined much research.
Use of control groups is a powerful scientific tool, and an old one. The first documentation of a comparison group appears in The Holy Bible in the Book of Daniel. Daniel (Fig. 6.1) and his three colleagues, captured by King Nebuchadnezzar of Babylon, carried out a 10-day trial of healthy food versus the royal diet of the court. At the end, Daniel and his buddies appeared healthier than did the Babylonian youth who enjoyed the usual fare. This trial has been criticised over the years for an inadequate duration of exposure to note any change in appearance and, thus, probable divine confounding. The experiment took place around 600 BC and was finally published four centuries later. Delay in publishing is not a new problem: Daniel perished, then published. Perhaps as a result, control groups disappeared from published work for millennia.
James Lind’s (Fig. 6.2) 1747 trial of scurvy treatments rekindled interest in contemporaneous controls. Despite its small size (six treatment groups with two sailors assigned to each), the trial showed the benefit of citrus-fruit supplementation. In studies without randomisation, finding an appropriate control group can sometimes be challenging. We will explain the role of control groups in case-control studies, describe special difficulties in choosing them, and discuss some implications of these choices.
Aim of Controls
Controls in a case-control study (Chapter 5), which progresses backwards in time from outcome to exposure, indicate the background frequency of an exposure in individuals who are free of the disease in question. Controls do not need to be healthy; inclusion of sick people is sometimes appropriate. Indeed, exclusion of ill people as controls can distort the results. (Like healthy individuals, ill people can develop a different condition of interest.) The final point is key: controls in a case-control study should represent those at risk of becoming a case. Stated another way, controls should have the same risk of exposure as the cases, if the exposure and disease are unrelated (Panel 6.1).
Free of the outcome of interest
Representative of the population at risk of the outcome
Selected independent of the exposure of interest
If cases (with the disease) have a higher frequency of the exposure than do the controls, then a positive association emerges (e.g., multiple sexual partners are more common among cases of cervical cancer than among controls without cervical cancer). If the exposure prevalence among cases is lower than among controls, a protective association exists (e.g., oral contraceptive use is less common among ovarian cancer cases than among controls without this cancer).
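The comparison of exposure frequencies described above is usually summarised as an odds ratio. The following is a minimal sketch of that arithmetic; the function name and all counts are hypothetical, invented only to illustrate the direction of each association, and are not taken from any study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    case_odds = exposed_cases / unexposed_cases
    control_odds = exposed_controls / unexposed_controls
    return case_odds / control_odds


# Hypothetical counts (assumed for illustration only).
# Exposure more common among cases than controls: odds ratio above 1.
positive = odds_ratio(exposed_cases=60, unexposed_cases=40,
                      exposed_controls=30, unexposed_controls=70)

# Exposure less common among cases than controls: odds ratio below 1.
protective = odds_ratio(exposed_cases=20, unexposed_cases=80,
                        exposed_controls=40, unexposed_controls=60)

print(f"positive association:   OR = {positive:.2f}")
print(f"protective association: OR = {protective:.2f}")
```

An odds ratio above 1 corresponds to the positive association in the text, and one below 1 to the protective association. The sketch ignores confidence intervals and confounding, both of which a real analysis would need to address.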
Avoidance of bias is important when choosing controls for a case-control study. Selection bias arises if controls are not representative of those at risk of the disease in question. An early case-control study of cigarette smoking and lung cancer underestimated the effect of smoking. Controls in this hospital-based case-control study included 709 hospitalised patients without lung cancer. In that era, myocardial infarction patients routinely spent 3 weeks in hospital recuperating and would have been readily available controls. Because smoking also raises the risk of myocardial infarction, the controls chosen likely had a higher proportion of smokers than the general population, which would overestimate the background smoking rate and underestimate the association between smoking and lung cancer.
Case-control studies of potential protection against colorectal cancer associated with nonsteroidal antiinflammatory drugs (NSAIDs) provide another example (Fig. 6.3). Assume that colorectal cancer cases are identified at the time of their operations in hospital. Controls are hospital patients without colorectal cancer. If the researcher identified controls from the rheumatology service, this selection would bias the results: patients with arthritis would be more likely than most people in the community to be exposed to NSAIDs, thereby lowering the estimated odds ratio and exaggerating any apparent protection from these drugs. By contrast, if controls were selected from the gastroenterology service, this choice would bias the results in the opposite direction. Patients with ulcers would be less likely than the general population at risk of colorectal cancer to be exposed to NSAIDs, because of warnings from their clinicians. This bias would raise the estimated odds ratio, masking or even reversing any apparent protection.
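The direction of these biases can be checked with simple arithmetic. The sketch below holds the exposure prevalence among cases fixed and varies only the source of controls. Every prevalence is an assumed, invented figure chosen to mirror the NSAID example, not an estimate from any study.

```python
def odds(prevalence):
    """Convert an exposure prevalence (0 < p < 1) into odds."""
    return prevalence / (1 - prevalence)


def odds_ratio_from_prevalence(p_cases, p_controls):
    """Odds ratio implied by exposure prevalence in cases and in controls."""
    return odds(p_cases) / odds(p_controls)


# Assumed NSAID exposure prevalence among colorectal cancer cases.
P_CASES = 0.20

# Assumed NSAID exposure prevalence in three candidate control sources.
control_sources = {
    "population at risk": 0.30,        # the appropriate comparison
    "rheumatology service": 0.60,      # over-exposed to NSAIDs
    "gastroenterology service": 0.10,  # under-exposed to NSAIDs
}

estimates = {
    name: odds_ratio_from_prevalence(P_CASES, p_controls)
    for name, p_controls in control_sources.items()
}

for name, estimate in estimates.items():
    print(f"{name}: OR = {estimate:.2f}")
```

With these invented figures, the odds ratio is about 0.58 against the population at risk, falls to about 0.17 with rheumatology controls, and rises to about 2.25 with gastroenterology controls. The cases are identical in all three comparisons; only the choice of controls changes the answer.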
Research in endometriosis provides another example of challenges in selection of a control group. Because endometriosis needs an operation for diagnosis, investigators frequently use as controls women having laparoscopy or laparotomy without this diagnosis being made. However, women having operations are unlikely to be representative of all those at risk of developing endometriosis, because operations do not occur at random.
Where to Find Controls?
The investigator (and, ultimately, the reader) needs to determine the group of individuals from which cases and controls will be drawn. A known group consists of a defined population observed over a period (Fig. 6.4). This group might consist of passengers and crew on a week-long cruise of the Caribbean or all individuals living in Sweden over a decade. Cases are those who develop the disorder of interest, and controls are those in the same group without the condition. Thus case-control studies can be thought of as occurring in the midst of a larger cohort study (nested case-control studies being a nice example). The task here is to find the cases in the group in question; choosing controls is easier in a defined population.
Usually the group from which cases come is unknown. For example, victims of motor-vehicle crashes in a hospital emergency department pose this sort of challenge. Some might live nearby, others could be passing through on a highway, and others may arrive from rural areas by helicopter. Here, the cases are chosen before the study group is deduced. Finding cases is the simple part; the challenge now is to define the group from which controls should come. Controls should come from the same group as the cases. (One approach would be to limit both cases and controls to people who live within the city limits.)
Poor control groups can lead to big mistakes. The case-control study of AIDS in homosexual men in San Francisco described in Chapter 5 is illustrative. Use of sexually transmitted disease clinic controls grossly underestimated the true risk, because the likelihood of using a clinic was strongly related to the exposure of interest (i.e., it was not independent of the number of partners). Controls in a public clinic at which sexually transmitted diseases are treated were much more likely to have multiple partners than were other homosexual men in San Francisco. Neighbourhood controls were the better control group.
Controls From a Known Group
When possible, random samples of people without the disease can serve as controls; this approach helps avoid selection bias. Investigation of an outbreak of food-borne illness on a cruise ship generally uses a case-control approach. Cases are those who develop gastroenteritis; controls are those on board who do not. The study seeks to identify food exposures that are more common among the cases than the controls. Moreover, people who did not eat the suspect food should not have become ill. On the ship, probability sampling among those unaffected could be done. Thus controls could be a random sample of everyone on board without food poisoning.
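When a roster of the known group exists, drawing controls reduces to simple random sampling from those who are not cases. The sketch below is one way this might look; the passenger identifiers, the roster size, and the function name `sample_controls` are all hypothetical.

```python
import random


def sample_controls(roster, case_ids, n_controls, seed=None):
    """Draw a simple random sample of non-cases from a known roster."""
    rng = random.Random(seed)
    # Everyone on the roster who did not become a case is eligible.
    eligible = sorted(set(roster) - set(case_ids))
    if n_controls > len(eligible):
        raise ValueError("more controls requested than eligible non-cases")
    # Sampling without replacement gives each eligible person the same chance.
    return rng.sample(eligible, n_controls)


# Hypothetical ship manifest of 500 people and three gastroenteritis cases.
roster = [f"P{i:04d}" for i in range(1, 501)]
cases = {"P0007", "P0042", "P0311"}

# Two controls per case, as an arbitrary illustrative ratio.
controls = sample_controls(roster, cases, n_controls=6, seed=1)
print(controls)
```

Fixing the seed makes the draw reproducible, which can help when the sampling procedure has to be documented. Because every unaffected person has the same chance of selection, the choice of controls does not depend on what they ate.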
Population controls have both advantages and disadvantages. Random sampling should provide representative controls, and extrapolation of results to the study group is easily justified. On the other hand, population controls can be inappropriate when cases have not been completely identified in the population or when substantial numbers of potential controls cannot be reached (e.g., those on holiday). Moreover, population controls could be less motivated to take part in research than individuals in a healthcare setting, such as hospitalised patients.
When no roster of the population exists, random-digit dialling of telephone numbers has been used to sample potential controls. A random sample of incomplete telephone numbers (e.g., eight digits) is taken from working telephone exchanges; random two-digit numbers then complete the number to be called (Fig. 6.5). This approach has strengths and weaknesses. It attempts to sample residential telephone numbers equally while keeping calls to commercial numbers to a minimum. The strategy reaches both new numbers and unlisted numbers not available through directories.
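The mechanics of completing an incomplete number with random digits can be sketched briefly. This is an illustration under stated assumptions, not a description of any survey organisation's procedure: the eight-digit prefixes are invented, and real schemes add steps (such as screening out commercial numbers) that are omitted here.

```python
import random


def random_digit_dial(working_prefixes, n_numbers, suffix_digits=2, seed=None):
    """Complete randomly chosen incomplete numbers with random trailing digits."""
    rng = random.Random(seed)
    prefixes = sorted(set(working_prefixes))
    if n_numbers > len(prefixes) * 10 ** suffix_digits:
        raise ValueError("not enough distinct numbers available")
    numbers = set()
    while len(numbers) < n_numbers:
        prefix = rng.choice(prefixes)
        # Zero-padded random suffix, e.g. "07" when suffix_digits is 2.
        suffix = f"{rng.randrange(10 ** suffix_digits):0{suffix_digits}d}"
        numbers.add(prefix + suffix)  # the set discards any duplicate draws
    return sorted(numbers)


# Hypothetical eight-digit prefixes, assumed to come from working exchanges.
prefixes = ["55501234", "55509876", "55504321"]
to_call = random_digit_dial(prefixes, n_numbers=5, seed=2)
print(to_call)
```

Because the final digits are generated rather than looked up, the sample can include new and unlisted numbers, which is the strength noted in the text.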