Basic Case-Control Study Design
Advantages and Disadvantages
Selection of Case and Control Groups
Measurement of Exposure Information
Control for Confounding
Epidemiologists benefit greatly from having case-control study designs in their research armamentarium. Case-control studies can yield important scientific findings with relatively little time, money, and effort compared with other study designs. This seemingly quick road to research results entices many newly trained epidemiologists. Indeed, investigators implement case-control studies more frequently than any other analytical epidemiological study. Unfortunately, case-control designs also tend to be more susceptible to biases than other comparative studies. Although easier to do, they are also easier to do wrong. Five main notions guide investigators who do, or readers who assess, case-control studies. First, investigators must explicitly define the criteria for diagnosis of a case and any eligibility criteria used for selection. Second, controls should come from the same population as the cases, and their selection should be independent of the exposures of interest. Third, investigators should blind the data gatherers to the case or control status of participants or, if impossible, at least blind them to the main hypothesis of the study. Fourth, data gatherers need to be thoroughly trained to elicit exposure in a similar manner from cases and controls; they should use memory aids to facilitate and balance recall between cases and controls. Finally, investigators should address confounding in case-control studies, either in the design stage or with analytical techniques. Devotion of meticulous attention to these points enhances the validity of the results and bolsters the reader’s confidence in the findings.
Case-control studies contribute greatly to the research toolbox of an epidemiologist. They embody the strengths and weaknesses of observational epidemiology. Moreover, epidemiologists use them to study a huge variety of associations. To show this variety, we searched PubMed for topics investigated with case-control studies (Panel 5.1). We identified diverse diseases and exposures, with outcomes ranging from hip fracture to premature ejaculation, and exposures ranging from hair dyes to vitamin D.
|Exposure|Outcome|
|---|---|
|Uterine fibromas|Postpartum haemorrhage|
|Shiftwork|Violence against nurses|
|History of migraine|Concussion|
|Hypothyroidism|Unruptured cerebral aneurysms|
|Vitamin D|Early childhood fracture|
|HPV|Invasive cervical cancer|
|Vitamin B12|Premature ejaculation|
|Human papillomavirus|Colorectal cancer|
|Untreated psoriasis|Male fertility|
|Body mass index, hormone therapy|Cutaneous melanoma|
|Agricultural occupation|Testicular cancer|
|Hair dyes|Connective tissue disorders|
|Digital rectal examination|Metastatic prostate cancer|
|Paracetamol use|Ovarian cancer|
|Physical activity|Breast cancer|
|Influenza vaccination|Recurrent myocardial infarction|
The strength of case-control studies can be appreciated in early research done by investigators hoping to understand the cause of AIDS. Case-control studies identified risk groups (e.g., homosexual men, intravenous drug users, and blood-transfusion recipients) and risk factors (e.g., multiple sex partners, receptive anal intercourse in homosexual men, and not using condoms) for AIDS. Based on such studies, blood banks restricted high-risk individuals from donating blood, and educational programmes began to promote safer behaviours. As a result of these precautions, the speed of transmission of HIV-1 was greatly reduced, even before the virus had been identified.
By comparison with other study types, case-control studies can yield important findings in a relatively short time, and with relatively little money and effort. This apparently quick road to research results entices many newly trained epidemiologists. However, case-control studies tend to be more susceptible to biases than other analytical epidemiological designs. A notable friend of ours (the late David L. Sackett, personal communication, 2001) told us that he would trust only six people in the world to do a proper case-control study. And Ken Rothman comments in his book that ‘because it need not be extremely expensive nor time-consuming to conduct a case-control study, many studies have been conducted by would-be investigators who lack even a rudimentary appreciation for epidemiological principles. Occasionally such haphazard research can produce fruitful or even extremely important results, but often the results are wrong because basic research principles have been violated’.
Basic Case-Control Study Design
Case-control designs might seem easy to understand, but many clinicians stumble over them. Because this type of study runs backwards by comparison with most other studies, it often confuses researchers and readers alike. Indeed, it so confuses researchers that they frequently do not know what type of study they have done (and readers do not know the difference). For example, in a review of 124 published articles in four US obstetrics and gynaecology journals labelled as ‘case-control’ studies, 30% were clearly not case-control studies. Most of the mislabelled studies were actually retrospective cohort studies. This mislabelling of studies as ‘case-control’ extends to other specialties as well. In a review of studies in diabetes labelled as ‘case-control’, 43.8% were mislabelled and thereby misleading. Certainly, researchers, reviewers, editors, and readers need better training in methods and terminology.
In cohort studies, study groups are defined by exposure. In case-control studies, however, study groups are defined by outcome (Fig. 5.1). To study the association between smoking and lung cancer, therefore, people with lung cancer are enrolled to form the case group, and people without lung cancer are identified as controls. Researchers then look back in time to ascertain each person’s exposure status (smoking history), hence the retrospective nature of this study design. Investigators compare the frequency of smoking exposure in the case group with that in the control group, and calculate a measure of association.
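As a minimal sketch of this comparison, the measure of association usually calculated in a case-control study (the exposure odds ratio) can be computed from the four cell counts of a 2 × 2 table. The function name `odds_ratio` and the counts are hypothetical, chosen only for illustration; they are not data from any study cited here.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio from the four cells of a 2 x 2 table."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        # A zero cell makes the ratio undefined or infinite; this sketch
        # simply refuses rather than applying a continuity correction.
        raise ValueError("all four cell counts must be positive")
    odds_in_cases = exposed_cases / unexposed_cases           # odds of exposure among cases
    odds_in_controls = exposed_controls / unexposed_controls  # odds of exposure among controls
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 100 lung cancer cases (80 smokers) and
# 100 controls (40 smokers).
print(round(odds_ratio(80, 20, 40, 60), 2))  # prints 6.0
```

In this invented example the odds of smoking are six times higher among cases than among controls.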
Unlike cohort studies, case-control studies cannot yield incidence rates. Instead, they provide an odds ratio, derived from the proportion of individuals exposed in each of the case and control groups. When the cumulative incidence of an outcome in the population of interest is low (under 5% in both the exposed and the unexposed usually suffices), the odds ratio from a case-control study is a good estimate of relative risk. Epidemiologists refer to this condition as the rare disease assumption, which pertains to a type of case-control study that ascertains cases after the end of the risk period of interest, with controls being selected from among those who did not become cases. This represents the type of case-control study that we address in this chapter, usually labelled a cumulative case-control study. Of note, although beyond the scope of this chapter, this rare disease assumption is not needed for other case-control study designs in which researchers estimate the incidence density ratio.
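The rare disease assumption can be illustrated with a short numerical sketch. It assumes a hypothetical underlying population in which exposure exactly doubles the cumulative incidence (a true relative risk of 2), and compares that relative risk with the odds ratio at several baseline incidences. The function names and incidence values are assumptions made for illustration only.

```python
def relative_risk(risk_exposed, risk_unexposed):
    """Ratio of cumulative incidences (risks) in exposed and unexposed."""
    return risk_exposed / risk_unexposed


def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio implied by the same two cumulative incidences."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical baseline (unexposed) cumulative incidences; the exposed
# risk is set to twice the baseline, so the true relative risk is 2.
for baseline in (0.001, 0.01, 0.05, 0.20):
    exposed = 2 * baseline
    rr = relative_risk(exposed, baseline)
    oratio = odds_ratio_from_risks(exposed, baseline)
    print(f"baseline risk {baseline:.3f}: RR = {rr:.2f}, OR = {oratio:.2f}")
```

Under these assumptions the odds ratio stays close to 2 while the outcome is rare and drifts further from the relative risk as the outcome becomes common.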
Advantages and Disadvantages
Researchers often tout case-control studies as the most efficient epidemiological study design. Indeed, they tend to take less time, less money, and less effort. That makes sense when the incidence rate of an outcome is low, because in a cohort design the researchers would have to follow up many individuals to identify one with the outcome. Case-control studies are also efficient in the investigation of diseases that have a long latency period (e.g., cancer), in which instance a cohort study would involve many years of follow-up before the outcome became evident.
However, cohort studies can sometimes be more efficient than case-control studies. If the frequency of exposure is low, for example, case-control studies quickly become inefficient. Researchers would have to examine many cases and controls to find one who had been exposed. For instance, a case-control study of oral contraceptive use and transmission of HIV-1 would be impractical in parts of Africa because of the rarity of use of oral contraceptives. As a rule of thumb, cohort designs are more efficient in settings in which the incidence of the outcome is higher than the prevalence of exposure.
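The efficiency argument can be made concrete with some simple arithmetic. The sketch below assumes that, on average, 1/p people must be followed up (or examined) to find one with an attribute of frequency p, and it encodes the rule of thumb as a crude comparison of two proportions. The function names and all numerical values are hypothetical, and a real design decision would weigh far more than these two numbers.

```python
def people_needed_per_finding(proportion):
    """Expected number of people to follow up or examine to find one
    person with an attribute that occurs with the given frequency."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return 1 / proportion


def rule_of_thumb_design(outcome_incidence, exposure_prevalence):
    """Crude encoding of the rule of thumb: prefer a cohort design when
    the outcome is more frequent than the exposure."""
    if outcome_incidence > exposure_prevalence:
        return "cohort"
    return "case-control"


# Hypothetical rare outcome: cumulative incidence of 1 in 10,000.
print(people_needed_per_finding(0.0001))  # cohort members followed per case

# Hypothetical rare exposure: 0.5% of the population exposed.
print(people_needed_per_finding(0.005))   # participants examined per exposed person

print(rule_of_thumb_design(0.0001, 0.30))  # rare outcome, common exposure
print(rule_of_thumb_design(0.10, 0.005))   # common outcome, rare exposure
```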
Finally, many methodological issues affect the validity of the results of case-control studies, and two factors (i.e., choosing a control group and obtaining exposure history) can greatly affect a study’s vulnerability to bias.