Defining Populations for Cohort Studies



25 Defining Populations for Cohort Studies



Chapter 6 presented an overview of cohort studies, which are observational studies of the relationship between exposures and outcomes in a specific population. Part of the study design is determining what population should be studied, the availability and quality of information about the cohort, and, for prospective studies, the potential for long-term follow-up. These concepts are more general than the specific inclusion and exclusion criteria described in Chapter 12, which address individual characteristics. We start by discussing a study with a single cohort. The same principles apply to studies with multiple cohorts, which are discussed in the last section.



25.1 Data Availability


In addition to defining detailed criteria for inclusion or exclusion of individuals in the protocol, it is important to think in general terms about the population in which you are interested. In most studies you will be collecting data over some interval of time. It may be historical data for a retrospective study or follow-up data for a prospective study. Perhaps you are interested in following a population with a specific disease who attend your clinic. You propose a prospective study where you will follow individuals and collect data over time. The key issue is whether these participants can be followed for the required interval of time. What follow-up mechanisms need to be in place to achieve this? Will participants be expected to be seen on a regular schedule by medical personnel as part of their regular care? How much of the information that is required for the study will be available from regular care, and what evaluations must be done specifically as part of the study protocol?



Example 25A:

Investigators wished to conduct a 5-year follow-up study of women with gestational diabetes to determine whether they develop metabolic abnormalities after delivery. The women were enrolled in a special high-risk pregnancy program and, as part of the program, were expected to visit a clinic 6 months after delivery for a follow-up examination, including an oral glucose tolerance test, and annually thereafter. The center treated a highly mobile immigrant population, and the expected loss to follow-up, even for the first follow-up evaluation, was high. To improve follow-up, the investigators proposed, as part of the protocol, to employ a bilingual research assistant to call patients, remind them of their appointments, and ask whether the assistant could help with any problems. The women were also asked to supply the name of another individual who would be able to reach them if they moved or changed phone numbers.


In a retrospective study, the key issue is whether the information required for the study can be retrieved from existing data. This means that not only the exposures and outcomes but also potential confounding variables must be available. In most studies the availability of information varies with the individual, but in some populations and for some data items there may be special problems obtaining information that must be considered in the design.
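One way to gauge whether existing data can support a retrospective study is to measure, for each required variable, the fraction of records in which a value is actually recorded. The following is a minimal sketch, not a prescribed method: the function name `completeness_report`, the field names, and the convention that `None` or an empty string means "missing" are all assumptions made for illustration.

```python
def completeness_report(records, required_fields):
    """Return, for each required field, the fraction of records with a value.

    Assumption: each record is a dict, and a field counts as missing when it
    is absent, None, or an empty string.
    """
    n = len(records)
    report = {}
    for field in required_fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = present / n if n else 0.0
    return report


# Hypothetical records: exposure, outcome, and one potential confounder.
records = [
    {"exposure": 1, "outcome": 0, "family_history": None},
    {"exposure": 1, "outcome": 1, "family_history": "yes"},
]
print(completeness_report(records, ["exposure", "outcome", "family_history"]))
```

A low fraction for a confounder such as family history would signal, before the study begins, the kind of availability problem described in this section.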



Example 25B:

Investigators studying risk factors for cardiac disease know that family history is an important contributor to risk. However, there is often a problem obtaining accurate family history for some populations. For immigrant populations from less developed countries, family members may not have been tested or diagnosed. Moreover, if the family members are not in this country, little may be known about their past. In addition, in all populations, some illnesses are considered shameful and not discussed.


Before any data can be extracted, it is important to ensure both that Institutional Review Board (IRB) approval has been obtained (Section 2.3) and that the requirements for patient privacy are met (Section 2.1.2).



25.1.1 Data Sources


Data may be obtained from existing sources or collected during the course of the study. Existing sources include, but are not limited to, the participant, hospital or clinic records, public health records (such as death certificates), private physician records, teachers, and school counselors. Often a family member or someone close to the participant can provide relevant information. This person is often referred to as the “best informant.” Institutional records are a good resource for many studies. Many retrospective studies have used data from large community-based medical centers with stable populations, such as the Mayo Clinic in Rochester, Minnesota, which is known for excellent tracking of and record keeping about its patients. Health Maintenance Organizations, such as Kaiser Permanente, also provide a valuable resource for longitudinal observational studies, such as cohort studies, for investigators with access to their databases. Other sources include the Veterans Administration, larger secondary and tertiary health centers, and public records of births and deaths.



Example 25C:

Investigators want to study the effect of repeated fever in young children on growth in later life. They work at a large regional health center that serves most of the residents in an area and maintains a database on all patients. The population is relatively stable. Therefore, the investigators believe they can get adequate and complete information from the institution’s electronic medical records.


Prospective studies often begin with current or historical information and then continue to collect information from interviews, tests, and, when the participant allows access, new medical records.



25.1.2 Stratified Sampling


Stratified sampling, also known as targeted sampling, is a method used to acquire a cohort with specific characteristics, usually demographic. The purpose is generally to create a cohort that is more similar to the general population than would be expected from random, untargeted samples, and thus to improve the generalizability of your study (Section 11.1). The investigator must first define what important subgroups exist in the target population and what percentage of the total population each represents. Then a plan must be developed to enroll participants from the different groups into the study in the same proportions.



Example 25C (continued):

The investigators will not use the total population, which is very large, but will take a sample of 3,000 participants. They would like this cohort to have the same ethnic and sex composition as the total population; therefore, they will ask the data management personnel to identify the group membership of each participant, then select a specified number of children at random within each ethnic group.
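The selection step in this example can be sketched in code. The following is a minimal illustration, not the investigators' actual procedure: the function name `proportional_stratified_sample`, the record layout, and the choice to define strata by ethnic group and sex together are assumptions. Each stratum receives a quota proportional to its share of the population (rounded so that the quotas sum exactly to the requested sample size), and participants are then drawn at random within each stratum.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(records, stratum_of, sample_size, seed=0):
    """Draw a random sample whose strata proportions mirror the population's.

    records     -- list of participant records (any type)
    stratum_of  -- function mapping a record to its stratum label
    sample_size -- total number of participants to select
    Returns (sample, quota), where quota maps stratum label -> number drawn.
    """
    if sample_size > len(records):
        raise ValueError("sample_size exceeds population size")

    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group records by stratum.
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    # Proportional quotas, using largest-remainder rounding so that
    # the quotas add up to exactly sample_size.
    total = len(records)
    exact = {k: sample_size * len(v) / total for k, v in strata.items()}
    quota = {k: int(x) for k, x in exact.items()}
    leftover = sample_size - sum(quota.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - quota[k], reverse=True)
    for k in by_remainder[:leftover]:
        quota[k] += 1

    # Simple random sample without replacement within each stratum.
    sample = []
    for k, members in strata.items():
        sample.extend(rng.sample(members, quota[k]))
    return sample, quota
```

With a hypothetical list of child records, each carrying `ethnicity` and `sex` fields, a call such as `proportional_stratified_sample(children, lambda c: (c["ethnicity"], c["sex"]), 3000)` would return 3,000 children with the same ethnic and sex composition as the full list, up to rounding.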


Feb 18, 2017 | Posted by in GENERAL SURGERY | Comments Off on Defining Populations for Cohort Studies
