Epidemiologic Data Measurements

2 Epidemiologic Data Measurements

Clinical phenomena must be measured accurately to develop and test hypotheses. Because epidemiologists study phenomena in populations, they need measures that summarize what happens at the population level. The fundamental epidemiologic measure is the frequency with which an event of interest (e.g., disease, injury, or death) occurs in the population of interest.

I Frequency

The frequency of a disease, injury, or death can be measured in different ways, and it can be related to different denominators, depending on the purpose of the research and the availability of data. The concepts of incidence and prevalence are of fundamental importance to epidemiology.

C Illustration of Morbidity Concepts

The concepts of incidence (incident cases), point prevalence (prevalent cases), and period prevalence are illustrated in Figure 2-2, based on a method devised in 1957 (reference 1). Figure 2-2 provides data concerning eight persons who have a given disease in a defined population in which there is no emigration or immigration. Each person is assigned a case number (case no. 1 through case no. 8). A line begins when a person becomes ill and ends when that person either recovers or dies. The symbol t1 signifies the beginning of the study period (e.g., a calendar year) and t2 signifies the end.

In case no. 1, the patient was already ill when the year began and was still alive and ill when it ended. In case nos. 2, 6, and 8, the patients were already ill when the year began, but recovered or died during the year. In case nos. 3 and 5, the patients became ill during the year and were still alive and ill when the year ended. In case nos. 4 and 7, the patients became ill during the year and either recovered or died during the year. On the basis of Figure 2-2, the following calculations can be made. There were four incident cases during the year (case nos. 3, 4, 5, and 7). The point prevalence at t1 was four (the prevalent cases were nos. 1, 2, 6, and 8). The point prevalence at t2 was three (case nos. 1, 3, and 5). The period prevalence is equal to the point prevalence at t1 plus the incidence between t1 and t2, or in this example, 4 + 4 = 8. Although a person can be an incident case only once, he or she could be considered a prevalent case at many points in time, including the beginning and end of the study period (as with case no. 1).
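The counts above can be reproduced with a short Python sketch. The onset and end times below are hypothetical values on a 0 to 12 month scale, invented only so that each case falls on the correct side of t1 and t2 as described in the text; the figure's actual timings are not given here.

```python
# Hypothetical timings (months); only the ordering relative to t1 and t2
# follows the text, the exact numbers are invented for illustration.
T1, T2 = 0.0, 12.0

# case number -> (onset of illness, end of illness by recovery or death)
# An end time greater than T2 means the person is still ill at t2.
cases = {
    1: (-2.0, 14.0),
    2: (-3.0, 4.0),
    3: (5.0, 15.0),
    4: (2.0, 7.0),
    5: (8.0, 13.0),
    6: (-1.0, 9.0),
    7: (6.0, 10.0),
    8: (-4.0, 3.0),
}

def prevalent_at(t):
    """Case numbers of people who are ill at time t (point prevalence)."""
    return sorted(n for n, (onset, end) in cases.items() if onset <= t < end)

# Incident cases: illness begins during the study period.
incident = sorted(n for n, (onset, _) in cases.items() if T1 < onset <= T2)
prevalent_t1 = prevalent_at(T1)
prevalent_t2 = prevalent_at(T2)

# Period prevalence: prevalent cases at t1 plus incident cases in (t1, t2].
period_prevalence = len(prevalent_t1) + len(incident)

print("incident cases:", incident)
print("prevalent at t1:", prevalent_t1)
print("prevalent at t2:", prevalent_t2)
print("period prevalence:", period_prevalence)
```

Case no. 1 appears in both point-prevalence lists but never in the incident list, which matches the remark that a person can be a prevalent case at many points in time but an incident case only once.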

D Relationship between Incidence and Prevalence

Figure 2-1 provides data from the U.S. Centers for Disease Control and Prevention (CDC) to illustrate the complex relationship between incidence and prevalence. It uses the example of AIDS in the United States from 1981, when it was first recognized, through 1992, after which the definition of AIDS underwent a major change. Because AIDS is a clinical syndrome, the present discussion addresses the prevalence of AIDS, rather than the prevalence of its causal agent, human immunodeficiency virus (HIV) infection.

In Figure 2-1, the full height of each year’s bar shows the total number of new AIDS cases reported to the CDC for that year. The darkened part of each bar shows the number of people in whom AIDS was diagnosed in that year, and who were known to be dead by December 31, 1992. The clear space in each bar represents the number of people in whom AIDS was diagnosed in that year, and who presumably were still alive on December 31, 1992. The sum of the clear areas represents the prevalent cases of AIDS as of the last day of 1992. Of the people in whom AIDS was diagnosed between 1990 and 1992 and who had had the condition for a relatively short time, a fairly high proportion were still alive at the cutoff date. Their survival resulted from the recency of their infection and from improved treatment. However, almost all people in whom AIDS was diagnosed during the first 6 years of the epidemic had died by that date.

The total number of cases of an epidemic disease reported over time is its cumulative incidence. According to the CDC, the cumulative incidence of AIDS in the United States through December 31, 1991, was 206,392, and the number known to have died was 133,232 (reference 2). At the close of 1991, there were 73,160 prevalent cases of AIDS (206,392 − 133,232). If these people with AIDS died in subsequent years, they would be removed from the category of prevalent cases.
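The subtraction above can be written as a minimal Python sketch. It assumes, as the text does, that every reported case not known to have died was still a prevalent case at the cutoff date; the variable names are illustrative.

```python
# Figures quoted in the text (CDC counts through December 31, 1991).
cumulative_incidence = 206_392   # all AIDS cases reported to date
known_deaths = 133_232           # reported cases known to have died

# Prevalent cases at the cutoff: reported cases not known to be dead.
prevalent_cases = cumulative_incidence - known_deaths

# Share of all reported cases presumed alive at the cutoff.
fraction_alive = prevalent_cases / cumulative_incidence

print(prevalent_cases)
print(f"{fraction_alive:.1%}")
```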

On January 1, 1993, the CDC made a major change in the criteria for defining AIDS. A backlog of patients whose disease manifestations met the new criteria was included in the counts for the first time in 1993, and this resulted in a sudden, huge spike in the number of reported AIDS cases (Fig. 2-3). Because of this change in criteria and reporting, the more recent AIDS data are not as satisfactory as the older data for illustrating the relationship between incidence and prevalence. Nevertheless, Figure 2-3 provides a vivid illustration of the importance of a consistent definition of a disease in making accurate comparisons of trends in rates over time.

Prevalence is the result of many factors: the periodic (annual) number of new cases; the immigration and emigration of persons with the disease; and the average duration of the disease, which is defined as the time from its onset until death or healing. The following approximate general formula for prevalence cannot be used for detailed scientific estimation, but it is conceptually important for understanding and predicting the burden of disease on a society or population:

Prevalence ≈ Incidence × (average) Duration


This conceptual formula works only if the incidence of the disease and its duration in individuals are stable for an extended time. The formula implies that the prevalence of a disease can increase as a result of an increase in its incidence, in its average duration, or in both.

In the specific case of AIDS, its incidence in the United States is declining, whereas the duration of life for people with AIDS is increasing as a result of antiviral agents and other methods of treatment and prophylaxis. Survival has lengthened proportionately more than incidence has declined, so the number of prevalent cases of AIDS continues to increase in the United States. This increase in prevalence has increased the burden of patient care, both in demand on the health care system and in dollar cost to society.
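The way a falling incidence can coexist with a rising prevalence can be sketched with the conceptual relationship that prevalence is approximately incidence multiplied by average duration. All numbers below are hypothetical and are not AIDS data; the function name is an illustrative assumption, and the relationship holds only under the stable conditions described above.

```python
def steady_state_prevalence(incidence_per_year, mean_duration_years):
    """Conceptual approximation: prevalence ~ incidence x average duration."""
    return incidence_per_year * mean_duration_years

# Hypothetical "before" scenario: 1000 new cases per year, 2 years of illness.
before = steady_state_prevalence(1000, 2.0)

# Hypothetical "after" scenario: incidence falls by 20%, but treatment
# doubles the average duration of life with the disease.
after = steady_state_prevalence(800, 4.0)

print(before, after)
print("prevalence rises despite lower incidence:", after > before)
```

In this sketch the duration increases by a factor of 2 while incidence falls by a factor of only 0.8, so the product, and with it the number of prevalent cases, grows.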

A similar situation exists with regard to cardiovascular disease. Its age-specific incidence has been declining in the United States in recent decades, but its prevalence has not. As advances in technology and pharmacotherapy forestall death, people live longer with disease.

II Risk

B Limitations of the Concept of Risk

Often it is difficult to be sure of the correct denominator for a measure of risk. Who is truly at risk? Only women are at risk for becoming pregnant, but even this statement must be modified, because for practical purposes, only women aged 15 to 44 years are likely to become pregnant. Even in this group, some proportion is not at risk because they use birth control, do not engage in heterosexual relations, have had a hysterectomy, or are sterile for other reasons.

Ideally, for risk related to infectious disease, only the susceptible population—that is, people without antibody protection—would be counted in the denominator. However, antibody levels are usually unknown. As a practical compromise, the denominator usually consists of either the total population of an area or the people in an age group who probably lack antibodies.

Expressing the risk of death from an infectious disease, although seemingly simple, is quite complex. This is because such a risk is the product of many different proportions, as can be seen in Figure 2-4. Numerous subsets of the population must be considered. People who die of an infectious disease are a subset of people who are ill from the disease, who are a subset of the people who are infected by the disease agent, who are a subset of the people who are exposed to the infection, who are a subset of the people who are susceptible to the infection, who are a subset of the total population.

The proportion of clinically ill persons who die is the case fatality ratio; the higher this ratio, the more virulent the infection. The proportion of infected persons who are clinically ill is often called the pathogenicity of the organism. The proportion of exposed persons who become infected is sometimes called the infectiousness of the organism, but infectiousness is also influenced by the conditions of exposure. A full understanding of the epidemiology of an infectious disease would require knowledge of all the ratios shown in Figure 2-4. Analogous characterizations may be applied to noninfectious disease.
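The idea that the risk of death is a product of nested proportions can be sketched as follows. Every proportion below is a hypothetical value chosen for easy arithmetic, not an estimate for any real disease, and the dictionary keys are illustrative labels for the subsets described above.

```python
import math

# Hypothetical proportions for each step from total population to death.
proportions = {
    "susceptible among total population": 0.5,
    "exposed among susceptible": 0.4,
    "infected among exposed (infectiousness)": 0.5,
    "clinically ill among infected (pathogenicity)": 0.2,
    "dead among clinically ill (case fatality ratio)": 0.1,
}

# Overall risk of death in the total population is the product of the chain.
risk_of_death = math.prod(proportions.values())

population = 100_000  # hypothetical population size
expected_deaths = risk_of_death * population

print(f"risk of death: {risk_of_death:.4f}")
print(f"expected deaths per {population:,}: {expected_deaths:.0f}")
```

Halving any single proportion in the chain halves the overall risk, which is why knowledge of each ratio matters for understanding the epidemiology of the disease.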

The concept of risk has other limitations, which can be understood through the following thought experiment. Assume that three different populations of the same size and age distribution (e.g., three nursing homes with no new patients during the study period) have the same overall risk of death (e.g., 10%) in the same year (e.g., from January 1 to December 31 in year X). Despite their similarity in risk, the deaths in the three populations may occur in very different patterns over time. Suppose that population A suffered a serious influenza epidemic in January (the beginning of the study year), and that most of those who died that year did so in the first month of the year. Suppose that the influenza epidemic did not hit population B until December (the end of the study year), so that most of the deaths in that population occurred during the last month of the year. Finally, suppose that population C did not experience the epidemic, and that its deaths occurred (as usual) evenly throughout the year. The 1-year risk of death (10%) would be the same in all three populations, but the force of mortality would not be the same. The force of mortality would be greatest in population A, least in population B, and intermediate in population C. Because the measure of risk cannot distinguish between these three patterns in the timing of deaths, a more precise measure—the rate—may be used instead.

III Rates

A Definition

A rate is the number of events that occur in a defined time period, divided by the average number of people at risk for the event during the period under study. Because the population at the middle of the period can usually be considered a good estimate of the average number of people at risk during that period, the midperiod population is often used as the denominator of a rate. The formal structure of a rate is described in the following equation:

Rate = Number of events in a defined time period / Average (midperiod) number of people at risk during that period


Risks and rates usually have values less than 1 unless the event of interest can occur repeatedly, as with colds or asthma attacks. However, decimal fractions are awkward to think about and discuss, especially if we try to imagine fractions of a death (e.g., “one one-thousandth of a death per year”). Rates are usually multiplied by a constant multiplier—100, 1000, 10,000, or 100,000—to make the numerator larger than 1 and thus easier to discuss (e.g., “one death per thousand people per year”). When a constant multiplier is used, the numerator and the denominator are multiplied by the same number, so the value of the ratio is not changed.

The crude death rate illustrates why a constant multiplier is used. In 2011, this rate for the United States was estimated as 0.00838 per year. However, most people find it easier to multiply this fraction by 1000 and express it as 8.38 deaths per 1000 individuals in the population per year. The general form for calculating the rate in this case is as follows:

Crude death rate = (Number of deaths in a defined place and time period / Midperiod population of the same place and time period) × 1000
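The effect of the constant multiplier can be checked with a few lines of Python. The death and population counts below are hypothetical, chosen only so that the raw fraction equals the 0.00838 quoted above; the function name `crude_rate` is likewise an illustrative assumption.

```python
def crude_rate(events, midperiod_population, multiplier=1000):
    """Events per `multiplier` people at risk per period."""
    return events / midperiod_population * multiplier

# Hypothetical counts that reproduce the fraction 0.00838.
deaths = 838
midyear_population = 100_000

raw_fraction = crude_rate(deaths, midyear_population, multiplier=1)
per_1000 = crude_rate(deaths, midyear_population, multiplier=1000)
per_100000 = crude_rate(deaths, midyear_population, multiplier=100_000)

print(f"{raw_fraction:.5f} deaths per person per year")
print(f"{per_1000:.2f} deaths per 1000 persons per year")
print(f"{per_100000:.0f} deaths per 100,000 persons per year")
```

The three printed values describe the same rate; only the unit of population in which it is expressed changes.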


Rates can be thought of in the same way as the velocity of a car. It is possible to talk about average rates or average velocity for a period of time. The average velocity is obtained by dividing the miles traveled (e.g., 55) by the time required (e.g., 1 hour), in which case the car averaged 55 miles per hour. This does not mean that the car was traveling at exactly 55 miles per hour for every instant during that hour. In a similar manner, the average rate of an event (e.g., death) is equal to the total number of events for a defined time (e.g., 1 year) divided by the average population exposed to that event (e.g., 12 deaths per 1000 persons per year).

A rate, as with a velocity, also can be understood as describing reality at an instant in time, in which case the death rate can be expressed as an instantaneous death rate or hazard rate. Because death is a discrete event rather than a continuous function, however, instantaneous rates cannot actually be measured; they can only be estimated. (Note that the rates discussed in this book are average rates unless otherwise stated.)

B Relationship between Risk and Rate

In an example presented in section II.B, populations A, B, and C were similar in size, and each had a 10% overall risk of death in the same year, but their patterns of death differed greatly. Figure 2-5 shows the three different patterns and illustrates how, in this example, the concept of rate is superior to the concept of risk in showing differences in the force of mortality.

Because most of the deaths in population A occurred before July 1, the midyear population of this cohort would be the smallest of the three, and the resulting death rate would be the highest (because the denominator is the smallest and the numerator is the same size for all three populations). In contrast, because most of the deaths in population B occurred at the end of the year, the midyear population of this cohort would be the largest of the three, and the death rate would be the lowest. For population C, both the number of deaths before July 1 and the death rate would be intermediate between those of A and B. Although the 1-year risk for these three populations did not show differences in the force of mortality, cohort-specific rates did so by reflecting more accurately the timing of the deaths in the three populations. This quantitative result agrees with the graph and with intuition, because if we assume that the quality of life was reasonably good, most people would prefer to be in population B. More days of life are lived by those in population B during the year, because of the lower force of mortality.
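The comparison of populations A, B, and C can be made concrete with a small sketch. The starting population, the number of deaths, and the split of deaths before and after July 1 are all hypothetical; they are chosen only to give each population a 10% one-year risk with the timing patterns described above.

```python
START_POPULATION = 1000   # hypothetical residents alive on January 1
DEATHS_IN_YEAR = 100      # hypothetical deaths, giving a 10% one-year risk

# Hypothetical number of the year's deaths that occur before July 1.
deaths_before_july_1 = {"A": 90, "B": 10, "C": 50}

results = {}
for name, early_deaths in deaths_before_july_1.items():
    # No admissions, so the midyear population is the survivors to July 1.
    midyear_population = START_POPULATION - early_deaths
    risk = DEATHS_IN_YEAR / START_POPULATION
    rate_per_1000 = DEATHS_IN_YEAR / midyear_population * 1000
    results[name] = (risk, rate_per_1000)
    print(f"{name}: risk = {risk:.0%}, "
          f"rate = {rate_per_1000:.1f} deaths per 1000 per year")
```

The risk is identical in all three populations, whereas the rate is highest in A, lowest in B, and intermediate in C, because only the midyear denominator responds to the timing of the deaths.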

Rates are often used to estimate risk. A rate is a good approximation of risk if the:

Aug 27, 2016 | Posted in PUBLIC HEALTH AND EPIDEMIOLOGY