In this chapter we discuss the outcome variables, which are the endpoints of a study. These are the most critical measurements in the study, as they are the basis for the results that you will present. Although some studies have only a single endpoint, most studies have multiple endpoints, of which one is primary. “Outcome” sometimes implies a Yes or No result, and “endpoint” sometimes implies a result after some time interval, but the words are often used interchangeably in practice, and we do so in this chapter as well.
15.1 Defining the Outcome Variables
When you plan a study, you must choose outcome variables that will provide the data needed to answer the questions proposed in the study. Therefore, the selection and definition of these variables are critical. First of all, they must be relevant to the question of interest and must address the major study questions directly. Moreover, since the human body consists of multiple complex biological systems that interact and overlap, it is likely that other variables will also be affected by the intervention or exposures that the participant experiences. For this reason, you may be interested in multiple, related outcomes. You must distinguish between outcomes that will provide information about the efficacy of an intervention and the safety variables that must be monitored to make sure that there are no adverse effects from the intervention.
We emphasize that in any study the outcome variables must be fully specified in advance. It is not acceptable to define an outcome variable after the study is under way. Sometimes investigators will describe an unexpected finding as an unexpected outcome, but this can only be viewed as a chance finding that must be verified in a study designed specifically for that purpose. It is possible that in the course of a study an unexpected result may become so clear and so well explained by the known facts that it may be reported as a valid outcome, but this is rare and needs an extensive justification to be acceptable to the scientific community.
The use of lithium for treatment of bipolar disorder was discovered accidentally when investigators were using it for treatment of urinary tract infections and noticed changes in participants with bipolar disorder. Although the results were dramatic and reproducible, a great deal of further study was required to validate the results and to determine a safe and effective dose of lithium for the treatment of bipolar disorder.
As you are collecting information anyway, there is a common tendency to collect as much information as you can, even though it is not relevant to the study hypotheses. This puts an unnecessary burden on participants in the study, and can actually reduce the quality of the primary data needed for the study, because participants become overwhelmed by the demands of the study or annoyed at the extensive, apparently irrelevant, and sometimes intrusive information requested. In addition, if you test a sufficiently large number of different variables, you are very likely to find some results that are statistically significant just by chance.
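The chance-finding problem can be made concrete with a little arithmetic. The sketch below is illustrative only and rests on simplifying assumptions that are not part of the study examples in this chapter: every variable is tested at the same significance level (0.05 here), the tests are independent, and none of the variables is truly affected. The function name is hypothetical.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false-positive result among n_tests.

    Assumes independent tests, each at significance level alpha, with
    no true effect on any variable (illustrative assumptions only).
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # P(no false positives) = (1 - alpha) ** n_tests, so take the complement.
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        print(f"{n:>3} variables tested: "
              f"P(at least one chance finding) = {familywise_error_rate(n):.2f}")
```

Under these assumptions, testing 20 unrelated variables gives roughly a 64% chance of at least one "significant" result that is due to chance alone. Real outcome variables are usually correlated, so the exact figure will differ, but the direction of the effect is the same.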
In the randomized parallel group study of testosterone replacement in hypogonadal men (Example 14A), the primary outcome was the achievement of testosterone levels within the normal range after 30 days. The other outcome variables included hormone measures, bone markers, body composition, bone mineral density, mood, and sexual function. All of these variables were postulated to be affected by hormone replacement, and thus it was valid to include them as secondary outcomes. Variables that were not expected to be affected, such as cortisol or cerebral metabolites, were not measured.
In this example there were many outcome variables, but the primary variable was whether or not the testosterone level at 30 days was within the normal range for men in that age group. Other research goals might use a synthesis of several measurements. The choice of the primary outcome variable depends on the setting of the experiment. It may depend on the particular disease under study, as well as on the investigator’s judgment as to which measurement is most critical.
The term “lipid levels” implies several measurements, including total cholesterol, HDL and LDL cholesterol, the ratio of total to HDL cholesterol, and triglycerides. In most studies, all these variables may be measured; however, for the purpose of defining efficacy in an interventional study, one variable must be selected as the primary outcome. In a study of a lipid-lowering agent that primarily affects LDL cholesterol, the primary endpoint would be changes in LDL cholesterol. In a study of a drug expected to have a lipid-lowering effect on a number of different lipid measures, the endpoint might be the proportion of participants achieving the National Cholesterol Education Program goal.
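An endpoint of the "proportion achieving a goal" type turns a continuous measurement into a Yes or No result for each participant and then summarizes those results as a single number. The following is a minimal sketch of that calculation. The record layout, the field names, and the goal values in the usage example are hypothetical placeholders chosen for illustration; they are not the actual National Cholesterol Education Program targets, which a real protocol would specify in advance.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LipidResult:
    participant_id: str
    ldl_mg_dl: float       # LDL cholesterol measured at the end of the study
    ldl_goal_mg_dl: float  # hypothetical prespecified goal for this participant


def achieved_goal(result: LipidResult) -> bool:
    """Yes/No outcome: is the measured LDL below the participant's goal?"""
    return result.ldl_mg_dl < result.ldl_goal_mg_dl


def proportion_achieving_goal(results: Sequence[LipidResult]) -> float:
    """Primary endpoint: fraction of participants who reached their goal."""
    if not results:
        raise ValueError("at least one participant is required")
    return sum(achieved_goal(r) for r in results) / len(results)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    sample = [
        LipidResult("P01", ldl_mg_dl=95.0, ldl_goal_mg_dl=100.0),
        LipidResult("P02", ldl_mg_dl=128.0, ldl_goal_mg_dl=130.0),
        LipidResult("P03", ldl_mg_dl=142.0, ldl_goal_mg_dl=130.0),
    ]
    print(f"Proportion achieving goal: {proportion_achieving_goal(sample):.2f}")
```

Storing the goal with each participant record, rather than as one global constant, is a design choice that allows for goals that differ between participants; a study with a single common goal could simplify this.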
If the study is an interventional study of a new treatment, then it usually focuses on a single endpoint, and other outcomes are secondary. On occasion, however, such a study will have two equally important endpoints, called co-primary endpoints. If the study is not focused on efficacy, then several outcome variables may be equally important primary endpoints for either an observational or interventional study.
The Collaborative Study of the Psychobiology of Depression was an in-hospital, multi-center study of the effect of antidepressant drugs on the biological and clinical aspects of depression. Participants who met the study criteria for severe depression, unipolar or bipolar, were admitted to one of six participating hospitals. The study duration was six weeks. After a 10-day washout period, baseline levels of cerebral catecholamines and metabolites were measured in serum, cerebrospinal fluid, and urine. After five more days, participants were randomly assigned to receive one of two different antidepressants, and the biological measures were reassessed after 18 days of treatment. Clinical status was evaluated using a battery of behavioral instruments that measured different aspects of depression. Clinical evaluations were done at intake, at baseline, when drug treatment was started, and weekly during the treatment period, including the last day of treatment. The changes in neurochemical and behavioral measures were multiple, equally important endpoints for this study.