Evaluating the Evidence

CHAPTER 18 Evaluating the Evidence




Medical professionals often find themselves in a position to form an opinion or make a recommendation to a patient based on the medical literature. A courtroom judge may need to decide on the validity and admissibility of expert testimony. A journalist may need to write an article about a so-called medical breakthrough to satisfy the public’s curiosity. In each case, the professional goes to the literature, hoping to find an answer. The volume of published research is overwhelming. How does one begin to critique these studies?


A well-designed study can support a solid conclusion. However, the study design limits the strength of the conclusion that can be drawn. For example, the most precise methodology in a case–control study cannot eliminate all potential recall bias, and a meticulous cohort study can still be marred by unaccounted-for confounding variables. Even randomized, controlled trials have their limitations in selection and data analysis. Each type of study makes a major contribution to the fund of knowledge. The strength of the conclusion, however, is a function of the methodology and the amount of bias that tags along.


The most precise study is like pure gold. Bias represents the impurities found in the finished product. It takes a concentrated effort to remove the impurities during the design and execution of a trial. Although some studies cannot be refined further because of the inherent limitations of their design, the product can still have significant value. Bias cannot be eliminated entirely, but when it is reduced, as in a double-blind, randomized, controlled trial (DBRCT), the research is more robust.


Several authors and committees have published rating systems for evaluating the strength of the evidence. For example, the U.S. Preventive Services Task Force has published a simple hierarchy of the quality of evidence based on study design (summarized in Table 18-1). In this scheme, a well-designed DBRCT found in the literature carries more weight than an observational study. A recommendation is graded on a letter scale according to the strength of the evidence for or against it, so the strength of the recommendation directly parallels the quality of the evidence. Most published scales agree with this rating system. Some of them place meta-analysis at the top of the list as the most reliable evidence.


TABLE 18-1 Rating Clinical Evidence

Assessment System of the U.S. Preventive Services Task Force

Quality of Evidence
I       Evidence from at least one properly designed randomized, controlled trial.
II-1    Evidence obtained from well-designed controlled trials without randomization.
II-2    Evidence from well-designed cohort or case–control studies, preferably from more than one center or research group.
II-3    Evidence from multiple time series with or without the intervention. Important results in uncontrolled experiments (such as the introduction of penicillin treatment in the 1940s) could also be considered as this type of evidence.
III     Opinions of respected authorities, based on clinical experience, descriptive studies, or reports of expert committees.

Strength of Recommendations
A       Good evidence to support the intervention.
B       Fair evidence to support the intervention.
C       Insufficient evidence to recommend for or against the intervention, but recommendation might be made on other grounds.
D       Fair evidence against the intervention.
E       Good evidence against the intervention.

(From Friedland, D. J. et al. 1998. Evidence-based medicine: a framework for clinical practice. Stamford CT: Appleton & Lange, p. 229 with permission.)
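
The hierarchy in Table 18-1 can be read as a simple ordered lookup. The following Python sketch is only an illustration, not part of the Task Force's system: the names EvidenceLevel, DESIGN_TO_LEVEL, and rank_designs, as well as the short design labels used as keys, are assumptions made for this example. It also assumes each study is well designed, since the table ranks designs and says nothing about how carefully a particular study was carried out.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Quality-of-evidence levels from Table 18-1 (lower value = stronger)."""
    I = 1
    II_1 = 2
    II_2 = 3
    II_3 = 4
    III = 5


# Hypothetical short labels for the designs named in Table 18-1.
DESIGN_TO_LEVEL = {
    "randomized controlled trial": EvidenceLevel.I,
    "controlled trial without randomization": EvidenceLevel.II_1,
    "cohort study": EvidenceLevel.II_2,
    "case-control study": EvidenceLevel.II_2,
    "multiple time series": EvidenceLevel.II_3,
    "expert opinion": EvidenceLevel.III,
}


def rank_designs(designs):
    """Return the designs ordered from strongest to weakest evidence level.

    Designs that share a level keep their input order (sorted() is stable).
    Raises KeyError for a label that is not in DESIGN_TO_LEVEL.
    """
    return sorted(designs, key=lambda design: DESIGN_TO_LEVEL[design])


if __name__ == "__main__":
    found = ["expert opinion", "cohort study", "randomized controlled trial"]
    for design in rank_designs(found):
        print(DESIGN_TO_LEVEL[design].name, design)
```

The sketch deliberately stops at the quality-of-evidence levels. The table does not give a rule for converting a level into a recommendation grade (A through E), so no such mapping is assumed here.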
