The Leapfrog Group, a coalition of large employers and healthcare purchasers, has issued perhaps the most visible set of surgical quality indicators for its value-based purchasing initiative. Although the group originally focused exclusively on structural measures, including volume standards, its current standards also include selected processes and risk-adjusted outcomes. More recently, the Leapfrog Group began publicly reporting on its website a composite measure of operative mortality and hospital volume as the primary measure for its evidence-based hospital referral initiative.7 We will discuss composite measures in detail later in this chapter.
INDIVIDUAL QUALITY MEASURES
Quality measures fall into three categories: structure, process, and outcomes.
Structure
Healthcare structure refers to fixed attributes of the system in which patients receive care. Many structural measures describe hospital-level attributes, such as resources or the coordination and organization of staff (e.g., nurse-to-patient ratios, hospital teaching status). Other structural measures reflect attributes of individual physicians (e.g., subspecialty board certification, procedure volume).
Strengths
Structural measures of quality have several attractive features. First, they are strongly related to patient outcomes. For example, with esophagectomy and pancreatic resection, operative mortality rates at high-volume hospitals are often 10% lower in absolute terms (i.e., 10 percentage points) than those at low-volume centers.8,9 In some instances, structural measures such as procedure volume are more predictive of subsequent hospital performance than any known processes of care or even direct mortality measures (Fig. 16-1).
Perhaps the most important advantage of structural variables is the ease with which they can be assessed. Many can be determined using readily available sources, such as administrative billing data. Although some structural measures require surveying hospitals or providers, such data are much less expensive to collect than measures requiring detailed patient-level information.
Figure 16-1. Ability of hospital rankings based on 2003–2004 mortality rates and hospital volume to predict risk-adjusted mortality in 2005–2006. Data shown for abdominal aortic aneurysm repair (A) and pancreatic cancer resection (B). Source: National Medicare data.
Limitations
Perhaps the greatest limitation of structural measures is that they are not readily actionable. For example, a small hospital cannot easily make itself a high-volume center. Thus, while selected structural measures may be useful for selective referral initiatives, they have limited value for quality improvement purposes. Structural measures are also limited in their ability to discriminate the performance of individual providers. For example, in aggregate, high-volume hospitals have much lower mortality rates than lower-volume centers for pancreatic resection.8,9 However, some individual high-volume hospitals may have high mortality rates, and some low-volume hospitals may have low mortality rates.10 Although the true performance of individual hospitals is difficult to confirm empirically (for sample size reasons), this lack of discrimination is one reason many providers view structural measures as “unfair.”
Process of Care
Process-of-care measures capture the clinical details of care provided to patients. Although such measures have long been the predominant quality indicators for medical care, their popularity in surgery is growing rapidly. Perhaps the best example of the trend toward using process measures is the CMS’s SCIP. As previously mentioned, this quality measurement initiative focuses exclusively on processes related to prevention of surgical site infections, postoperative cardiac events, venous thromboembolism, and respiratory complications.
Strengths
Since processes of care reflect the care actually delivered by physicians, they have face validity and enjoy greater buy-in from providers. They are also directly actionable and provide a good substrate for quality improvement activities. Although risk adjustment may be important for outcomes, it is not required for many process measures. For example, appropriate prophylaxis against postoperative venous thromboembolism is a widely used process measure. Since virtually all patients undergoing open abdominal surgery should be offered some form of prophylaxis, there is little need to collect detailed clinical data for risk adjustment.
Limitations
The biggest limitation of process measures is their lack of correlation with important outcomes, which a growing body of empirical data documents.4,11–13 Most data come from literature on medical diagnoses, such as acute myocardial infarction. For example, the Joint Commission and CMS process measures for acute myocardial infarction explained only 6% of the observed variation in risk-adjusted mortality for that condition.12
Emerging evidence suggests a similar pattern for surgical process measures, especially SCIP measures: there is no measurable relationship between these widely collected process measures and important outcomes.13
There are several reasons why existing process measures explain very little of the variation in important surgical outcomes. First, most process measures currently used in surgery relate to secondary outcomes. While none would dismiss the value of prophylactic antibiotics in reducing risks of superficial wound infection, this process is not related to the most important adverse events of major surgery, including death.
Second, process measures in surgery often relate to complications that are very rare. For example, there is consensus that venous thromboembolism prophylaxis is necessary and important. The SCIP measures, endorsed by the NQF, include the use of appropriate venous thrombosis prophylaxis. However, pulmonary embolism is very uncommon, and improving adherence to these processes will, therefore, not avert many deaths. Until we understand which processes of care account for those adverse events leading to death, process measures will have limited usefulness in surgical quality measurement.4,11–14
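The arithmetic behind this point can be sketched in a few lines of Python. Every input below is a hypothetical placeholder chosen only for illustration, not a figure from this chapter or its references, and the function name and parameters are likewise assumptions. The sketch supposes that a preventive process lowers the complication rate by a fixed relative amount in patients who receive it and that a fixed fraction of complications are fatal.

```python
def deaths_averted(cases, event_rate_without, relative_risk_reduction,
                   case_fatality, adherence_before, adherence_after):
    """Expected deaths averted by raising adherence to a preventive process.

    Simplifying assumptions (illustrative only): the complication rate falls
    by `relative_risk_reduction` in patients who receive the process, and
    `case_fatality` is the fraction of complications that end in death.
    """
    proportions = (event_rate_without, relative_risk_reduction, case_fatality,
                   adherence_before, adherence_after)
    if any(not 0 <= p <= 1 for p in proportions):
        raise ValueError("all rates and proportions must lie between 0 and 1")

    def expected_deaths(adherence):
        # Patients who receive the process have a reduced complication rate.
        treated = adherence * event_rate_without * (1 - relative_risk_reduction)
        # Patients who do not receive it keep the baseline rate.
        untreated = (1 - adherence) * event_rate_without
        return cases * (treated + untreated) * case_fatality

    return expected_deaths(adherence_before) - expected_deaths(adherence_after)


# Hypothetical inputs: 10,000 operations, a 1% complication rate without
# prophylaxis, a 50% relative risk reduction, 10% case fatality, and
# adherence rising from 80% to 95%.
averted = deaths_averted(
    cases=10_000,
    event_rate_without=0.01,
    relative_risk_reduction=0.5,
    case_fatality=0.1,
    adherence_before=0.80,
    adherence_after=0.95,
)
print(round(averted, 2))
```

With these made-up inputs, the expected number of deaths falls from 6 to 5.25 per 10,000 operations, so fewer than one death is averted. When the complication is rare, even a large gain in adherence changes mortality very little.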
Outcomes
Outcome measures reflect the end result of care, from a clinical perspective or as judged by the patient. Although mortality is by far the most commonly used measure in surgery, other outcomes that could be used as quality indicators include complications, hospital readmission, and a variety of patient-centered measures of quality of life or satisfaction. The best example of this type of measurement is found in the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP).15 The ACS-NSQIP is a surgeon-led clinical registry for feeding back risk-adjusted morbidity and mortality rates to participating hospitals. After its successful implementation in Veterans Affairs (VA) hospitals, it was introduced into the private sector with good results.16 Under the guidance of the American College of Surgeons, hospital participation in the NSQIP continues to grow, with more than 400 hospitals currently participating. Several innovations to the ACS-NSQIP measurement platform in the past few years will no doubt help make the program less expensive and more useful.3
Strengths
There are at least two key advantages of outcome measures. First, outcome measures have obvious face validity, and thus are likely to get the greatest “buy-in” from hospitals and surgeons. Surgeon enthusiasm for the ACS-NSQIP and the continued dissemination of the program clearly underline this point. Second, the act of simply measuring outcomes may lead to better performance, the so-called Hawthorne effect. For example, surgical morbidity and mortality rates in VA hospitals have fallen dramatically since implementation of the NSQIP two decades ago.15 No doubt many surgical leaders at individual hospitals made specific organizational or process improvements after they began receiving feedback on their hospitals’ performance. However, it is very unlikely that even a full inventory of these specific changes would explain such broad-based and substantial improvements in morbidity and mortality rates.
Limitations
Hospital- or surgeon-specific outcome measures are severely constrained by small sample sizes. For the large majority of surgical procedures, very few hospitals (or surgeons) have sufficient adverse events (numerators) and cases (denominators) for meaningful, procedure-specific measures of morbidity or mortality. For example, Dimick et al.17 used data from the Nationwide Inpatient Sample to study seven procedures for which mortality rates have been advocated as quality indicators by the AHRQ. For six of the seven procedures, a very small proportion of US hospitals had adequate caseloads to rule out a mortality rate twice the national average (Fig. 16-2). Although identifying poor-quality outliers is an important function of outcome measurement, focusing on this goal alone significantly underestimates problems with small sample sizes. Discriminating among individual hospitals with intermediate levels of performance is even more difficult.
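To make the caseload problem concrete, the following Python sketch estimates how many cases a hospital would need in order to detect a mortality rate twice a given baseline. It is a rough approximation under stated assumptions and is not the method used in the study cited above: it uses the standard normal-approximation sample size formula for a one-sided, one-sample test of a proportion, the significance level and power are conventional defaults, and the baseline rates in the example are hypothetical. The name `min_caseload` is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_caseload(baseline_rate, ratio=2.0, alpha=0.05, power=0.80):
    """Approximate cases needed to detect a mortality rate `ratio` times baseline.

    Normal-approximation formula for a one-sided, one-sample test of a
    proportion. `alpha` and `power` are conventional defaults (assumptions).
    """
    p0 = baseline_rate          # benchmark (e.g., national average) rate
    p1 = ratio * baseline_rate  # rate the hospital should be able to detect
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("baseline and alternative rates must lie in (0, 1)")
    if p1 == p0:
        raise ValueError("ratio must differ from 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


# Hypothetical baseline mortality rates, for illustration only.
for rate in (0.01, 0.05, 0.10):
    print(f"baseline {rate:.0%}: about {min_caseload(rate)} cases")
```

Under these assumptions, the required caseload rises steeply as baseline mortality falls, which is why procedures with low mortality rates are the hardest to assess at the level of a single hospital.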
Another significant limitation of outcome assessment is the expense of data collection. Reporting outcomes requires the costly collection of detailed clinical data for risk adjustment. For example, it costs over $100,000 annually for a private sector hospital to participate in the ACS-NSQIP. Because of the expense of data collection, the ACS-NSQIP currently collects data on only a sample of patients undergoing surgery at each hospital. Although this sampling strategy decreases the cost of data collection, it exacerbates the problem of small sample size with individual procedures.
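The effect of sampling on precision can be illustrated with a short Python sketch. It computes the half-width of a normal-approximation (Wald) confidence interval for a mortality rate, an approximation that is crude when event rates are low but adequate to show the direction and size of the effect. The caseload, sampling fraction, and mortality rate are hypothetical values, not ACS-NSQIP parameters.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(rate, cases, confidence=0.95):
    """Half-width of a Wald confidence interval for a proportion."""
    if not 0 < rate < 1:
        raise ValueError("rate must lie strictly between 0 and 1")
    if cases <= 0:
        raise ValueError("cases must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(rate * (1 - rate) / cases)


# Hypothetical hospital: 500 cases of one procedure, 5% mortality,
# with only 20% of cases sampled for data collection.
full = ci_half_width(0.05, cases=500)
sampled = ci_half_width(0.05, cases=100)
print(f"interval widens by a factor of {sampled / full:.2f}")
```

Because the interval width scales with the inverse square root of the number of cases, collecting data on one fifth of the cases widens the interval by a factor of about 2.24 (the square root of 5), whatever the underlying mortality rate.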
COMPOSITE MEASURES
Composite measures, created by combining multiple individual quality indicators, are increasingly used in the assessment of surgical quality.6,7,18 Most existing pay-for-performance efforts, including the CMS pilot, use composite measures to assess the quality of care for medical and surgical diagnoses. The Society of Thoracic Surgeons (STS) Measurement Taskforce has created a new composite score that combines elements of outcomes and processes of care into a single measure.18 With growing enthusiasm for this approach, the AHRQ recently published a technical review of composite measures.19
Figure 16-2. Big problems with small samples: the proportion of hospitals in the United States with sufficient caseloads (sample size) to reliably use mortality rates to measure quality.