Fig. 3.1
Map of Vermont demonstrating hospital service areas. Dark lines represent boundaries of hospital service areas; areas without dots are served principally by New Hampshire hospitals (Reproduced with permission according to JSTOR, Gittelsohn [2])
Wennberg’s findings were striking. Tonsillectomy rates per 10,000 persons, adjusted for age, varied from 13 in some regions to 151 in others. A similar extent of variation was seen in appendectomy (10–32 per 10,000 population), prostatectomy (11–38 per 10,000 population), and hysterectomy (20–60 per 10,000 population). When Wennberg looked for the cause of these variations, he found a simple but elegant explanation: the more physicians and hospitals in a service area, the more services they provided. These relationships held fast across a broad variety of measures – number of procedures, population size, and number and type of specialists. What Wennberg did not find were large differences in patients across the communities in Vermont. Patients, overall, were similar – but the amount of care they received was not.
Wennberg concluded, in this early work, that there are wide variations in resource input, utilization of services, and expenditures – even in neighboring communities. Further, these variations in utilization seemed to be directly related to considerable uncertainty about the effectiveness of specific health services. His “prescription” for these uncertainties, pursued over the next 40 years, was to use informed choice and patient decision-making to limit variations in care.
3.3 Gaining Momentum: Bigger Is Better
Building on these initial analyses, Wennberg and colleagues sought to broaden their work from a small state in New England to more representative – and generalizable – insights about the extent of variation occurring across the United States. Accordingly, Wennberg, John Birkmeyer and colleagues used an aggregate of the hospital service area – called the hospital referral region – to study variation in common surgical procedures [3]. While the hospital service area studied care at the level of neighborhoods and communities, the hospital referral region (n = 306 across the United States) studied care at the level of a regional referral center. And, instead of using data from one state, Medicare claims were selected to provide a national, generalizable view of variations in surgical care.
Birkmeyer’s findings centered on two important principles. First, just as Wennberg found dramatic variation across different – and sometimes neighboring – communities in Vermont, Birkmeyer found dramatic variation across different hospital referral regions across the United States. For example, as shown in Fig. 3.2, rates of carotid endarterectomy varied nearly three-fold across different regions of the United States. Maps demonstrating regional variation, inspired by the work of investigators in the Dartmouth Atlas, became a near-universal way to display regional differences in utilization. Darker areas represented areas where procedures were performed more commonly, and lighter areas represented the areas where procedures were performed less commonly. These representations brought the differences into stark contrast, and one cannot help looking at the map to see what color – and utilization rate – is reflected in the region one calls home.
Fig. 3.2
Map demonstrating variation in rates of carotid endarterectomy across the 306 hospital referral regions of the United States (Reproduced with permission from Elsevier, Birkmeyer et al. [3])
The second important finding of this work was that the extent of variation differed across types of operations. As shown in Fig. 3.3, there were certain operations for which consensus existed about when to proceed with surgery. Hip fracture demonstrated this axiom quite nicely, and unsurprisingly so. The indication for surgery is clear in this setting, as a hip fracture is easy to diagnose. The benefits are easily seen as well, as all but the most moribund patients do better with surgery than with non-operative care. Therefore, there is little variation across the United States in the utilization of hip fracture surgery. Figure 3.3 demonstrates this concept by showing each hospital referral region (HRR) as a dot, and listing the procedures across the x-axis. All HRRs cluster closely together for procedures like hip fracture.
Fig. 3.3
Variation profiles of 11 surgical procedures, demonstrating the ratio of observed to expected Medicare rates in the 306 hospital referral regions of the United States. Rates are adjusted for age, sex, and race, with high and low outlier HRRs distinguished by dotted lines (Reproduced with permission from Elsevier, Birkmeyer et al. [3])
However, for procedures like carotid endarterectomy, back surgery, and radical prostatectomy, the HRRs spread over a much wider range. These procedures, unlike hip fracture, are much more discretionary in their utilization. In general, it is evident that procedures with the highest degree of variation reflect areas of substantial disagreement about both diagnosis (what does an elevated PSA really mean?) and treatment (is back surgery really better than conservative treatment?). Dealing with this variation, Birkmeyer argues, will require better understanding of surgical effectiveness, patient-specific outcome assessment, and a more thorough understanding of patient preferences. Patients, clinicians, payers, and policymakers all will need to work together, he argues, to determine “which rate is right.”
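The observed-to-expected ratios plotted in Fig. 3.3 can be illustrated with a small worked example. The sketch below uses indirect standardization, one common way to compute such a ratio; it is not necessarily the exact method used in the cited study. For brevity it stratifies by age only, whereas the figure reports adjustment for age, sex, and race. All rates, population counts, and function names are hypothetical.

```python
# Illustrative sketch of indirect standardization; all numbers are hypothetical.

# Assumed reference (e.g., national) procedure rates per 10,000 persons, by age stratum.
REFERENCE_RATES_PER_10K = {"65-74": 20, "75-84": 40, "85+": 30}


def expected_count(population_by_stratum, reference_rates_per_10k):
    """Procedures expected if the region matched the reference rate in every stratum."""
    return sum(
        reference_rates_per_10k[stratum] * persons / 10_000
        for stratum, persons in population_by_stratum.items()
    )


def observed_to_expected(observed, population_by_stratum, reference_rates_per_10k):
    """Ratio above 1 means more procedures than the region's demographic mix predicts."""
    expected = expected_count(population_by_stratum, reference_rates_per_10k)
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Hypothetical region: population counts by age stratum and 86 observed procedures.
region_population = {"65-74": 10_000, "75-84": 5_000, "85+": 1_000}
region_expected = expected_count(region_population, REFERENCE_RATES_PER_10K)
region_ratio = observed_to_expected(86, region_population, REFERENCE_RATES_PER_10K)
print(f"expected={region_expected:.1f}, observed/expected={region_ratio:.2f}")
```

In this made-up region, twice as many procedures are performed as its age mix would predict. A procedure such as hip fracture repair would show ratios clustered near 1 across regions, while a discretionary procedure would show ratios spread well above and below 1.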
3.4 Innovating Approaches, and Integrating Ideas – From Medicine to Surgery
After these publications in the early 1990s, Wennberg and his colleagues spent the next decade refining analytic methods, and incorporating what seemed to be a recurrent theme in their work: that there was significant variation in the provision of medical care, and more care was not necessarily associated with better outcomes. But critics wondered if this work, limited in clinical detail, actually reflected different care on similar patients – because clinical variables for risk adjustment were commonly unavailable. To deal with these limitations, researchers began to use clinical events – such as death – to create cohorts similar in risk strata.
In the most prominent of these approaches, Wennberg and Fisher created cohorts of patients who were undergoing care – medical, surgical and otherwise – at the end of life [4, 5]. By studying care provided in the last year of life, they argued, all patients in the cohort had similar 1-year mortality – 100 % – therefore limiting the effect of any unmeasurable confounders. This research, published in 2003 and widely referenced, concluded that nearly 30 % of spending on end of life care offers little benefit, and may in fact be harmful.
Surgeons were quick to translate these innovative approaches, and integrate these ideas into surgical analyses. In a manuscript published in the Lancet in 2011, Gawande, Jha and colleagues adopted this technique and studied surgical care in the last year of life [6]. They had two basic questions. First, they asked if regional “intensity” of surgical care varied by the number of hospital beds, or by the number of surgeons in a region. And second, they examined relationships between regional surgical intensity and regional mortality and spending rates.
Their team found that nearly one in three Medicare patients underwent a surgical procedure in the last year of life, and that this proportion was related to patient age (Fig. 3.4). Regions with the highest number of beds were most likely to operate on patients in the last year of life (R = 0.37), as were regions where overall spending in the last year of life was highest (R = 0.50). These findings reinforced earlier considerations about the need for patient-specific outcomes, and patient preferences in the provision of care at the end of life.
Fig. 3.4
Percentage of 2008 elderly Medicare decedents who underwent at least one surgical procedure in the last year of life (Reproduced with permission from Elsevier, Kwok et al. [6])
3.5 Specialty Surgeons and Their Efforts in Describing and Limiting Variation
Many of the previously described investigations approached the subject of surgical variation using broad strokes – studying procedures as diverse as hip fracture, lower extremity bypass, and hernia repair, all within the same cohorts. These approaches garnered effective, “big-picture” results, and surgeons grew interested in studying variation. Just as Wennberg sought to establish precise detail in the level of variation, surgeons now grew interested in exploring the different extent and drivers of variation across different specialties. In this section, we discuss areas of subspecialty variation, including spine surgery and vascular surgery.
3.5.1 Variation in Spine Surgery
Patients presenting with back pain are a diverse cohort, and treatment with surgery is used at different rates in different parts of the country. As interest in studying the extent of variation and its causes began to build momentum, Weinstein and colleagues explored variation in the use of spine surgery for lumbar fusion [7]. These interests were brought to the fore with the development of devices such as prosthetic vertebral implants and biologics such as bone morphogenetic protein, all placed into everyday practice with a dearth of high-quality evidence from randomized trials.
Weinstein and colleagues saw these changes occurring in “real-time”, in the context of their clinical interests as spine surgery specialists. They found that rates of spine surgery rose dramatically between 1993 and 2003. By 2003, Medicare spent more than one billion dollars on spine surgery. In 1992, lumbar fusion accounted for 14 % of this spending, and by 2004, fusion accounted for almost half of total spending on spine surgery (Fig. 3.5). These observations led them to investigate the extent of this variation. What they found was truly remarkable. As shown in Fig. 3.6, there was nearly a 20-fold range in the rates of lumbar fusion across different hospital referral regions – the largest coefficient of variation reported with any surgical procedure to that date, a value five-fold greater than the variation seen in patients undergoing hip fracture repair. These data served to motivate extensive funding for SPORT (the Spine Patient Outcomes Research Trial), one of the largest continually funded randomized trials supported by the National Institutes of Health [8].
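Two summary measures appear in this comparison: the fold range (highest regional rate divided by lowest) and the coefficient of variation (standard deviation divided by mean). The sketch below shows how each could be computed from a list of regional rates, assuming the textbook unweighted definitions; published analyses may weight regions by population or use other variants. The rates and function names are hypothetical and are not taken from the cited study.

```python
import statistics


def fold_range(rates):
    """Highest regional rate divided by the lowest (the 'extremal ratio')."""
    lowest = min(rates)
    if lowest <= 0:
        raise ValueError("rates must be positive to compute a fold range")
    return max(rates) / lowest


def coefficient_of_variation(rates):
    """Population standard deviation divided by the mean (unweighted)."""
    return statistics.pstdev(rates) / statistics.mean(rates)


# Hypothetical procedure rates per 10,000 enrollees in four regions.
discretionary_rates = [1.0, 5.0, 10.0, 20.0]   # wide spread, e.g. an elective procedure
consensus_rates = [9.0, 10.0, 10.0, 11.0]      # tight spread, e.g. a clear indication

discretionary_fold = fold_range(discretionary_rates)
discretionary_cv = coefficient_of_variation(discretionary_rates)
consensus_cv = coefficient_of_variation(consensus_rates)
print(f"fold range={discretionary_fold:.1f}, "
      f"CV={discretionary_cv:.2f} vs {consensus_cv:.2f}")
```

The fold range depends only on the two most extreme regions, so it is sensitive to outliers, while the coefficient of variation uses every region. That difference is one reason both measures are often reported together.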