Missing the Mark on Pay-for-Performance: Questionable Validity in Quality Metrics Limits Medicare P4P Program in Orthopedic Surgery

This study, using administrative data, aimed to evaluate the metrics used in a Medicare Pay-for-Performance (P4P) project in 2003, which included 260 hospitals across 38 states, with plans to expand nationally in 2009.

Source: Army Medicine (Flickr/CC)

Arthroscopic surgery is an ideal candidate condition for evaluation for the following reasons: (1) the procedure is well-defined, common and costly; (2) treatment guidelines/clinical pathways exist; (3) outcomes are well-studied; and (4) structural aspects of care have been described (e.g., relationship between higher volume and improved outcomes). Using a composite measure at the hospital level, the Centers for Medicare and Medicaid Services (CMS) ranked participating hospitals and paid a bonus of 2% and 1% of diagnosis related group (DRG) payments to hospitals scoring in the top 10% and 20%, respectively. Other hospitals in the top 50% were publicly lauded on a website.
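To make the incentive structure concrete, here is a minimal Python sketch of the tiering described above. It rests on several assumptions that are not in the study: the function name `assign_tiers` and the hospital labels are hypothetical, the 1% tier is read as hospitals in the top 20% but outside the top 10%, bonus recipients are assumed to be publicly recognized as well, and ties are broken arbitrarily.

```python
def assign_tiers(scores):
    """Map each hospital to (bonus_rate, publicly_recognized).

    `scores` is a dict of hospital name -> composite score (higher is better).
    Hypothetical helper for illustration only, not the CMS implementation.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for position, hospital in enumerate(ranked):
        rank = position + 1  # 1 = best composite score
        # Integer comparisons avoid floating-point edge cases at tier cutoffs.
        if rank * 10 <= n:    # top 10%
            tiers[hospital] = (0.02, True)
        elif rank * 5 <= n:   # next 10%, i.e. the rest of the top 20% (assumed)
            tiers[hospital] = (0.01, True)
        elif rank * 2 <= n:   # rest of the top 50%: recognition only
            tiers[hospital] = (0.0, True)
        else:
            tiers[hospital] = (0.0, False)
    return tiers


if __name__ == "__main__":
    # Ten made-up hospitals with made-up scores, purely for illustration.
    demo = {f"H{i}": 100 - i for i in range(10)}
    for name, (bonus, recognized) in assign_tiers(demo).items():
        print(name, f"bonus={bonus:.0%}", f"recognized={recognized}")
```

Note that when composite scores cluster near the ceiling, the cutoffs in a scheme like this separate hospitals whose scores barely differ.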

To validate the CMS composite measure and ranking scheme, the authors reconstructed both the measures and the rankings and correlated them with three external measures: inpatient mortality data, risk-adjusted iatrogenic complications and urinary tract infections, and surgical volume (a proxy measure for quality). Three major findings call into question the use of the CMS methodology for measuring quality in the P4P program. First, the performance of providers on the authors' selected outcome measures was virtually indistinguishable. Second, the CMS quality composite score depended exclusively on 3 process measures, all related to antibiotic use, a particularly problematic process measure that has not been well accepted in the orthopedic community. Lastly, mortality (r=0.116, p=0.088) and complication rates (r=0.268) did not correlate strongly with CMS rank order. In a notable analogy, the authors compare the measures to a grade school report card on which nearly everyone receives an “A” (for example, 100% of hospitals scored a “99%” in metabolic & hematoma index). How’s that for grade inflation!
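For readers less familiar with this kind of validation, the sketch below shows one common way to check whether two orderings agree: a rank (Spearman-style) correlation. This is an illustration under assumptions, not the authors' documented procedure. The summary above does not say which correlation statistic they used, and the function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(x, y):
    """Spearman-style correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A value near 1 or -1 means the two measures order hospitals almost identically (or in reverse), while a value near 0, like those reported above, means that knowing a hospital's position on one measure says little about its position on the other.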

Commentary:

The authors set out to evaluate an assessment tool used in orthopedic surgery for a Medicare P4P initiative and found that the metrics do not measure quality (at least not well enough to discriminate between ‘good’ and ‘poor’ performers). The findings are particularly salient because, as the authors point out, health reformers continue to call for P4P programs as a means to improve the quality of care and reduce costs, yet the evidence for such an assertion is lacking.

Measuring quality of care, pioneered by Robert Brook and Katherine Kahn over 20 years ago, forms the foundation for P4P and remains challenging and imperfect. Some would argue that our zeal to improve quality has led us to measure anything and everything, validity be damned. Others feel the quality of care movement has benefited from the proliferation of quality metrics and initiatives, which has gotten providers accustomed to being measured.

The authors identify several shortcomings of CMS metrics and a logical way forward toward using quality measurement to improve quality, namely public reporting. Hibbard et al. showed that while measurement is laudable and feedback is imperative, public reporting is critical to quality improvement. The same cannot be said for P4P…yet.

Paying “good performing surgeons” slightly more, or even significantly more, than “poor performing surgeons” based essentially on whether an antibiotic is started or stopped on time will not necessarily improve the “value” of surgical care. It is now the role of researchers and doctors to do the hard work required to understand how “high quality care” can be validly measured for comparison. Furthermore, we need to know which tools (e.g., financial incentives, reputation, regulation) are best used to improve quality.

Health Affairs, 28, no. 2 (2009): 526-532 

by

Stanley Frencher, Jr., MD, MPH
