Health care systems around the world are beginning to shift from paying for any and all care provided to incentivizing the provision of high-quality care. However, pay-for-performance programs create their own problems.
The reinvention of health care begins with efforts to provide efficient care that simultaneously improves patient outcomes. We are all unknowing participants in at least one pilot program, if not several, involving systemic redesigns aimed at improving quality.
The authors of this study investigated the first five years (2001 through 2005) of a pay-for-performance (P4P) program for diabetes patients in Taiwan’s National Health Insurance Program. Taiwan’s P4P program is based on a system of three bonuses. The first is a bonus for the initial enrollment and check-up, available only once per patient, which pays for medical history collection, laboratory evaluation, physical examination, creation of a management plan, and self-management education. The second is a follow-up management bonus, available once every three months, for assessing the treatment regimen, physical examination, laboratory evaluation, evaluation of the management plan, and self-management education. The third is an annual evaluation and report bonus, paid after a patient’s full year of enrollment in the P4P program once the provider confirms eight blood tests (HbA1c, glucose, total cholesterol, HDL, LDL, triglycerides, creatinine, and liver function) as well as the urine microalbumin test and an eye exam.
The initiative demonstrated improvements in quality of care for enrolled patients, producing 100 percent or nearly 100 percent adherence to all process measures. However, these results appear questionable because of selection bias inherent in the program. Over five years, the program enrolled only thirty percent of patients with diabetes; the P4P program’s design encouraged physicians to avoid enrolling the most complicated patients. Enrolled patients had significantly lower Diabetes Complication Severity Index (DCSI) scores than never-enrolled patients. Never-enrolled patients showed higher DCSI scores at baseline, and their scores rose faster than those of enrolled patients. By allowing this type of systematic selection bias, where the healthiest patients were “cherry-picked” by physicians to be included in process measure reporting, the program let physicians game the system and potentially reap the rewards of higher pay-for-performance payments without actually improving care for all diabetic patients.
Unrefined versions of quality improvement programs are susceptible to abuses including “system gaming” (over-, under-, or misreporting), “cherry-picking” (selecting the healthiest patients for outcomes reporting), and “searching under the lamp post” (where providers focus on incentivized conditions to the detriment of the patient’s overall health).
In this study, the P4P program failed to account for patient death; instead, it excluded from analysis any patient who did not complete a full year of enrollment, for any reason. This selective outcomes reporting effectively created another source of bias. Finally, despite nearly 100 percent compliance for measured processes, the P4P program showed a slight increase in DCSI scores for the enrolled group, suggesting no improvement in patients’ overall health despite the apparent gains in compliance. While this outcome could have resulted from inflated baseline scores as a function of initial selection bias, it also calls into question the use of process measures and intermediate outcomes to measure clinically significant health care quality.
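To make the two exclusions described above concrete, here is a minimal sketch in Python of how a reported adherence rate can drift away from what happens in the full patient population. Everything in it is hypothetical: the `Patient` record, the function names, and the ten-patient cohort are invented for illustration and are not data from the Taiwanese study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    enrolled: bool        # physician enrolled the patient in the P4P program
    completed_year: bool  # stayed enrolled a full year (False if died or left)
    tests_done: bool      # received all incentivized tests during the year


def _rate(patients: List[Patient]) -> Optional[float]:
    """Share of the given patients who received all tests; None if empty."""
    if not patients:
        return None
    return sum(p.tests_done for p in patients) / len(patients)


def reported_rate(patients: List[Patient]) -> Optional[float]:
    """Adherence among enrolled patients who completed a full year."""
    return _rate([p for p in patients if p.enrolled and p.completed_year])


def enrolled_rate(patients: List[Patient]) -> Optional[float]:
    """Adherence among everyone ever enrolled, including early exits."""
    return _rate([p for p in patients if p.enrolled])


def population_rate(patients: List[Patient]) -> Optional[float]:
    """Adherence among all patients, enrolled or not."""
    return _rate(patients)


# Hypothetical cohort (made-up numbers, NOT from the study):
# 3 enrolled completers, 1 enrolled patient who did not finish the year,
# and 6 never-enrolled patients, 2 of whom still received all tests.
cohort = (
    [Patient(True, True, True)] * 3
    + [Patient(True, False, False)]
    + [Patient(False, False, True)] * 2
    + [Patient(False, False, False)] * 4
)

print(reported_rate(cohort))    # 1.0
print(enrolled_rate(cohort))    # 0.75
print(population_rate(cohort))  # 0.5
```

In this invented cohort, the figure a completer-only report would show is perfect adherence, even though one enrolled patient dropped out of the denominator and most patients were never counted at all. The point is only the mechanism: each exclusion shrinks the denominator in a way that flatters the result.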
Bias and the uncertain clinical significance of intermediate measures both contributed to these questionable results. Yet the real problem with the Taiwanese P4P program began with the health care culture and a failure to incentivize cultural changes. The authors cite system characteristics such as high workloads, short patient visits (2 to 5 minutes), and “doctor shopping” as issues affecting a doctor’s ability to offer continuity of care. Without cultural change, infrastructural support, and adequate staffing, putting the burden of quality improvement on individual physicians may create frustration and reduce buy-in.
Successful programs should not merely offer incentives for physicians to overcome systemic problems; rather, P4P should incentivize cultural redesigns that remove barriers to quality care.