Saturday, August 2, 2008

The Impact Of Pay-For-Performance On Health Care Quality In Massachusetts, 2001–2003

Few of these early P4P contracts were associated with greater quality improvement than was occurring in practices without such contracts.

by Steven D. Pearson, Eric C. Schneider, Ken P. Kleinman, Kathryn L. Coltin, and Janice A. Singer

ABSTRACT: Pay-for-performance (P4P) has become one of the dominant approaches to improving quality of care, yet few studies have evaluated its effectiveness. We evaluated the impact on quality of all P4P programs introduced into physician group contracts during 2001–2003 by the five major commercial health plans operating in Massachusetts. Overall, P4P contracts were not associated with greater improvement in quality compared to a rising secular trend. Future research is required to determine whether changes to the magnitude, structure, or alignment of P4P incentives can lead to improved quality. [Health Affairs 27, no. 4 (2008): 1167–1176; 10.1377/hlthaff.27.4.1167]

The practice of paying physicians for performance on quality measures has spread rapidly, both nationally and internationally, and has become one of the most prominent policy initiatives aimed at improving the quality of health care.1 In the United States, pay-for-performance (P4P) has become a common part of contracts between private insurers and physicians.2 Purchaser coalitions such as the Leapfrog Group are now also active in measuring and rewarding high-quality health care.3 The Medicare program has implemented P4P for hospitals and is poised to incorporate similar approaches in payments to individual physicians.4 Most editorials and commentaries on P4P have been highly laudatory and optimistic.
5 Others have noted the lack of research demonstrating that P4P improves the quality of care.6 Most studies to date have focused on the impact of single P4P programs, which differ widely in structure, focus, and the magnitude of financial reward or risk.7 Moreover, most P4P initiatives in the United States are still relatively new; most have been introduced through confidential business contracts, which makes it difficult to assess the parameters of the incentives; and nearly all have been introduced without evaluation as a goal, which makes it difficult to assess the independent contribution of such programs to quality improvement.

This study took advantage of a new statewide quality measurement and reporting system to evaluate the performance impact of multiple P4P programs introduced into physician group contracts during 2001–2003 by the five major commercial health plans operating in Massachusetts. In contrast to prior studies, we were able to determine, across a diverse set of P4P programs and physician groups, whether P4P programs were associated with improvements in quality and, if so, whether specific types of programs produced greater improvement than others.

Study Data And Methods

During the study period, P4P contracts between Massachusetts health plans and physician groups were introduced gradually, creating a series of natural experiments in which some physician groups were newly exposed to P4P incentives on particular quality measures while others were not. We used a quasi-experimental design to compare the change in performance on each quality measure among patients of newly incentivized groups to the change in performance on the same measure among patients at groups without any incentive. Because physician groups may vary on many characteristics related to performance, we selected comparison groups based on clinical quality performance in the baseline year, 2001.
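The comparison described above (change among newly incentivized groups versus change among comparison groups) has the shape of a simple difference-in-differences. The Python sketch below illustrates only that arithmetic; it is not the authors' statistical model, which is described in their online technical appendix. The `Rate` class, the function names, and all patient counts are hypothetical and invented for illustration; they are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """A HEDIS-style rate: patients meeting the measure / eligible patients."""
    numerator: int
    denominator: int

    @property
    def value(self) -> float:
        return self.numerator / self.denominator


def change(baseline: Rate, follow_up: Rate) -> float:
    """Change in performance (as a proportion) from baseline to follow-up."""
    return follow_up.value - baseline.value


def difference_in_differences(p4p_base: Rate, p4p_follow: Rate,
                              cmp_base: Rate, cmp_follow: Rate) -> float:
    """Improvement in incentivized groups minus improvement in comparison groups."""
    return change(p4p_base, p4p_follow) - change(cmp_base, cmp_follow)


# Hypothetical counts (NOT study data): both arms improve by the same amount.
p4p_2001, p4p_2003 = Rate(700, 1000), Rate(780, 1000)
cmp_2001, cmp_2003 = Rate(690, 1000), Rate(770, 1000)

did = difference_in_differences(p4p_2001, p4p_2003, cmp_2001, cmp_2003)
print(f"P4P change:        {change(p4p_2001, p4p_2003):+.3f}")
print(f"Comparison change: {change(cmp_2001, cmp_2003):+.3f}")
print(f"Difference:        {did:+.3f}")
```

In this made-up case both arms improve, so looking at the incentivized groups alone would suggest an effect, while the difference between the two changes is essentially zero.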
Based on prior work using the same data, we determined that the number of physicians in a group was not an important predictor of performance.8

Performance data. Massachusetts Health Quality Partners (MHQP) is a nonprofit collaborative of consumers, health care providers, health plans, purchasers, state government, and academic researchers.9 The five health plans participating in MHQP are all nonprofit, cover nearly four million enrollees, and contract with approximately 5,000 primary care physicians (PCPs) in Massachusetts (more than 90 percent of the state’s practicing PCPs). Annually, these five health plans report on measures from the Health Plan Employer Data and Information Set (HEDIS) of the National Committee for Quality Assurance (NCQA). Since 2002, the five health plans have shared HEDIS data with MHQP so that the data can be aggregated at the individual physician level to produce a single annual report summarizing the delivery of primary care services by physician groups in Massachusetts. The vast majority of the data pertain to commercially insured enrollees rather than Medicaid and Medicare enrollees, so data on these latter groups of patients were not included in our study. Thirteen HEDIS measures that were potentially targets of P4P incentives were included in the current study (Exhibit 1).

Study sample. Our methods for identifying physicians and assigning them into groups are described in detail elsewhere.10 The physicians included in these analyses were those listed as a PCP by at least one of the five participating health plans. They include internists, family practitioners, and pediatricians, but also specialists who serve as PCPs for some patients and have dual status according to at least one of the health plans. Each physician has been mapped to a physician group based on health plan data and using physician rosters supplied by the medical groups to validate the mapping.
The initial sample consisted of 5,384 physicians (belonging to 177 physician groups) who had at least one HEDIS denominator observation during calendar year 2001 on at least one of the thirteen HEDIS measures included in the study. We decided a priori to exclude physicians from groups containing fewer than three PCPs (one- or two-physician practices) because of small patient sample sizes and greater difficulty determining their group affiliation. The final sample size was 5,350 physicians practicing in 154 physician groups.

P4P contract data. To gather information on the P4P contracts, one of the investigators (Pearson) performed an annual survey of senior medical staff at each health plan to learn about (1) the magnitude and structure of the P4P incentive, including the amount of money potentially available or at risk in the contract; (2) the structure of the incentive as either a bonus in addition to standard payment, a withhold from standard payment that would be partly or wholly returned based upon performance, or some other mechanism; (3) the HEDIS measures used as performance targets within the contract; (4) whether performance was assessed via attainment of a minimum specified performance level or via improvement from previous performance; and (5) the presence of links in the contract between quality performance and utilization incentives. Data were gathered by a combination of a written data form and follow-up phone conversations with senior medical leaders and contracting staff of the health plans.

Analytic strategy.
The overall strategy was to compare the change in HEDIS performance from 2001 to 2003 among patients cared for by newly incentivized physician groups to the change in performance for patients at “comparison” physician groups: groups that were matched with incentivized groups on their baseline performance in 2001 but that did not subsequently receive any incentive for the particular HEDIS measure(s).11 We considered a physician group to be “incentivized” for performance in a particular calendar year if it had a contract in place prior to 1 April of that year that included a financial incentive based on HEDIS performance during that year. Since HEDIS scores are calculated on a calendar-year basis, P4P contracts were universally framed as applying to calendar-year performance. All performance contracts in our study were formalized in the last few months (October–December) of the preceding calendar year.

Because a common concern regarding P4P programs is that they may involve too little money to create a meaningful incentive, we performed a stratified analysis focused on physician groups exposed to the largest magnitude of financial incentives within our study. A priori, we defined two criteria for selection of a physician group as having “high” incentives: (1) the amount of money at stake for the relevant HEDIS target was more than $100,000 at the group level; or (2) the amount of money at stake for the relevant HEDIS target, when divided equally among PCPs in the group, was more than $1,000 per physician.

Study Results

Number of P4P contracts. In the baseline year, 2001, two of the five health plans had P4P programs in place with 40 (26 percent) of the 154 study physician groups. By 2003, four of the five health plans had P4P programs; all contracts from 2001 had been continued, while new ones had been added for a total of eighteen distinct contracts now involving eighty-one (53 percent) of the same physician groups.

HEDIS target measures.
The most common HEDIS measures used as targets for quality performance in contracts were well-established screening measures such as mammography and chronic care measures such as hemoglobin A1c (HbA1c) testing for patients with diabetes mellitus (Exhibit 1). All eighteen P4P contracts included incentives tied to performance on two or more HEDIS measures, yielding thirty distinct contract-measure pairs for our analysis.

Type and size of incentives. Of the eighteen contracts in place by 2003, only one was based on a withhold; all of the others were framed as bonuses, although five (28 percent) of these were deemed “shared savings” in which the amount of the bonus available was linked to the amount of savings arising from concomitant efforts to reduce health care use in other areas. The size of the incentives tied to quality performance on each HEDIS measure ranged from a low of approximately $200 to a high of approximately $2,500 per PCP. At the group level, the smallest overall amount tied to a P4P contract was approximately $10,000 linked to performance on two HEDIS measures for a small physician group. At the other end of the scale, the largest amount linked to quality performance was $2.7 million in a contract with one of the state’s largest physician organizations, targeting five HEDIS measures.

Overall range of improvement. HEDIS performance improved from 2001 to 2003 on every HEDIS measure, both among patients of groups with new P4P incentives and among those receiving care in comparison groups. Among the thirty contract-measure pairs, twenty-two (73 percent) showed an improvement trend that was statistically indistinguishable between patients seen at groups with P4P and patients at comparison groups.
Four of the contract-measure pairs (13 percent) demonstrated significantly greater improvement among patients in P4P groups than among patients in groups that lacked incentives, but an equal four (13 percent) contract-measure pairs showed less improvement among patients in P4P groups than among patients in groups that lacked incentives. Exhibit 2 shows the performance data for these eight contract-measure pairs that showed statistically significant differences between incentivized and comparison groups.

A distinctive outlier. A qualitative analysis of these eight contracts revealed no obvious distinctive features of “successful” or “unsuccessful” P4P contracts. There was one contract, however, labeled “A” in Exhibit 2, that stood out by demonstrating consistent, and positive, findings across all of its HEDIS targets. This contract had incentives tied to only two diabetes measures. It involved a single medical group of 30–50 PCPs (an approximate number is given here to maintain anonymity). The P4P was structured as a potential bonus of $1 per member per month on top of a fee-for-service basic payment system. Based on this group’s number of health plan patients, the incentive was worth a total of approximately $44,000 to the group. However, this same medical group, through its participation in a large contracting network, also had another contract with incentives linked to performance on these same two HEDIS measures, providing an additional $28,500 as a potential bonus. Combining the incentives from the two P4P contracts therefore resulted in a possible bonus of $72,500 for performance on these measures, approximately $1,900 per PCP. This magnitude of incentive at the PCP level was among the highest in our study.

Improvement trends in “highly incentivized” groups. The percentage of groups that were classified by our a priori criteria as “highly incentivized” varied across HEDIS measures.
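As a rough illustration of the two a priori “high incentive” criteria (more than $100,000 at the group level, or more than $1,000 per PCP when divided equally), the Python sketch below applies them to the combined bonus reported for contract “A”. The function and constant names are hypothetical. Because the group's exact size is withheld for anonymity, the sketch checks only the two ends of the stated 30–50 PCP range; the 20-PCP group in the last line is an invented example, not a group from the study.

```python
GROUP_THRESHOLD = 100_000    # dollars at stake at the group level (criterion 1)
PER_PCP_THRESHOLD = 1_000    # dollars at stake per PCP (criterion 2)


def per_pcp_amount(amount_at_stake: float, n_pcps: int) -> float:
    """Amount at stake divided equally among the group's PCPs."""
    if n_pcps <= 0:
        raise ValueError("a group must contain at least one PCP")
    return amount_at_stake / n_pcps


def is_high_incentive(amount_at_stake: float, n_pcps: int) -> bool:
    """True if either a priori criterion is met (both use strict 'more than')."""
    return (amount_at_stake > GROUP_THRESHOLD
            or per_pcp_amount(amount_at_stake, n_pcps) > PER_PCP_THRESHOLD)


# Contract "A": the two bonuses reported in the text, combined.
combined_bonus = 44_000 + 28_500

# Exact group size is anonymized; check both ends of the reported range.
for n_pcps in (30, 50):
    share = per_pcp_amount(combined_bonus, n_pcps)
    print(f"{n_pcps} PCPs: ${share:,.0f} per PCP, "
          f"high incentive = {is_high_incentive(combined_bonus, n_pcps)}")

# Invented example: $10,000 spread over a hypothetical 20-PCP group.
print(is_high_incentive(10_000, 20))
```

At either end of the reported size range the combined bonus falls below the group-level threshold but above the per-physician one, so the group would count as highly incentivized through the second criterion alone.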
For example, none of the groups with an incentive for performance on pediatric asthma care qualified as highly incentivized. For breast cancer screening, cervical cancer screening, and well-child adolescent visits, the proportion of highly incentivized groups among all intervention groups was 16–17 percent; for chlamydia screening, diabetes low-density lipoprotein measurement, and well-child visits for children ages 3–6, the proportions were 45–68 percent; and for the other diabetes care measures, 95 percent. Highly incentivized groups did not demonstrate superior quality improvement compared to comparison groups. Improvement trends were similar to matched comparison groups for all thirteen HEDIS measures. Improvement was small for some HEDIS measures, such as breast cancer screening, and quite large for others, such as chlamydia screening for women ages 21–26 (Exhibit 3).

Discussion

Our study was designed to take advantage of the gradual implementation of P4P contracts between health plans and physician groups throughout Massachusetts during 2001–2003. P4P incentives spread rapidly during those years, with all but one major health plan actively engaged in P4P contracts with at least some physician groups by 2003. Notably, during the study period, clinical quality improved across all HEDIS measures among most Massachusetts physician groups, whether or not they had P4P incentives. A study that did not include comparison groups would conclude that initiation of P4P incentives was associated with notable performance improvement, but our results suggest that few P4P contracts were associated with greater improvement than was occurring in other practices throughout the state.
One P4P contract with a single medical group was associated with superior improvement in quality performance on both of its HEDIS targets for diabetes care, but whether this case represented a chance finding or whether there was something special about this contract and medical group is impossible to ascertain.

Our results should not obscure, however, the substantial improvement seen in HEDIS performance measures across group practices in Massachusetts from 2001 to 2003. Clearly something was going on, but what? It is possible that the general publicity attending P4P, or the anticipation of public reporting of quality performance at the physician-group level, might have changed the practice atmosphere throughout the state, leading all physicians to believe that performance on HEDIS measures would soon, in some way, affect their incomes. Other forces, such as growth in the use of electronic medical records, could also have contributed to quality improvement. But it should be noted that the improvement in HEDIS scores during 2001–2003 was a national and not just a state or regional phenomenon; the improvements in Massachusetts were not strikingly different from those noted in national averages during these same years.12

Size of incentives. Was P4P an essential element driving the secular trend toward quality improvement? Our study design cannot answer this question. But we found no relationship between the magnitude of quality improvement and specific P4P contracts. One key question for P4P policy is the amount of money needed to motivate greater improvements in clinical quality. There is no predefined standard for what amount constitutes an “adequate” incentive.
The most recent national data suggest that approximately 40 percent of P4P contracts nationally may include a maximum bonus greater than 5 percent of physicians’ income.13 The new family practitioner contract in the United Kingdom is structured to link quality performance for each physician to a maximum of approximately $139,000. Since the average U.K. practitioner earns $122,000–$131,000 per year, the maximum quality bonus would more than double his or her income.14 In general, the financial incentives we studied amounted to less than $1,000 per PCP for improvement on a particular HEDIS measure. The incentives linked to diabetes care measures, when combined together, were usually $1,000–$2,000 per physician. A recent study by our research group suggested that Massachusetts groups had, on average, 2.2 percent of income at risk in P4P incentives.15 Even among groups with this larger incentive, we found no association between P4P and superior quality improvement, which suggests that P4P contracts in Massachusetts might not have put enough money at stake to drive additional quality improvement beyond the existing improvement trend.

Caveats. Ours is the first study to evaluate the cumulative impact of multiple P4P contracts on physician groups, but it is limited to Massachusetts, and our results might not generalize to other P4P programs and regions. We could not determine whether or how the incentives were passed on to individual physicians within each group. Nor did we have access to information on the amount of money actually paid out by the health plans to groups achieving P4P targets.

Policy implications. Like other studies, ours suggests that the initial generation of P4P contracts may have lacked key ingredients necessary to have, independent of other secular forces, a notable impact on quality performance.
However, P4P can be viewed as an integral part of recent changes to medical practice that also include public reporting of quality, tiering of physician networks, and other mechanisms that create an explicit or implicit link between physician performance and future income. It seems unlikely that this approach will be abandoned. Instead, we believe that the results of our study and others will be used to guide insurers, physicians, and policymakers involved in the design of future P4P efforts.

Our results have several specific implications. First, incentives may need to exceed $2,000 per physician or $100,000 per physician group, or both, the levels at which incentives were found to be largely ineffective in Massachusetts. Second, unlike the natural experiment of this study, in which many physician groups had multiple incentive contracts with varying targets and thresholds, private health plans seeking greater impact for their performance incentives might need to align incentives for physician groups around a core set of quality targets. And, finally, given the financial and clinical risks of incentives, careful evaluations that include control or comparison groups should be an important part of future efforts to evaluate the impact of P4P programs.

This study would not have been possible without the active engagement of the health plans and physician groups participating in Massachusetts Health Quality Partners. It was funded by a grant from the Robert Wood Johnson Foundation. The sponsor had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript. Steven Pearson is a consultant for America’s Health Insurance Plans (AHIP). He, Ken Kleinman, and Kathryn Coltin are employees of Harvard Pilgrim Health Care, one of the health plans involved in this study. None of the other authors has any potential conflict of interest to disclose.
Pearson had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

NOTES

1. A.M. Epstein, “Paying for Performance in the United States and Abroad,” New England Journal of Medicine 355, no. 4 (2006): 406–408. See also P. McNamara, “Quality-Based Payment: Six Case Examples,” International Journal for Quality in Health Care 17, no. 4 (2005): 357–362; and M.B. Rosenthal et al., “Paying for Quality: Providers’ Incentives for Quality Improvement,” Health Affairs 23, no. 2 (2004): 127–141.
2. Rosenthal et al., “Paying for Quality.” See also M.B. Rosenthal et al., “Pay for Performance in Commercial HMOs,” New England Journal of Medicine 355, no. 18 (2006): 1895–1902.
3. Leapfrog Group, “Leapfrog Hospital Rewards Program,” https://leapfrog.medstat.com/rewards (accessed 10 January 2008).
4. J.K. Iglehart, “Linking Compensation to Quality—Medicare Payments to Physicians,” New England Journal of Medicine 353, no. 9 (2005): 870–872; and K. Milgate and S.B. Cheng, “Pay-for-Performance: The MedPAC Perspective,” Health Affairs 25, no. 2 (2006): 413–419.
5. See Epstein, “Paying for Performance in the United States and Abroad”; T. Fong, “Unfulfilled Potential: More Performance Pay Would Improve Care: NCQA,” Modern Healthcare 34, no. 39 (2004): 12; A.M. Epstein, T.H. Lee, and M.B. Hamel, “Paying Physicians for High-Quality Care,” New England Journal of Medicine 350, no. 4 (2004): 406–410; and Institute of Medicine, Rewarding Provider Performance: Aligning Incentives in Medicare (Washington: National Academies Press, 2006).
6. See L.A. Petersen et al., “Does Pay-for-Performance Improve the Quality of Health Care?” Annals of Internal Medicine 145, no. 4 (2006): 265–272; M.B. Rosenthal and R.G. Frank, “What Is the Empirical Basis for Paying for Quality in Health Care?” Medical Care Research and Review 63, no. 2 (2006): 135–157; and M.B.
Rosenthal et al., “Early Experience with Pay-for-Performance: From Concept to Practice,” Journal of the American Medical Association 294, no. 14 (2005): 1788–1793.
7. R.A. Dudley, “Pay-for-Performance Research: How to Learn What Clinicians and Policy Makers Need to Know,” Journal of the American Medical Association 294, no. 14 (2005): 1821–1823.
8. M.W. Friedberg et al., “Does Affiliation of Physician Groups with One Another Produce Higher Quality Primary Care?” Journal of General Internal Medicine 22, no. 10 (2007): 1385–1392.
9. For details, see the MHQP home page at http://www.mhqp.org.
10. Friedberg et al., “Does Affiliation of Physician Groups with One Another Produce Higher Quality Primary Care?”
11. For a technical description of the analytic strategy and statistical methods, see online Technical Appendix A at http://content.healthaffairs.org/cgi/content/full/27/4/1167/DC1.
12. National Committee for Quality Assurance, The State of Health Care Quality, 2004 (Washington: NCQA, 2004).
13. Rosenthal et al., “Pay for Performance in Commercial HMOs.”
14. T. Doran et al., “Pay-for-Performance Programs in Family Practices in the United Kingdom,” New England Journal of Medicine 355, no. 4 (2006): 375–384.
15. A. Mehrotra et al., “The Response of Physician Groups to P4P Incentives,” American Journal of Managed Care 13, no. 5 (2007): 249–255.

If you want to view the exhibits, please contact me at marcy@z-doc.com and I will send them directly.
