Data and Outcomes - U.S. Programs

Most early published domestic studies show either a positive or a partially positive result (119, 182, 188, 197, 269, 273). These studies were conducted in hospitals and/or clinics, and the majority used measures based on the acute myocardial infarction, congestive heart failure, and diabetes core measures. Literature analyzing multiple P4P trials also shows at least partially positive results (18, 87, 115, 177, 197, 312). Mixed or neutral results were seen in two studies (87), and no study showed negative quality improvement (worsening care) after P4P implementation. When negative results are present, they take the form of unintended consequences and moral and ethical challenges. Many unintended consequences are hypothetical but still pose a real threat to P4P; they are covered in the "Controversial Issues" section of this website. A wealth of literature also examines the link between performance metrics and desired outcomes without added financial incentives. Because this website is dedicated to Pay for Performance, we have included only a few key articles from that area of research (51, 52, 120, 258).

Literature

Key Articles: 85, 87, 115, 182, 188, 316

Key Article

(18) Petersen LA, Woodard LCD, Urech T, Daw C, Sookanan S. Does Pay-for-Performance Improve the Quality of Health Care? Annals of Internal Medicine. 2006: 145(4) 265-272.

PMID: 16908917

Summary:

Summarizes 17 P4P effectiveness studies.
5/6 studies of physician-level incentives showed partial or positive effects on quality.
7/9 studies of provider group-level incentives showed partial or positive effects on quality.
1/2 studies of payment-system level incentives showed a positive effect on access to care, while the other showed a negative effect on access to care for the sickest patients.
4 studies suggest unintended effects.
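
As a quick arithmetic check on the counts listed above, the following minimal sketch tallies the studies by incentive level. The variable and function names are illustrative only and do not come from the article; note that the payment-system result concerns access to care rather than quality.

```python
# Study counts as listed in the summary above:
# (incentive level, studies with a partial or positive effect, studies reviewed).
# The payment-system row reports an effect on access to care, not quality.
LEVELS = [
    ("physician", 5, 6),
    ("provider group", 7, 9),
    ("payment system", 1, 2),
]


def tally(levels):
    """Return (positive, total) summed across incentive levels."""
    positive = sum(pos for _, pos, _ in levels)
    total = sum(n for _, _, n in levels)
    return positive, total


positive, total = tally(LEVELS)
share = positive / total
print(f"{positive} of {total} studies ({share:.0%}) showed a partial or positive effect")
```

The three levels sum to the 17 studies the review covers, with 13 of them reporting a partial or positive effect.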

Significance to Literature:

In 2006, P4P appeared to have some initial positive outcomes and some unintended effects. Ongoing monitoring is recommended.

(51) Bradley EH, et al. Hospital Quality for Acute Myocardial Infarction. JAMA. 2006: 296(1) 72-78.

PMID: 16820549

Summary:

Authors sought to determine correlation between current process measures for acute MIs and 30-day post-MI mortality rates.
Found moderately strong correlations between some measures pertaining to pharmaceutical treatments and survival.
However, the process measures accounted for only 6% of hospital-level variation in acute MI mortality rates; risk standardization was far more important in predicting 30-day mortality.

Significance to Literature:

CMS and JCAHO acute MI core process measures in 2002-2003 capture only a small proportion of the variation in hospitals’ risk-standardized short-term mortality rates.

(52) Shojania et al. Effects of Quality Improvement Strategies for Type 2 Diabetes on Glycemic Control. JAMA. 2006: 296(4) 427-440.

PMID: 16868301

Summary:

A meta-regression analysis of 66 trials of type 2 diabetes interventions, using HbA1c as the outcome measure.
Found team-based approach and case management produced the most effective outcome gains.

Significance to Literature:

While most QI strategies produced small to modest changes, team-based approaches and case management show the greatest improvements.

(57) Smith S. 2007 quality scores stall. Minnesota Medical Association: Quality review. Winter, 2008.

Link: http://www.mmaonline.net/Portals/mma/Publications/QualityReview/MMAQuali...

Summary:

In 2007, outcome scores did not increase in Minnesota for the first time in 4 years.
One suggested cause is the increasing number of participating physicians: results no longer reflect only the most advanced practices.
Another suggested cause is a possible ceiling effect. “Lots of groups have done the easy things to improve quality.”

Significance to Literature:

Statewide report of sustained P4P over four years with possible ceiling effect.

Key Article

(85) Rosenthal MB, Landon BE, Normand SLT, Frank RG, Epstein AM. Pay for Performance in Commercial HMOs. NEJM. 2006: 355(18) 1895-1902.

PMID: 17079763

Summary:

Outlines the usage of P4P in 2006:
More than 50% of HMOs were using P4P, representing over 80% of persons enrolled.
Of the 126 health plans with P4P programs, 90% reimburse physicians and 38% reimburse hospitals.
Matching patients with primary care providers was highly associated with use of P4P.

Significance to Literature:

Offers a comprehensive overview of the growth of P4P in commercial HMOs in 2006 and recommends CMS leverage their early experience.

Key Article

(87) Glickman et al. Pay for Performance, Quality of Care, and Outcomes in Acute Myocardial Infarction. JAMA. 2007: 297(21) 2373-2380.

PMID: 17551130

Summary:

Asks what incremental benefit, if any, P4P provides and whether P4P had any adverse effects.
An observational study of 105,383 patients with acute non-ST-segment elevation myocardial infarction compared care processes and outcomes at P4P and non-P4P hospitals.
Study showed no statistically significant difference in quality of care or outcomes between hospitals participating in P4P and hospitals not participating in P4P.
However, both groups showed a 5.2% absolute improvement in meeting process-based CRUSADE guidelines.
It was also noted that no adverse effects were observed due to P4P.

Significance to Literature:

3-year analysis of hospital based P4P showed no benefits or adverse effects directly attributable to P4P programs.

(110) Landon BE et al. Quality of Care for the Treatment of Acute Medical Conditions in US Hospitals. Archives of Internal Medicine. 2006: 166 2511-2517.

PMID: 17159018

Summary:

Analysis of CMS and JCAHO performance data from 4,000 for-profit and not-for-profit hospitals in 2004.
Authors found that for-profit hospitals had worse results, while not-for-profit hospitals had better results, mainly attributable to investments in nurse staffing and technology.
Overall, 75.9% of patients hospitalized for acute MI, CHF, and/or pneumonia received recommended care.

Significance to Literature:

Care is influenced by payment model.

Key Article

(115) Rosenthal MB, Landon BE, Howitt K, Song HSR, Epstein AM. Climbing Up The Pay-For-Performance Learning Curve: Where Are The Early Adopters Now? Health Affairs. 2007: 26(6) 1674-1682.

PMID: 17978386

Summary:

Authors tracked 27 P4P programs across the United States that began prior to 2003, 24 of which still remained in 2007.
They found P4P is still mainly in primary care, but slowly expanding to other specialties as measures grow.
P4P program sponsors cited measurement issues as the largest barrier to the inclusion of specialists.
Identifies risk adjustment as one of the next key issues.
Outcomes thus far have been positive, with possibly few negative side effects.
Also details a list of areas where improvement can be made.
Primary recommendation is the involvement of more physicians and hospitals.

Significance to Literature:

Tracks the progress of many prominent P4P programs over 3 years.

(119) Gilmore AS, et al. Patient Outcomes and Evidence-Based Medicine in a Preferred Provider Organization Setting: A Six-Year Evaluation of a Physician Pay-for-Performance Program. Health Services Research. 2007: 42(6) 2140-2159.

PMID: 17995557

Summary:

An observational study of a Blue Cross Blue Shield of Hawaii P4P program over a six-year period, comparing physicians who participated in the program with those who did not.
Article analyzes total patients, quality indicators, performance measurements, and possible drawbacks/limitations.
The odds ratio for patients receiving recommended care from program-participating providers, compared with non-participating providers, was 1.06-1.27.

Significance to Literature:

A positive case report of a P4P program over six years.

(120) Werner RM, Bradlow ET. Relationship Between Medicare’s Hospital Compare Performance Measures and Mortality Rates. JAMA. 2006: 296(22) 2694-2702.

PMID: 17164455

Summary:

3,657 hospitals were analyzed, comparing mortality rates for acute myocardial infarction, pneumonia, and heart failure between hospitals performing in the 25th percentile and those performing in the 75th percentile.
Outcomes were similar for these two groups.
Outcomes were risk-adjusted for comorbidities, age, race, ZIP-code level median income and education, sex, insurance status, and whether the admission was emergent or elective.
Although the results are statistically significant, the authors caution that they should not discourage current efforts to improve quality.
Authors urge development of performance measures more tightly linked to outcomes.

Significance to Literature:

Performance as reported in Hospital Compare is not strongly linked to better patient outcomes when risk adjusted for medical and social factors. Hospital performance measures should be redesigned to better correlate with outcomes.

(177) Schatz M. Does pay-for-performance influence the quality of care? Current Opinion in Allergy and Clinical Immunology. 2008: 8 213-221.

PMID: 18560295

Summary:

A review, by a Kaiser Permanente of California physician, of 7 randomized controlled studies and 15 nonrandomized controlled studies of P4P programs dating back as far as 1990.
14 out of 15 nonrandomized studies showed at least some positive results, while fewer than half of the randomized studies showed positive results.

Significance to Literature:

P4P can improve quality markers, but not always.

Key Article

(182) Pearson SD, Schneider EC, Kleinman KP, Coltin KL, Singer JA. The Impact of Pay-for-Performance on Health Care Quality in Massachusetts, 2001-2003. Health Affairs. 2008: 27(4) 1167-1176.

PMID: 18607052

Summary:

Study in Massachusetts of statewide quality measurements and reporting systems from 2001-2003 to evaluate the performance impact of P4P by five major commercial health plans.
Overall strategy was to compare change in HEDIS performance over three years between incentivized physicians and non-incentivized physicians.
Improvement trends were similar throughout many HEDIS measures amongst incentivized physicians and their comparison group. Substantial improvement of HEDIS measures was seen across the board.
However, one P4P contract with a single medical group was associated with superior improvement for diabetes care.
The size of incentive did not show a relationship to magnitude of improvement.

Significance to Literature:

P4P can be viewed as an integral part of recent positive changes in medical practice, but current studies lack the ability to show direct effectiveness of P4P.

Key Article

(188) Mandel KE, Kotagal UR. Pay for Performance Alone Cannot Drive Quality. Archives of Pediatrics & Adolescent Medicine. 2007: 161(7) 650-655.

PMID: 17606827

Summary:

Analysis of the impact of P4P within the asthma improvement collaborative amongst 44 pediatric practices in Cincinnati, OH.
Anthem Blue Cross Blue Shield gave practice-based rewards that were tiered by performance, with the potential to earn a maximum 7% fee increase.
- 43 practices qualified for bonuses
- 3 received a 2% increase
- 13 received a 4% increase
- 2 received a 5% increase
- 14 received a 5% increase
- 11 received a 7% increase
Overall, the percentage of the population receiving perfect care went from 4% to 88%.
Provides 5 key design principles P4P programs should embrace for sustainable success.

Significance to Literature:

“P4P, when coupled with robust approaches to quality improvement, can be a catalyst to accelerate sustainable transformation among providers.”

(197) Rosenthal MB, Frank RG, Li Z, Epstein AM. Early Experience With Pay-for-Performance. From Concept to Practice. JAMA. 2005: 294(14) 1788-1793.

PMID: 16219882

Summary:

Addresses paying physicians according to performance improvement vs. overall performance scores.
Evaluated two quality improvement programs from July 2003-April 2004: one in California used P4P, and the other served a similar demographic without incentive payments.
Of the three like processes measured, the California P4P program demonstrated greater quality improvement in cervical cancer screenings.
Improvement occurred in all three areas in both groups.
Physician groups whose performance was initially the lowest improved the most.

Significance to Literature:

Incentivizing physicians may yield little gain in quality for the total dollars spent.

(252) Edwards FH. How one medical specialty society’s use of measures and reporting dramatically improved patient care. The Journal of Family Practice. 2008: 57(10a) S6-S9.

Link: http://www.currentclinicalpractice.com/ccp_article.asp?a=1&ref=5710ACCP_...

Summary:

The Society of Thoracic Surgeons has kept a national database of performance statistics for 20 years with over 3 million patients, which has helped quality improvement initiatives for thoracic surgeons.
Risk adjustment has been a critical piece.
Physicians can sort through the data to find out where they rank amongst their colleagues.
“Database feedback lead to reduced CABG operative mortality from 1994-2003, in contrast to an expected increase due to heightened risks.”
The Society has urged Congress to fund similar projects in the future.

Significance to Literature:

P4P must accurately measure data over a long period of time. This Society has shown that longitudinal performance data can be used to save lives.

(258) Fonarow et al. Association Between Performance Measures and Clinical Outcomes for Patients Hospitalized with Heart Failure. JAMA. 2007: 297(1) 61-70.

PMID: 17200476

Summary:

Article assesses the link between the 5 performance measures in hospitalized heart failure patients and mortality risk.
54% of heart failure patients met all five performance measures/indicators upon leaving the hospital.
None of the five quality indicators was associated with decreased 60- and 90-day mortality rates. Only ACE inhibitors or ARBs were associated with improvements in the combined mortality/rehospitalization rates.
Authors also noted that beta blockers (not one of the performance measures) were associated with decreased mortality and rehospitalization in heart failure patients.

Significance to Literature:

Performance measures ought to be strongly associated with improved clinical outcomes prior to large-scale dissemination.

(269) Armour BS, et al. The Influence of Year-end Bonuses on Colorectal Cancer Screening. The American Journal of Managed Care. 2004: 10(9) 617-624.

PMID: 15515994

Summary:

Retrospective study using managed care plan claims from 2000 and 2001 which sought to examine explicit financial incentives to improve colorectal screenings in patients 50 years or older.
A $10,000 increase in income raises the probability of flexible sigmoidoscopy or colonoscopy screening approximately 2%.
Bonuses are more effective when targeted to individual physicians as opposed to a physician group.

Significance to Literature:

Cash bonuses to individual physicians can modestly increase screening of commercially insured patients.

(273) Robeznieks A. California Pay for Performance Program Achieving Results. Metro Doctors. September/October, 2005.

Summary:

Discusses the results of the Integrated Healthcare Association (IHA) P4P program in California, which used 14 clinical markers and paid a total of $40 million in bonuses.
P4P here was not designed to lower cost, but rather to improve quality.
IHA saw improvement statewide in all 14 clinical quality indicators.

Significance to Literature:

Early (published 2005) success story of one P4P program.

(312) Mehrotra A, Damberg CL, Sorbero ME, Teleki SS. Pay for Performance in the Hospital Setting: What is the State of the Evidence? American Journal of Medical Quality. 2009: 24(1) 19-28.

PMID: 19073941

Summary:

Article outlines 3 hospital-based P4P programs and their effects on quality improvement by analyzing 8 published studies.
Authors found that participating hospitals showed a 2- to 4-percentage-point greater improvement than control hospitals in the studies that included controls.
Authors question the need for monetary incentive if hospitals demonstrate that quality can be raised through public reporting mechanisms.

Significance to Literature:

P4P programs in hospitals are thus far limited, yet studies do show positive outcomes for existing programs, and few published unintended consequences.