Designing Measures

 

Designing measures is potentially the most important and most difficult aspect of P4P. An overview of the complexities inherent in designing measures is provided in our "Program Design and Implementation" section and by Rosenthal (98) and Landon (264). When designing measures, authors caution against using mortality as a quality measure (101, 236). Some authors support using medical knowledge and clinical judgment as measures of quality (114), while others emphasize the use of patient outcome surveys (148) and measures that are free of industry bias (150). Finally, the controversy surrounding the use of process measures versus outcome measures is addressed (140, 147).

Additionally, controversy surrounds the use of P4P measures in complex cases with multiple comorbidities. Boyd (4) argues that Clinical Practice Guidelines (CPGs) should not be used for older patients with multiple comorbid diseases. A recurrent theme in discussions of the design of P4P measures is that the measures must reinforce excellent medical care (5, 245) and cannot jeopardize the patient’s best interest. Werner (250) suggests that rewards could simply be offered for implementing quality improvement initiatives, and more recent research has focused on developing measures that more adequately represent the broad scope of providing high-quality care while reducing the administrative burden on health care workers (467).

 

Designing Measures Literature

 

(4) Boyd CM, et al. Clinical Practice Guidelines and Quality of Care for Older Patients With Multiple Comorbid Diseases. JAMA. 2005: 294(6) 716-724.

PMID: 16091574

Summary:

  • CPGs should not be used to assess P4P standards for elderly patients due to complex diseases and comorbidities.
  • Different quality of care standards should be used for older patients with complex comorbidities when designing P4P measures.

Significance to Literature:

P4P must question which guidelines it uses for assessment.

 

(5) O’Connor PJ. Adding Value to Evidence-Based Clinical Guidelines. JAMA. Editorial. 2005: 294(6) 741-743.

PMID: 16091578

Summary:

  • Editorial support of Boyd’s contribution in same journal. (4)
  • Addresses the limitations of CPGs, but argues they are still necessary.

Significance to Literature:

P4P measures must reinforce excellent medical care.

 

Key Article

(98) Rosenthal MB, Dudley RA. Pay-for-Performance: Will the Latest Payment Trend Improve Care? JAMA. Commentary. 2007: 297(7) 740-744.

PMID: 17312294

Summary:

  • The authors highlight five dimensions that must be considered when designing a P4P system, including:
    • Who is paid
    • What is paid for
    • How much should be paid
    • How to ensure high quality
    • Should we prioritize underserved populations?
  • Provides overview of the current P4P structure and contentions.

Significance to Literature:

Overview of the controversies in design elements of P4P programs.

 

(101) Holloway RG, Quill TE. Mortality as a Measure of Quality: Implications for Palliative and End-of-Life Care. JAMA. Commentary. 2007: 298(7) 802-804.

PMID: 17699014

Summary:

  • Mortality has been a measure of quality for many years.
  • Author cautions that the use of mortality as a quality measure has many drawbacks since death can be a patient choice.
  • “Mortality is a poor quality measure for the majority of patients with multiple chronic diseases who are near the end of their life, and may be engaged in preference-sensitive decisions that result in an earlier (or less delayed) death.”

Significance to Literature:

Caution for mortality as a quality measure in the context of end of life care.

 

(114) Holmboe ES, Lipner R, Greiner A. Assessing Quality of Care: Knowledge Matters. JAMA. Commentary. 2008: 299(3) 338-340.

PMID: 18212320

Summary:

  • Physicians’ medical knowledge is the starting point for quality care.
  • In order to assess quality of care, physicians must first make accurate diagnoses.
  • Physician quality reporting must be comprehensive and include medical knowledge as an essential component of the quality calculus.

Significance to Literature:

Comprehensive performance measurement must include measures of medical knowledge and clinical judgment.

 

(140) Measuring the Performance of Performance Standards. Archives of Internal Medicine. Editorial. 2008: 168(4) 347-348.

PMID: 18299486

Summary:

  • Article is a response to a finding suggesting that current pneumonia guidelines are not the safest treatments.
  • All performance measures will have potential negative consequences that must be addressed when identified.
  • Groups must also be willing to accept modifications when measures are shown to have adverse reactions.

Significance to Literature:

Performance measures must be based on up-to-date high quality evidence.

 

(147) Lilford RJ, Brown CA, Nicholl J. Use of process measures to monitor the quality of clinical practice. British Medical Journal. 2007: 335 648-650.

PMID: 17901516

Summary:

  • Authors believe outcome measurements are neither sensitive nor specific enough to be used as measures of quality.
  • Risk adjustment cannot sufficiently remove various problems of bias in physician rankings.
  • Lists four reasons why process based measurements should be used:
    • Reduction of case mix bias
    • Lack of stigma
    • Prompt wider action
    • Useful for delayed outcomes

Significance to Literature:

Process measures should be used to monitor quality.

 

(148) Delamothe T. What counts? British Medical Journal. Editor’s Choice. 2008: 336 (7638).

Link: http://www.bmj.com/cgi/content/full/336/7638/0

Summary:

  • Response to Nigel Hawkes “How do we get the measure of patient care?” article (149).
  • Process-based measures and patient reported outcomes are emerging as the way to measure quality in England.
  • More countries, including the United States, are studying the NHS to learn how to improve health care.

Significance to Literature:

The British are leading the case for process based measures and patient outcome surveys.

 

Key Article

(150) Rose J. Industry Influence in the Creation of Pay-for-Performance Quality Measures. Quality Management in Health Care. 2008: 17(1) 27-34.

PMID: 18204375

Summary:

  • Article examines which organizations influence the standards for P4P; article identifies the NCQA and the AMA-PCPI. Author says we must examine their motives, and potential for industry influence.
  • Article cites examples where drug companies have influenced clinical practice guidelines (CPGs) in the past, and how that can be problematic.
  • Author argues that experts who set the CPGs should not have ties to drug companies.
  • Mentions using Britain’s NICE program as a model for the government to establish a board that creates CPGs not influenced by industry.

Significance to Literature:

Successful P4P relies on choosing measures unbiased by industry influence.

 

(236) Satin DJ. A Conversation With David Satin M.D. Colleague Interview Metro Doctors. September/October 2008.

Link: http://www.ehcca.com/presentations/hcii1/1_06.pdf

Summary:

  • Family physician/ethicist answers questions about P4P.
  • Recommends against any metrics of ethical clinician behavior.
  • Recommends against public reporting of individual clinician performance rather than aggregate practice performance.
  • Defends the concept of risk-adjusted efficiency measures that truly measure cost/outcome.

Significance to Literature:

Clinician/ethicist examines controversial issues surrounding P4P in 2008.

 

Key Article

(239) Higashi T, et al. Relationship between Number of Medical Conditions and Quality of Care. NEJM. 2007: 356(24) 2496-2504.

PMID: 17568030

Summary:

  • Authors analyzed 7,500 patients from three separate studies on disease-specific technical quality of care, 956 of whom had three or more conditions.
  • “Linear regression analysis showed that each additional condition was associated with a 2.2% increase in the quality score,” in one study, and a 1.7% increase in each of the other two studies.
  • The relationship between increased quality scores and the number of conditions was less significant if the patient had more than seven conditions, or did not see a specialist.

Significance to Literature:

“This finding does not provide support for the argument that incentive programs based on quality indicators of care processes will necessarily penalize providers who provide care to patients with multiple conditions.”

 

(245) Shaneyfelt TM, Centor RM. Reassessment of Clinical Practice Guidelines: Go Gently Into That Good Night. JAMA. Editorial. 2009: 301(8) 868-869.

PMID: 19244197

Summary:

  • Author reports, “Most current ‘guidelines’ are actually expert consensus reports.” 48% of the time, these recommendations are based on Level C (expert opinion) evidence.
  • Guidelines are increasingly becoming biased due to indirect financial reimbursement to guideline authors.
  • Guideline development must become centralized, transparent, and flexible.

Significance to Literature:

P4P is heavily reliant on clinical practice guidelines, therefore the guidelines must be in the best interest of the patient.

 

(250) Werner RM, McNutt R. A New Strategy to Improve Quality: Rewarding Actions Rather Than Measures. JAMA. Commentaries. 2009: 301(13) 1375-1377.

PMID: 19336714

Summary:

  • Highlights limitations of the current approach to improving quality.
  • There is no agreement about the ultimate goal of improvement.
  • Current approaches assume generalized solutions to all clinicians.
  • Measures should focus on individualized improvement.
  • Local teams could identify problems and offer solutions, then implement solutions.

Significance to Literature:

P4P programs should offer incentives for quality improvement efforts rather than incentives based on comparison to other groups’ performance results.
 

(264) Landon BE, Normand SLT, Blumenthal D. et al. Physician Clinical Performance Assessment: Prospects and Barriers. JAMA. 2003: 290(9) 1183-1189.

PMID: 12953001

Summary:

  • Many important technical barriers prevent the use of physician clinical performance assessment for evaluating competency of individual physicians, including but not limited to:
    • Lack of Evidence-based measures for many specialties
    • Challenges in defining thresholds for acceptable care
    • Feasibility of standardized data collection process
    • Inadequate sample size
    • Patient confounders
    • Outcomes not attributable to the individual physician
    • Lack of representativeness
    • Feasibility and costs

Significance to Literature:

Author cites early problems with achieving ideal performance measures.

 

(346) Rajaram R, Barnard C, Bilimoria KY. Concerns About Using the Patient Safety Indicator-90 Composite in Pay-for-Performance Programs. JAMA. 2015;313(9):897-898. doi:10.1001/jama.2015.52.

PMID: 25654581

Summary:

  • In 2003 the Agency for Healthcare Research and Quality (AHRQ) released 20 patient safety indicators (PSI) to measure adverse events
  • CMS uses PSI-90, a composite score of eight weighted PSI measurements, as a core metric in pay-for-performance programs
  • The author identifies five problems with the current PSI-90 measure including (1) flawed component measures, (2) clinical areas targeted, (3) accuracy of adverse events, (4) adequacy of risk adjustment, and (5) formulation of the composite measure.
  • The author provides several ways composite measures such as PSI-90 may be improved

Significance to Literature:

Author identifies problems and solutions to CMS’ use of PSI-90 in pay-for-performance programs

 

(414) Antos JR. If We Pay for Value, Will We Get It? J Ambul Care Manage. 2016 Apr-Jun;39(2):108-10. doi: 10.1097/JAC.0000000000000146.

PMID: 26945289

Summary:

  • An overview on the lack of agreement on what value in healthcare is, how to produce greater value, and how to identify when value has been achieved
  • Author points out how policy leaders often ignore the viewpoint of patients on what quality and value in healthcare looks like
  • This will become increasingly important as health insurance shifts more financial responsibility to consumers
  • Author also cautions against “measurement for measurement’s sake” due to the burden placed on providers and the potential shift in attention away from non-incentivized areas which may impact patient outcomes more significantly
  • The Incentivizing Health Care Quality Outcomes Act of 2014 attempts to combat this as it would “require Medicare to use a uniform outcome-based quality measurements system”, intending to “change the emphasis from measuring performance to producing value”

Significance to Literature:

It is becoming an increasingly important challenge to effectively “mesh the differing perspectives of payers, providers, and consumers on what constitutes value in healthcare”

 

(415) Berenson RA. Improving Performance, Not Just What’s Measured: Does the Inpatient Prospective Payment System Provide Useful Lessons? J Ambul Care Manage. 2016 Apr-Jun;39(2):111-4. doi: 10.1097/JAC.0000000000000141.

PMID:  26945290

Summary:

  • Response to article by Averill et al. (450) in the same issue
  • Author shares the ideas expressed in the paper regarding:
    • The necessity of shifting the focus of payment policy from process to outcome measures, such as “potentially preventable events” (PPEs)
    • The emphasis of measurement at the organizational, not individual, levels
    • The success of the Inpatient Prospective Payment System (IPPS)
  • However, the author disagrees with Averill and colleagues’ notion that IPPS actually focuses on outcomes (rather, it is a bundled payment that could incentivize good outcomes) and argues that Averill too easily glides over the difficulties associated with using PPEs to compare hospitals

Significance to Literature:

Author argues “that the lessons of the successful Inpatient Prospective Payment System [Averill and colleagues] detail do not support their rationale for endorsing even a better version of pay-for-performance.”

 

(416) Nerenz DR. Challenges in Moving to “Pay for Outcomes.” J Ambul Care Manage. 2016 Apr-Jun;39(2):122-4. doi: 10.1097/JAC.0000000000000128.

PMID: 26945294

Summary:

  • Response to article by Averill et al. (450) in the same issue
  • Author provides questions that need to be answered regarding the simulation and payment approach presented, which include:
    • What about the costs of improvement?
    • Are the quality improvements real, or just coding changes?
    • Who or what is the accountable entity?
    • What about risk adjustment?

Significance to Literature:

The potential benefits of financial incentive programs tied to outcome measures are obvious, “but so are difficulties and challenges.”

 

(450) Averill RF et al. Rethinking Medicare Payment Adjustments for Quality. J Ambul Care Manage. 2016 Apr-Jun;39(2):98-107. doi: 10.1097/JAC.0000000000000137

PMID: 26945288

Summary:

  • Payment reforms have largely been focused on following process measures which has resulted in a highly complex attempt to measure value
    • IOM has observed that “thousands of measures are in use today to assess health and health care” but “their sheer number, as well as their lack of focus, consistency, and organization, limits their overall effectiveness in improving performance of the health system”
  • Authors indicate lessons from the Inpatient Prospective Payment System (IPPS) and its use of Diagnosis Related Groups (DRGs) should be used to guide quality measurement. These lessons include: focus on outcomes, set national standards, be clinically meaningful, create the right incentives
  • The Incentivizing Health Care Quality Outcomes Act of 2014 looks to replace Medicare’s disjointed quality measurement system focused on process measures with a coordinated outcome-based system for all health care delivery organizations
    • The Outcomes Act focuses on five potentially preventable events (PPEs) which represent the majority of preventable expenditures including: complications, admissions, readmissions, emergency department visits, and outpatient procedures and diagnostic tests
  • Many state Medicaid outcome-based payment reforms that are consistent with the Outcomes Act have yielded real results
    • e.g. In the first three years of a potentially preventable readmission project by the Minnesota Hospital Association, readmissions have been reduced by 19%
  • Simulation was performed to estimate savings for reducing PPEs by different amounts
    • e.g. Lowering PPEs by 30% over five years resulted in an estimated 0.88% reduction in Medicare payments ($5.1 billion) and a 1.38% reduction in hospital operating cost ($7.9 billion)

Significance to Literature:

The Incentivizing Health Care Quality Outcomes Act of 2014, and similar state programs, hope to improve quality by focusing on outcomes (measured through bundled payments and potentially preventable events) instead of process measures.

 

(463) Gottlieb LM, Francis DE, Beck AF. Uses and Misuses of Patient- and Neighborhood-level Social Determinants of Health Data. The Permanente Journal. 2018

PMID:  30227912

Summary:

Four case studies illustrate the strengths and limitations of using patient or neighborhood-level socio-economic data in health care

  • One case collected patient social determinants data by adding new responsibilities to the health care team and undertaking interventions at the individual level
  • A second case relied on existing patient-level EHR data to shape a neighborhood-level intervention
  • A third case used alerts from the EHR to identify a child hospitalized from a high-risk neighborhood and subsequently developed individualized interventions.
  • A fourth case used neighborhood-level data on food security, safety, etc. to produce neighborhood-level interventions

Significance to Literature:

Studying the various ways socio-economic data can be measured and used can help healthcare organizations, clinicians, and policymakers better tailor research and interventions to the needs of patient populations.

 

(464) MacLean CH, Kerr EA, Qaseem A. Time Out – Charting a Path for Improving Performance Measurement. The New England Journal of Medicine. 2018

PMID:  29668361

Summary:

  • Analysis of the validity of 86 ambulatory general medicine-related quality measures on the Medicare Merit-based Incentive Payment System (MIPS)/Quality Payment Program (QPP) list
  • Using 5 criteria (Importance, Appropriateness, Clinical Evidence, Specifications, and Feasibility) developed by the American College of Physicians, authors assessed the validity of performance measures and found:
    • 32/86 (37%) passed all 5 criteria
    • 30/86 (35%) did not pass all 5 criteria
    • 24/86 (28%) uncertain as to whether they pass all 5 criteria
  • The most common criteria failed were insufficient evidence and inadequately specified exclusions that would result in a process or outcome occurring across broad groups of patients (including ones that might not benefit)
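As a quick arithmetic check on the breakdown above, the shares can be recomputed from the reported counts. This is only an illustrative sketch: the counts (32, 30, and 24 of 86 measures) come from the summary, while the dictionary labels and the `share` helper are hypothetical names introduced here, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
# Counts of MIPS/QPP measures in each category, as reported in the summary
# above. The labels are shorthand chosen for this sketch.
counts = {
    "passed all 5 criteria": 32,
    "did not pass all 5 criteria": 30,
    "uncertain": 24,
}

total = sum(counts.values())  # should equal the 86 measures reviewed


def share(n: int, total: int) -> int:
    """Return n as a percentage of total, rounded to the nearest whole percent."""
    return round(100 * n / total)


# Hypothetical helper output: percentage share for each category.
shares = {label: share(n, total) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{total} = {pct}%")
```

Under these assumptions the three categories work out to roughly 37%, 35%, and 28% of the 86 measures, which together account for the full list.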

Significance to Literature:

Inconsistencies exist regarding the judgments of MIPS measures’ validity. Authors call for a “time-out” to reassess performance measurements.

 

Key Article

(467) Etz et al. A New Comprehensive Measure of High-Value Aspects of Primary Care. Annals of Family Medicine. 2019

PMID:  31085526

Summary:

  • Utilized crowd-sourced survey responses of patients, primary care clinicians, and payors to develop a parsimonious set of items that identifies which processes constitute “high-value primary care”
  • The resulting Person-Centered Primary Care Measure (PCPCM) contains 11 domains each represented by a single item, with domains including, but not limited to: accessibility, community context, coordination, health promotion, and goal-oriented care
  • Early validity testing found the PCPCM to be reliable, comprehensive, and not overly onerous

Significance to Literature:

The PCPCM may more adequately reflect the wide array of factors associated with quality primary care while simultaneously reducing the burden of reporting and data collection on practitioners and staff members.