For more than two decades, strong evidence has indicated variation in the quality of cancer care in the United States.1–19 As a result, measurement and audit are necessary to identify gaps in the quality of care. Toward this end, multiple professional organizations have developed condition-specific quality measures (QMs) to assess clinical performance surrounding the patient-provider encounter.

Quantification of performance can identify variation and opportunities for improvement. When performance assessment is followed by performance comparison among peers (i.e., benchmarking) and coupled with transparency among providers, physicians in the lower tiers of performance can be motivated to improve, ultimately yielding better overall care at the population level; this phenomenon recently has been reviewed and demonstrated by several programs.20–26

This report describes how the American Society of Breast Surgeons (ASBrS) ranked and defined measures of quality of care and subsequently provided benchmarking functionality for its members to compare their performance with that of their peers. The actual performance of the ASBrS membership in complying with nine breast surgeon-specific QMs is reported in separate investigations.

Founded in 1995, the ASBrS is a young organization. Yet, within 20 years, it has grown to more than 3000 members from more than 50 countries. A decade ago, the Mastery of Breast Surgery Program (referred to as “Mastery” in this report) was created as a patient registry to collect quality measurement data for its members.27

Past President Eric Whitacre, who actually programmed Mastery’s original electronic patient registry with his son Thomas, understood that “quality measures, in their mature form, did not merely serve as a yardstick of performance, but were a mechanism to help improve quality.”28,29 Armed with this understanding, the ASBrS integrated benchmarking functionality into Mastery, thus aligning the organization with the contemporary principles of optimizing cancer care quality as described by policy stakeholders.2,19,25,30

In 2010, Mastery was accepted by the Centers for Medicare and Medicaid Services (CMS) for the Physician Quality Reporting System (PQRS) and then, in 2014, as a Qualified Clinical Data Registry (QCDR), linking provider performance to government reimbursement and public reporting.31 Surgeons who successfully participated in Mastery in 2016 will avoid the 2018 CMS “payment adjustment” (a 2% penalty), a further step toward incentivizing performance improvement in tangible ways.

Methods

Institutional Review Board

De-identified QM data were obtained with permission from the ASBrS for the years 2011–2015. The Institutional Review Board (IRB) of the Gundersen Health System determined that the study was not human subjects research, and the need for IRB approval was waived.

Choosing, Defining, and Vetting QM

From 2009 to 2016, the Patient Safety and Quality Committee (PSQC) of the ASBrS solicited QM domains from its members and reviewed those of other professional organizations.32–39 As a result, as early as 2010, a list of more than 100 domains of quality had been collected, covering all categories of the Donabedian trilogy (structure, process, and outcomes) and the National Quality Strategy (safety, effectiveness, efficiency, population health, care communication/coordination, patient-centered experience).40,41 By 2013, a list of 144 measures had undergone three rounds of modified Delphi ranking by eight members of the PSQC, using the RAND/UCLA Appropriateness Method; this replicated an American College of Surgeons effort to rank melanoma measures and was consistent with the National Quality Forum’s guide to QM development42,43 (Tables 1, 2). During the ranking, quality domains were assigned a score of 1 (not valid) to 9 (valid), with a score of 5 denoting uncertain/equivocal validity. After each round of ranking, the results were discussed within the PSQC by email and phone conferences, and arguments were presented for and against each QM and its rank. A QM was deemed valid if 90% of its rankings were in the range of 7 to 9.
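As an illustration only (not the PSQC’s actual tooling), the following minimal sketch applies the validity rule described above to one hypothetical candidate measure; the function and variable names are assumptions introduced here for clarity.

```python
from statistics import median

def rate_candidate_measure(panel_scores):
    """Apply the validity rule described above to one candidate QM.

    panel_scores: ratings from the eight PSQC panelists, each 1 (not valid)
    to 9 (valid), with 5 denoting uncertain/equivocal validity.
    Returns the median score and whether >= 90% of ratings fall in 7-9.
    """
    in_valid_range = sum(1 for s in panel_scores if 7 <= s <= 9)
    share_in_range = in_valid_range / len(panel_scores)
    return median(panel_scores), share_in_range >= 0.90

# Hypothetical scores from eight panelists for one candidate domain
scores = [9, 8, 9, 7, 9, 8, 7, 9]
med, valid = rate_candidate_measure(scores)
print(f"median = {med}, deemed valid = {valid}")  # median = 8.5, deemed valid = True
```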

Table 1 Instructions of the American Society of Breast Surgeons for ranking of quality measure domains
Table 2 Hierarchy of quality domains for breast surgeons after the 3rd round of modified Delphi ranking

After three rounds of ranking ending in December 2013, nine of the highest-ranked measures were “specified” as described and required by CMS44 (Table 3). Briefly, exclusions to QM reporting were never included in the performance numerator or denominator. Exceptions were episodes in which performance for a given QM was not met but a justifiable reason existed; such encounters, like exclusions, were not included in the surgeon’s performance rate. If an encounter met performance criteria despite also meeting exception criteria, it was included in the performance rate. Per CMS rules, each QM was linked to a National Quality Strategy Aim and Domain (Table 3). The QMs also were assigned to a Donabedian category and to one or more of the Institute for Healthcare Improvement’s “triple aims.”40,45
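The numerator/denominator logic above can be summarized in a brief sketch; the field names below are hypothetical, and this is not the CMS measure specification itself.

```python
def performance_rate(encounters):
    """Compute a surgeon's performance rate for one QM.

    Each encounter is a dict of booleans:
      excluded        - meets an exclusion criterion; never counted
      performance_met - the measured action was performed
      exception       - justifiable reason performance was not met
    Exclusions are dropped. An exception is dropped only when performance
    was not met; if performance was met anyway, the encounter is counted.
    """
    numerator = denominator = 0
    for e in encounters:
        if e["excluded"]:
            continue
        if not e["performance_met"] and e["exception"]:
            continue  # handled like an exclusion
        denominator += 1
        numerator += int(e["performance_met"])
    return numerator / denominator if denominator else None

# Hypothetical encounters
cases = [
    {"excluded": False, "performance_met": True,  "exception": False},  # counted, met
    {"excluded": False, "performance_met": False, "exception": True},   # dropped
    {"excluded": True,  "performance_met": False, "exception": False},  # dropped
    {"excluded": False, "performance_met": False, "exception": False},  # counted, not met
]
print(performance_rate(cases))  # 0.5
```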

Table 3 American Society of Breast Surgeons Quality Measure Specifications for participation in the Centers for Medicare and Medicaid Services Qualified Clinical Data Registry55,56

Each of our QMs underwent vetting in our electronic patient registry (Mastery) by a workgroup before submission to CMS. During this surveillance, a QM was modified, retired, or advanced to the QCDR program based on member input and ASBrS Executive Committee decisions.

Patient Encounters

To calculate the total number of provider-patient-measure encounters captured in Mastery, we summed the reports for each individual QM across all study years and all providers who entered data.
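As a minimal sketch of this tally, assuming a hypothetical nested table of report counts keyed by provider, year, and QM:

```python
# Hypothetical report counts: reports[provider][year][qm] = number of reports
reports = {
    "surgeon_A": {2011: {"needle_biopsy": 40, "sentinel_node": 35}},
    "surgeon_B": {2012: {"needle_biopsy": 55}},
}

total_encounters = sum(
    count
    for years in reports.values()
    for qm_counts in years.values()
    for count in qm_counts.values()
)
print(total_encounters)  # 130
```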

Benchmarking

Each surgeon who entered data into Mastery was able to compare his or her up-to-date performance with the aggregate performance of all other participating surgeons (Fig. 1). The surgeons were not able to access the performance metrics of any other named surgeon or facility.
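A hedged sketch of the peer comparison in Fig. 1 (not the Mastery implementation) is shown below; it contrasts one surgeon’s rate for a single QM with the aggregate rate of all other participants, using hypothetical identifiers and counts.

```python
def benchmark(surgeon_id, met_counts, eligible_counts):
    """Compare one surgeon's rate for a QM with the aggregate rate of peers.

    met_counts / eligible_counts: dicts mapping surgeon id to the number of
    encounters with performance met and the number of eligible encounters.
    """
    own_rate = met_counts[surgeon_id] / eligible_counts[surgeon_id]
    peer_met = sum(v for k, v in met_counts.items() if k != surgeon_id)
    peer_eligible = sum(v for k, v in eligible_counts.items() if k != surgeon_id)
    peer_rate = peer_met / peer_eligible
    return own_rate, peer_rate

# Hypothetical counts for one QM
met = {"surgeon_A": 45, "surgeon_B": 88, "surgeon_C": 60}
eligible = {"surgeon_A": 50, "surgeon_B": 100, "surgeon_C": 75}
own, peers = benchmark("surgeon_A", met, eligible)
print(f"own = {own:.0%}, peers = {peers:.0%}")  # own = 90%, peers = 85%
```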

Fig. 1 Example of real-time peer performance comparison after surgeon entry of quality measures

Data Validation

In compliance with CMS rules, a data validation strategy was performed annually. A blinded random selection of at least 3% of QCDR surgeon participants was conducted. The selected surgeons were asked to send the ASBrS electronic and/or paper records to verify that their office/hospital records supported the performance “met” and “not met” categories they had previously reported to the ASBrS via the Mastery registry.
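The random selection can be sketched as follows, assuming a simple roster of participant identifiers; the function name and the rounding of the 3% floor are illustrative assumptions.

```python
import math
import random

def select_for_audit(participant_ids, fraction=0.03, seed=None):
    """Randomly select at least `fraction` of QCDR participants for review."""
    sample_size = max(1, math.ceil(fraction * len(participant_ids)))
    rng = random.Random(seed)  # seeded only to make the example reproducible
    return rng.sample(participant_ids, sample_size)

# Hypothetical roster of QCDR-participating surgeons
roster = [f"surgeon_{i:03d}" for i in range(200)]
print(select_for_audit(roster, seed=2016))  # 6 surgeons (3% of 200, rounded up)
```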

Results

Hierarchical Order and CMS QCDR Choices

The median ranking scores for 144 potential QMs ranged from 2 to 9 (Table 2). The nine QMs chosen and their ranking scores were appropriate use of preoperative needle biopsy (9.0), sentinel node surgery (9.0), specimen imaging (9.0), specimen orientation (9.0), hereditary assessment (7.0), mastectomy reoperation rate (7.0), preoperative antibiotics (7.0), antibiotic duration (7.0), and surgical-site infection (SSI) (6.0). The specifications for these QMs are presented in Table 3. The mastectomy reoperation rate and SSI are outcome measures, whereas the remainder are process of care measures.

QM Encounters Captured

A total of 1,286,011 unique provider-patient-measure encounters were captured in Mastery during 2011–2015 for the nine QCDR QMs. Performance metrics and trends for each QM are reported separately.

Data Validation

The rate of inaccurate QM reporting by surgeons participating in the 2016 QCDR data validation study of the 2015 Mastery data files was 0.82% (27 errors in 3285 audited patient-measure encounters). Discordance between surgeon QM reporting and patient clinical data was subsequently reconciled through communication between the ASBrS and the reporting provider.

CMS Acceptance and Public Transparency

The Centers for Medicare and Medicaid Services accepted the ASBrS QMs submitted for PQRS participation in 2010–2013 and for the QCDR in 2014–2016. In 2016, CMS discontinued the specimen orientation measure for future reporting and recommended further review of the mastectomy reoperation rate measure. Public reporting of 2015 individual-surgeon QCDR data was posted in 2016 on the ASBrS website.

Security

To our knowledge, no breaches have occurred in which a surgeon-user of Mastery identified the performance of another surgeon or the identity of another surgeon’s patients. In addition, no breaches by external sources have occurred within the site or during transmission of data to CMS.

Discussion

Modified Delphi Ranking of QM

To provide relevant QMs for our members, the PSQC of the ASBrS completed a hierarchical ranking of more than 100 candidate measures and narrowed the collection to fewer than a dozen using accepted methods.42,43 Although not reported here, the same process was used annually from 2014 to 2017 to identify new candidate QMs for future quality payment programs and to develop measures for the Choosing Wisely campaign.46 Based on our experience, we recommend this approach for others wanting to prioritize long lists of potential QM domains into shorter ones. The lists are iterative, allowing potential measures to be added at any time, such as after the publication of clinical trials or the development of new evidence-based guidelines. In addition, with the modified Delphi ranking process, decisions are made by groups, not individuals.

After Ranking, What Next?

Of the nine QMs selected for submission to CMS, only four had the highest possible ranking score. The reasons for not selecting some highly ranked domains of care included but were not limited to the following concerns. Some QMs were already being used by other organizations or were best assessed at the institutional, not the surgeon, level, such as the use of radiation after mastectomy for node-positive patients.32–36 Other highly ranked measures, such as “adequate history,” were not selected because they were considered standard of care.

Contralateral prophylactic mastectomy rates, a contemporary topic of much interest, were not included in our original ranking, and breast-conserving therapy (BCT) was not ranked high because of our concern that both were more a reflection of patient preferences and of regional and cultural norms than of surgeon quality. A lumpectomy reoperation QM was ranked high (7.5) but was not chosen because of disagreement within the ASBrS about whether to brand it a quality measure.47,48 In some cases, QMs with lower scores were selected for specific reasons. For example, by CMS rules, two QMs for a QCDR must be “outcome” measures, but all our highest-ranked measures were “process of care” measures.

There was occasional overlap between our QMs and those of other organizations.21,32–39 In these cases, we aimed to harmonize with, not compete against, existing measures. For example, a patient with an unplanned reoperation after mastectomy would be classified similarly in both the National Surgical Quality Improvement Program (NSQIP) and our program. In contrast to NSQIP, however, we classified a patient with postoperative cellulitis as having an SSI. Because excluding cellulitis as an SSI event has been estimated to reduce reported breast SSI rates threefold, adoption of the NSQIP definition would underestimate the SSI burden in breast patients and could limit improvement initiatives.49

Governance

Ranking and specifying QMs is arduous. Consensus is possible; unanimous agreement is rare. Therefore, a governance structure is necessary to reconcile differences of opinion. In our society, the PSQC solicits, ranks, and specifies QMs. A workgroup vets them for clarity and workability. In doing so, the workgroup may recommend changes. The ASBrS Executive Committee reconciles disputes and makes final decisions.

Reporting Volume

Our measurement program was successful, capturing more than 1 million provider-patient-measure encounters. On the other hand, our member participation rate was less than 20%. By member survey (not reported here), the most common reason for not participating was “burden of reporting.”

Benchmarking

“Benchmarking” is a term used most often as a synonym for peer comparison, and many programs purport to provide it.25 In actuality, benchmarking is a method for improving quality and one of nine levers endorsed by the National Quality Strategy to upgrade performance.21,23,30,50 Believing in this concept, the ASBrS and many other professional societies built patient registries that provided benchmarking.21,25,32–35 In contradistinction, the term “benchmark” refers to a point of reference for comparison. Thus, a performance benchmark can have many different meanings, ranging from a minimal quality threshold to a standard for superlative performance.24,36

Program Strengths

Our patient registry was designed to collect specialty-specific QMs as an alternative to adopting existing general surgical and cross-cutting measures. Cross-cutting measures, such as those that audit medication reconciliation or care coordination, are important but do not advance specialty-specific practice. Furthermore, breast-specific measures lessen potential bias in the comparison of providers who have variable proportions of their practice devoted to the breast. Because alimentary tract, vascular, and trauma operations tend to have higher morbidity and mortality event rates than breast operations, general surgeons performing many non-breast operations are not penalized in our program for a case mix that includes these higher-risk patients. In other words, nonspecialized general surgeons who want to demonstrate their expertise in breast surgery can do so by peer comparison with surgeons who have similar case types in our program. In addition, a condition-specific program with public transparency allows patients to make more informed choices regarding their destination for care. In 2016, individual provider report-carding for our participating surgeons began on the “Physician Compare” website.51

Another strength of an organ-specific registry is that it affords an opportunity for quick Plan-Do-Study-Act (PDSA) cycles because personal and aggregate performance are updated continuously. Thus, action plans can be driven by subspecialty-specific data rather than limited to expert opinion or claims data. For example, a national consensus conference was convened, in part, because an interrogation of our registry identified wide variability in ASBrS member surgeon reoperation rates after lumpectomy.52,53 Other program strengths are listed in Table 4.

Table 4 Strengths and limitations of the American Society of Breast Surgeons quality measurement program

Study Limitations

Although risk-adjusted peer comparisons are planned, we do not yet provide them. In addition, only the surgeons who participate with CMS through our QCDR sign an “attestation” statement that they will enter “consecutive patients,” and no current method is available for cross-checking the Mastery case log against facility case logs for completeness. Recognizing that nonconsecutive case entry (by non-QCDR surgeons) could falsely elevate surgeon performance rates, one investigation of Mastery compared the performance of a single quality indicator between QCDR- and non-QCDR-participating surgeons.52 Performance did not differ, but this analysis has not been performed for any of the QMs described in this report. Surgeons also can elect to opt out of reporting QMs at any time. The percentage of surgeons who do so because they perceive their performance to be comparatively poor is unknown. If significant, this self-selected removal from the aggregate data would confound overall performance assessment, falsely elevating it.

Another limitation is that our QMs were developed by surgeons with minimal patient input and no payer input. As a result, we cannot rule out that these other stakeholders perceive the quality of care delivered differently than we do. For example, patients might rank timeliness of care higher than we did, and payers might rank reoperations highest, given their association with the cost of care. We may not even be measuring some domains of care that are most important to patients, because we did not uniformly query their values and preferences upfront during program development, as recommended by others.2,54 See Table 4 for other limitations.

Conclusion

In summary, the ASBrS built a patient registry to audit condition-specific measures of breast surgical quality and subsequently provided peer comparison at the individual provider level, hoping to improve national performance. In 2016, we provided public transparency of the 2015 performance reported by our surgeon participants.55,56 In doing so, we have become stewards, not bystanders, accepting the responsibility to improve patient care. We successfully captured more than 1 million patient-measure encounters, participated in CMS programs designed to link reimbursement to performance, and provided our surgeons with a method for satisfying American Board of Surgery Maintenance of Certification requirements. As public and private payers of care introduce new incentivized reimbursement programs, we are well prepared to participate with our “tested” breast-specific QMs.