Introduction

The Transplant Amendment Act of 1990 requires the Scientific Registry of Transplant Recipients (SRTR) to report program-specific transplanted organ survival rates to the public. These program-specific reports (PSRs), first reported in 1992 [1], are currently posted every 6 months on the SRTR website as a result of the 2000 Health and Human Services Final Rule, which stipulates biannual public reporting of statistics and analyses of transplant center outcomes. Data used by the SRTR to evaluate center outcomes are obtained from the Organ Procurement and Transplantation Network (OPTN), the Centers for Medicare and Medicaid Services (CMS), and the Social Security Administration Death Master File [reviewed in 2•]. Expected transplant center 1-year and 3-year graft and patient survival for each type of organ transplanted is calculated using risk-adjusted regression models [3]. Further details of the methodology used by the SRTR can be found on its website [4]. PSR data are supplied to the OPTN Membership and Professional Standards Committee (MPSC) and used to identify transplant programs that may be underperforming and in need of improvement. The SRTR also supplies PSRs to the University of Michigan Kidney Epidemiology and Cost Center (UM-KECC), which, under contract, submits these data to CMS for use in its own quality evaluation process to determine ongoing transplant program participation in Medicare and Medicaid. Private payers also use SRTR PSRs to evaluate transplant centers for inclusion in their networks of transplant service providers.

The OPTN MPSC determines which transplant programs are flagged as requiring greater scrutiny for possible underperformance, using separate criteria for small centers (<10 transplants in a 2.5-year patient cohort) and large centers (all other centers). The criteria for large centers are: (1) observed minus expected number of patient or graft losses > 3; (2) ratio of observed to expected losses > 1.5; and (3) a two-sided P value < 0.05 for the difference between the observed and expected number of losses. The criterion for small centers is at least one event in the 2.5-year cycle and another event in the next cycle. CMS also uses the large center criteria to determine center Medicare and Medicaid participation, except that the significance of the difference between the observed and expected number of losses is determined using a one-sided P value [5].
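The flagging rules described above can be written as a short sketch. It assumes that all three large center conditions must hold at once, and it takes the P value as an input because the test used to compute it is not described here. The names `CenterOutcome`, `is_flagged_large_center`, and `is_flagged_small_center` are hypothetical and are not part of any SRTR, OPTN, or CMS software.

```python
from dataclasses import dataclass


@dataclass
class CenterOutcome:
    """Hypothetical container for one center's cohort results."""
    observed: float  # observed patient or graft losses in the cohort
    expected: float  # risk-adjusted expected losses from the regression model
    p_value: float   # supplied by the caller: two-sided (MPSC) or one-sided (CMS)


def is_flagged_large_center(c: CenterOutcome, alpha: float = 0.05) -> bool:
    """Apply the three large center thresholds quoted in the text.

    Assumption: a center is flagged only when all three conditions hold.
    """
    if c.expected <= 0:
        raise ValueError("expected losses must be positive")
    excess_losses = c.observed - c.expected > 3
    high_ratio = c.observed / c.expected > 1.5
    significant = c.p_value < alpha
    return excess_losses and high_ratio and significant


def is_flagged_small_center(events_this_cycle: int, events_next_cycle: int) -> bool:
    """Small center rule: at least one event in a cycle and another in the next."""
    return events_this_cycle >= 1 and events_next_cycle >= 1
```

Under these assumptions, a center with 10 observed and 5 expected losses and P = 0.03 meets all three thresholds, whereas a center with 10 observed and 8 expected losses does not, because its observed to expected ratio is only 1.25.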

Transplant centers experience significant consequences when flagged as potentially underperforming. Flagging triggers a program review by the United Network for Organ Sharing (UNOS) and/or CMS, possibly resulting in corrective action by UNOS or de-certification by CMS. Patient volume can decrease because of program exclusion from CMS or private payer provider networks, and referrals may be lost if candidates seek alternative destinations for their transplant services after reviewing the center PSR on the SRTR website. Over time, as increasing numbers of transplant centers have been flagged as potentially underperforming, the SRTR PSR itself has come under greater scrutiny.

Intended and Unintended Consequences of PSRs

Improvement in Post-Transplant Patient and Graft Survival

The SRTR PSRs provide data that serve a number of important functions, all aimed at improving transplant outcomes. The OPTN MPSC relies on these data to focus its efforts on identifying centers that are underperforming. Identification of these centers is the first step towards effecting performance improvement and, in turn, improved outcomes. A center's inability to improve could theoretically result in the loss of UNOS membership, which would further improve national outcomes by avoiding organ allocation to a poorer performing center. Similarly, CMS can leverage participation in Medicare and Medicaid to either remediate or discontinue transplant programs it identifies as underperforming, also potentially improving national survival outcomes. Transplant candidates can access PSR data on the SRTR public website and may seek their transplant services from a center with superior survival outcomes. However, Howard et al. studied a US cohort of nearly 60,000 kidney recipients (1999–2002) over five successive reporting periods and concluded that public availability of PSRs did not correlate with patient center selection, although a difference was noted specifically for young (18–40 years old) and college-educated candidates [6]. Private payers use SRTR PSR data to determine transplant center participation in their network of service providers. A significant correlation has been reported between poorer 1-year graft survival rates and a decline in candidate registrations (predominantly deceased donor kidney candidates), especially among those with private insurance [7]. Taken together, candidate and private payer activities could also improve national survival outcomes by collectively diverting candidates to better performing centers. However, recent investigations suggest that the availability and use of SRTR-generated PSRs have had the unintended consequence of reducing transplant organ supply as a byproduct of efforts to improve outcomes.
Additionally, actions taken by UNOS, CMS, private payers, or transplant candidates will lead to true improvement in national outcomes only if the PSRs accurately identify underperforming programs that require remediation.

Unintended Consequences of Center Underperformance Flagging

The original purpose of the PSRs was to identify transplant programs for further scrutiny to determine whether outcomes should be improved. Such a purpose is best served by an instrument that identifies all underperforming centers (more sensitive) at the expense of including many centers that are performing adequately (less specific). However, there may be a detrimental impact on practice patterns at centers that are flagged, regardless of the final determination of performance status. There is grave concern that low performance evaluations lead providers to try to improve their program outcomes by avoiding high-risk candidates whose risk they perceive as inadequately adjusted for by the Cox models used to determine the expected survival reported in PSRs. Schold and colleagues reported the results of a survey of attendees at the Transplant Management Forum at the 2009 UNOS meeting, and found that low or near-low PSR performance outcomes significantly correlated with changes in clinical practice toward more restrictive recipient and donor selection criteria [8•]. A recent study examined the relationship between center kidney transplant volumes and low PSR performance evaluations [9••]. Centers with low performance evaluations were more likely to have reduced transplant volumes, with a relatively larger decline in the use of marginal donor kidneys and in recipients with private insurance, compared to other centers. The number of kidney and liver transplants performed by all centers in the USA increased steadily between 1998 and 2006. However, since 2006 the rate has stagnated [10]. Many have speculated that this stagnation is a direct result of changes in provider clinical practice. It is well documented that kidney transplantation is associated with improved patient survival compared to remaining on dialysis for all high-risk patient populations, even those considered to be of high cardiovascular risk [11, reviewed in 12•].
Thus, use of PSRs to drive improvement in survival outcomes may have come at the unintended expense of reducing organ supply and restricting access to kidney transplantation for higher risk candidates, even though transplantation offers them a significant survival advantage over dialysis.

Limitations of Cox Proportional Hazard Model Risk Adjustment

Accurate risk adjustment when calculating center-specific patient and graft survival for the PSRs is critically important to avoid inappropriately penalizing centers that perform transplants on higher risk patients. Inadequate risk adjustment will result in unnecessary flagging of adequately performing centers as underperforming, with the resultant adverse consequences to the center described above. It has been previously pointed out that the c statistic, or the ability of a model to discriminate between patients or grafts that survived and those that did not, is quite low for PSR survival outcomes models [13•]: between 0.66 and 0.68, where 0.5 indicates no predictive value (a coin toss) and 1.0 indicates perfect predictive ability. By comparison, values between 0.83 and 0.87 have been reported for other diseases [14], suggesting there are important risk factors not taken into account in the SRTR PSRs. Previous studies have examined the impact on survival outcomes of including additional recipient comorbidities not taken into account in the SRTR PSRs [15•, 16, 17, 18••, 19]. Jassal et al. found that the presence of comorbid conditions was associated with poorer patient survival in a population of Canadian transplant recipients [15•]. An analysis of single-center outcomes by Wu et al. found that increasing comorbidity correlated with an increased risk of death [16]. However, in both studies the Cox regression models excluded many covariates included in the SRTR-generated models. Machnicki et al. examined the addition of three different indices of pre-transplant recipient comorbidity to Cox regression models of graft and patient survival, and concluded that the resulting increase in predictive value was not of practical importance [17]. This study was limited to primary, deceased donor kidney recipients with Medicare as primary payer, with 9-year outcomes as the end point.
A somewhat similar analysis of both deceased donor and living donor kidney recipients by Weinhandl and colleagues reported quite different results [18••]. This study also included only recipients with Medicare as primary payer. However, their models examined a single comorbidity index and, importantly, included covariates parameterized to mimic the SRTR survival analyses reported for the January 1, 2005 through June 30, 2007 patient cohort. This study went a step further and analyzed the impact that comorbidity adjustment would have on the identification of centers as underperforming, based on CMS criteria. They found that comorbidity is an important predictor of graft failure. Cox model fit also improved when adjusting for comorbidity, and comorbidity adjustment resulted in a fluctuation of 8–9 % in the number of underperforming centers identified. We studied the impact of adjustment for cardiovascular comorbidity indices on Cox proportional hazard models of graft survival in a single-center patient cohort [19]. We found improvements of 10–13 % in the c statistic for the 1-year comorbidity-adjusted survival models, indicating improved model fit. Use of the adjusted models would have changed the underperforming status of living donor kidney recipient 1-year and 3-year graft survival when compared to the SRTR baseline models. Schold et al. examined SRTR data (1995–2005) and correlated individual center kidney candidate mortality rates with recipient outcomes. They found a significant association between higher candidate mortality rates, reduced risk-adjusted patient and graft survival, and the likelihood of a low performance evaluation using CMS criteria [20••]. They concluded that factors unrelated to patient care, and not included in the PSR risk-adjusted survival analyses and resultant low performance determinations, influence survival outcomes.
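To make the c statistic discussed above concrete, the following is a minimal sketch for a binary outcome (graft lost versus graft surviving at a fixed time point). It is a simplification: the function `c_statistic` is hypothetical, it ignores censoring, and it is not the procedure used by the SRTR or by any of the cited studies, which work with time-to-event Cox models.

```python
def c_statistic(risk_scores, events):
    """Fraction of (event, non-event) pairs ranked correctly by predicted risk.

    risk_scores: model-predicted risk for each patient (higher = worse).
    events: 1 if the patient or graft was lost, 0 if it survived.
    Ties in predicted risk count as half a concordant pair.
    """
    if len(risk_scores) != len(events):
        raise ValueError("risk_scores and events must have the same length")
    event_risks = [r for r, e in zip(risk_scores, events) if e]
    survivor_risks = [r for r, e in zip(risk_scores, events) if not e]
    pairs = len(event_risks) * len(survivor_risks)
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for r_event in event_risks:
        for r_surv in survivor_risks:
            if r_event > r_surv:
                concordant += 1.0
            elif r_event == r_surv:
                concordant += 0.5
    return concordant / pairs
```

With this definition, a model that assigns every patient the same risk scores 0.5, and a model that ranks every loss above every survivor scores 1.0, matching the interpretation of the two extremes given in the text.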

The methodology used by UNOS, the MPSC, and CMS to identify underperforming centers has also been under investigation. It has been pointed out that, under the statistical models employed, the probability of a center being flagged as underperforming by random chance alone can reach 1 in 20. Because there are over 200 kidney transplant centers, a not insignificant number of centers will be flagged by chance alone. Massie et al. studied rates of flagging due to statistical artifact using simulated transplant centers composed of actual primary deceased donor kidney recipients transplanted in the USA between 2004 and 2010 [21••]. In general, the simulations found that 9–10 % of well performing virtual centers were falsely flagged as poor performing, and less than half of poor performing virtual centers were flagged in a given reporting period. They also found that large centers were at greatest risk for false flagging.
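The arithmetic behind the chance-flagging concern can be sketched as follows. The calculation assumes exactly 200 centers, a per-center false flag probability of exactly 1 in 20, and independence between centers; all three are simplifying assumptions, and the function `chance_flagging` is hypothetical.

```python
def chance_flagging(n_centers: int, false_flag_rate: float):
    """Return (expected number of centers flagged by chance,
    probability that at least one center is flagged by chance)."""
    if not 0.0 <= false_flag_rate <= 1.0:
        raise ValueError("false_flag_rate must be a probability")
    expected_flags = n_centers * false_flag_rate
    p_at_least_one = 1.0 - (1.0 - false_flag_rate) ** n_centers
    return expected_flags, p_at_least_one


# Assumed inputs: 200 centers, 1-in-20 chance of a false flag per center.
expected_flags, p_any = chance_flagging(200, 0.05)
print(f"expected chance flags: {expected_flags:.1f}")
print(f"P(at least one chance flag): {p_any:.5f}")
```

Under these assumptions roughly 10 adequately performing centers would be expected to be flagged in a reporting period, and it is almost certain that at least one would be.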

As the limitations of Cox modeling have become a source of concern, other methodologies for assessing transplant center performance have recently been evaluated. Neuberger and colleagues reviewed the potential advantages and disadvantages of cumulative sum (CUSUM) charts, funnel plots, and cross-validation, in addition to the regression methods used in PSR analyses [22•], without endorsing any particular method. Axelrod et al. compared CUSUM chart and standard regression methods applied to SRTR data for liver and kidney recipients of deceased donor organs transplanted between 2004 and 2007 [23•]. They found that the CUSUM method identified largely the same low performance centers as the regression method, but identified them sooner for both liver and kidney programs. The baseline risk-adjusted expected graft failures used for the CUSUM analyses were obtained from SRTR PSR models. At present this method seems better suited for internal use by centers for real-time monitoring of their transplant outcomes. Bayesian statistical approaches for assessing transplant performance are currently under investigation by the SRTR [24]. These approaches have the advantage of providing both an estimate of center performance and a degree of certainty as to the accuracy of that estimate. Use of such models in medical care evaluations has been reviewed by Christiansen et al. [25].
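The idea of a CUSUM chart can be illustrated with a generic one-sided observed minus expected sketch. This is a textbook form and is not claimed to be the chart construction used by Axelrod et al.; the function `oe_cusum`, the `threshold`, and the `allowance` are hypothetical and would need to be calibrated for real use. The expected values stand in for per-transplant failure probabilities taken from a risk-adjusted model.

```python
def oe_cusum(observed, expected, threshold, allowance=0.0):
    """One-sided CUSUM of observed minus expected graft failures.

    observed: 1 if the transplant failed, 0 otherwise, in chronological order.
    expected: model-based failure probability for each transplant.
    threshold: chart value above which a signal is raised (assumed).
    allowance: slack subtracted at each step (assumed, default 0).
    Returns (chart values, index of first signal or None).
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chart = []
    first_signal = None
    running = 0.0
    for i, (o, e) in enumerate(zip(observed, expected)):
        # Accumulate excess failures; the chart never drops below zero.
        running = max(0.0, running + (o - e) - allowance)
        chart.append(running)
        if first_signal is None and running > threshold:
            first_signal = i
    return chart, first_signal


# Hypothetical series: three failures in five transplants, each with
# an expected failure probability of 0.1, and a signal threshold of 2.
chart, first_signal = oe_cusum([0, 1, 0, 1, 1], [0.1] * 5, threshold=2.0)
```

Because the chart is updated after every transplant, a run of excess failures can raise a signal as soon as it accumulates, which is consistent with the earlier detection described above; the price is that the threshold and allowance must be chosen to control false signals.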

Unintended Impact of PSRs on Transplant Innovation

It has been reported that up to 10 % of kidney, liver, and heart transplant programs would be identified as underperforming by CMS on the basis of the January 1, 2005 to June 30, 2007 SRTR PSRs [26••]. The authors postulate that, in addition to the previously mentioned comorbid conditions not risk adjusted in the Cox models used for PSR calculations, some centers use innovative approaches that are also not accounted for in risk adjustment. One example is incompatible transplantation requiring desensitization (highly sensitized or ABO incompatible recipients), which is likely to have poorer survival outcomes than compatible transplantation, although superior to survival on dialysis. These centers may be penalized for their innovative approach through an increased risk of being flagged as underperforming. Segev and colleagues reported a multicenter study comparing incompatible transplants with same-center compatible transplants, and found the risk of patient death, as well as death-censored graft loss, to be 1.6-fold to 2.4-fold higher for sensitized recipients [27]. It seems quite likely that transplant centers will choose to avoid innovative approaches in the future out of concern that these may adversely affect their performance evaluations. Abecassis et al. suggest excluding recipients in institutionally approved experimental protocols from SRTR outcomes analyses as one approach to allow innovation to continue to thrive [26••].

Conclusions

There remains a critical shortage of organs available for transplantation. The number of candidates on the UNOS kidney waiting list continues to increase. Many approaches have been undertaken to maximize the number of available organs to meet the high demand, including initiatives by the OPTN to maximize organ procurement and utilization. Maximizing survival outcomes after transplantation, in addition to being a laudable goal in itself, would increase organ availability by reducing the use of organs for re-transplantation after graft loss, making these organs available to other candidates. However, current data suggest that turning the spotlight on PSR-driven center performance evaluations has had the unintended consequence of changing transplant provider behavior, resulting in a decrease in kidney and liver transplant volumes. Improved risk adjustment of existing Cox regression models, exclusion of recipients in experimental protocols, and/or use of a different methodology for calculating the expected center outcomes reported in PSRs may alleviate this conundrum. Ultimately, it may not be feasible to impose regulation that maximizes both organ utilization and transplant outcomes.