Defining minimum volume thresholds to increase quality of care: a new patient-oriented approach using mixed integer programming

A positive relationship between treatment volume and outcome quality has been demonstrated in the literature and is thus evident for a variety of procedures. Consequently, policy makers have tried to translate this so-called volume–outcome relationship into minimum volume regulation (MVR) to increase the quality of care—yet with limited success. Until today, the effect of strict MVR application remains unclear as outcome quality gains cannot be estimated adequately and restrictions to application such as patient travel time and utilization of remaining hospital capacity are not considered sufficiently. Accordingly, when defining MVR, its effectiveness cannot be assessed. Thus, we developed a mixed integer programming model to define minimum volume thresholds balancing utility in terms of outcome quality gain and feasibility in terms of restricted patient travel time and utilization of hospital capacity. We applied our model to the German hospital sector and to four surgical procedures. Results showed that effective MVR needs a minimum volume threshold of 125 treatments for cholecystectomy, of 45 and 25 treatments for colon and rectum resection, respectively, of 32 treatments for radical prostatectomy and of 60 treatments for total knee arthroplasty. Depending on procedure type and incidence as well as the procedure’s complication rate, outcome quality gain ranged between 287 (radical prostatectomy) and 977 (colon resection) avoidable complications (11.7% and 11.9% of all complications). Ultimately, policy makers can use our model to leverage MVR’s intended benefit: concentrating treatment delivery to improve the quality of care.


Introduction
The positive relationship between treatment volume and outcome quality as well as its theory have been analyzed by researchers for four decades [1,2]. Empirical studies and systematic reviews discuss limitations such as the type of analyzed quality indicators (QI), methods for risk adjustment, varying data sources and patient data samples, and limited geographical scope, yet their results confirm the volume-outcome relationship across various procedures [3][4][5][6][7][8].
Thus, the research focus has gradually shifted to finding the underlying reasons for the volume-outcome relationship [3,9-12]. Moreover, studies investigate how the volume-outcome relationship can be translated into effective minimum volume regulation (MVR), in which minimum volume thresholds (MVT) are set normatively as a precondition for hospitals to perform procedures and to claim reimbursement.

Approaches and findings of existing studies can be summarized as follows.
• Simulation of singular effects: Strict MVR application affects several dimensions relevant for patient care. The most commonly discussed dimensions are patient travel time, hospital capacity, the level of centralization expressed as the share of affected hospitals and patients, and the change in system level outcome quality [13][14][15][16][17][18][19][20][21][22][23]. So far, studies have investigated the effect of strict MVR application on only one and in some cases up to three of the above dimensions. In contrast, we designed a model that simulates the simultaneous impact of strict MVR application on all of the above dimensions.
• Assumptions for patient choice: Once a low-volume hospital is excluded from supply due to strict MVR application, patients of this hospital must find a new hospital, i.e., they must be 'reassigned'. The assumption made for how reassigned patients choose a new hospital dictates the simulated change in patient travel time, hospital capacity utilization and outcome quality. It is usually assumed that reassigned patients choose a new hospital based on proximity [14,16,17]. In contrast, we propose a model that allows patients to make a hospital choice based on maximizing outcome quality, i.e., the expected utility of receiving treatment, as well as travel time.
• Applied methods and potential pitfalls: Generally, different statistical approaches such as the value of acceptable risk limit or risk gradient can be used to find suitable MVTs [20,24,25]. Nimptsch and Mansky [20], for instance, define the MVT as the number of procedures needed to achieve a complication risk below the observed average complication rate. Potential pitfalls of this and other statistical approaches are that the defined MVT remains somewhat arbitrary and that its definition is often one dimensional. For instance, it is very difficult to argue why an MVT should be set to increase system level outcome quality to a level X (e.g., the observed average) and not Y. In addition, these approaches optimize only one dimension, outcome quality, and they are not tested for practical applicability. It remains unclear whether patients might be willing to choose other hospitals and potentially travel longer to receive treatment. It further remains unclear whether the remaining hospital treatment capacity after MVR application is sufficient to satisfy all treatment needs.
• Application practice: MVR has been introduced for a range of specialized (surgical) procedures in a number of countries [26,27]. Research investigating MVR application has shown, however, that existing MVR has not yet been applied strictly, which is at least partially due to exceptions claimable by hospitals for non-compliance [26,28-32]. Policy makers might have to weaken strict MVR application due to the uncertainty of its effects [33].
With our model, we aim to clarify the effects of strict MVR application. To this end, we simulate the feasibility and estimate the utility of strict MVR application by answering the following questions.
1. Can MVR be strictly applied while retaining patient disutility in the form of patient travel time at an acceptable level?
2. After strict MVR application, is the remaining hospital capacity sufficient to provide treatment for reassigned patients?
3. If strict MVR application is feasible for a given procedure, how high is the potential system level outcome quality gain?
Our model unveils interrelations between the level of centralization, patient travel time, hospital capacity and outcome quality that have not been analyzed simultaneously before. However, understanding these interrelations is crucial to design MVR that are effective in practice. While we apply our model to the German hospital sector, its design provides a basis for similar simulations in other countries and can easily be adjusted to a different regional or political context.

Methods
To clarify our model's data requirements, we first outline the mixed integer programming model. We then describe the data input per data level (hospital vs. patient level) along with the data sources used.

Model
In essence, any MVR is characterized by the definition of the regulated procedure via procedure and/or diagnosis codes, the definition of treatment types, the set MVT, the application level and exceptions claimable by hospitals [26,27]. A description of MVR characterization and the definition of the MVR used in our model can be found in the "Appendix" (Table 2).
In our model, we simulate strict MVR application for two complex procedures (colon-/rectum resection [CRR], radical prostatectomy [RPE]), for one procedure with medium complexity (total knee arthroplasty, [TKA]), and for one procedure with relatively low complexity (cholecystectomy [CHE]). For these procedures, MVR has been issued in Germany or in other European countries [26,27] and/or a positive volume-outcome relationship has been observed in the literature (e.g., for CHE [34]). Besides, we use QIs suitable for the medical context of the respective procedure including parts of the post-surgery, outpatient treatment phase (see Table 3 in the "Appendix").
The mixed integer programming model works as a single objective linear optimization model subject to four constraints and is run separately for each procedure. The model's objective is to maximize outcome quality on system level, i.e., across all reassigned patients and hospitals. Outcome quality is measured by risk-adjusted QIs for each procedure and is therefore maximized by minimizing QI complication ratios (see Table 4 in the "Appendix" for model notation). The term C describes the system level complication ratio and Q_h describes the outcome quality of hospital h, defined for all h ∈ H, where A_{p_r h} = 1 means that reassigned patient p_r is treated at hospital h (A_{p_r h} = 0 otherwise). The following model assumptions regarding patient choice are thus inherent to the optimization operation.
• Quality transparency: Reassigned patients know and understand the outcome quality of hospitals still supplying treatments. In addition, quality information is provided transparently, e.g., by outpatient physicians, by online public reporting platforms or by other sources.
• Rational decision makers: Reassigned patients maximize expected utility from receiving treatment, i.e., they will choose the hospital with the highest outcome quality.
Without constraints, these assumptions would lead to an unintended, infeasible situation: all reassigned patients would choose the hospital with the highest outcome quality. Thus, to ensure feasibility of strict MVR application, we define constraints for both patient travel time and hospital capacity. Moreover, we define constraints for the exclusion from supply of hospitals not meeting the MVT and for the corresponding reassignment of affected patients. For a description of the software used, see the last paragraph of the data matching and cleaning section of the "Appendix".
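The interplay of objective and feasibility constraints can be illustrated on a toy instance. The following is a minimal sketch, not the model used in this study: it replaces the mixed integer programming solver with an exhaustive search, and the function name, the parameter `late_before`, and all hospital and patient data are hypothetical. It assumes that at most `late_before + t_max × (number of reassigned patients)` patients may travel longer than `t`, and that each remaining hospital may gain at most `v_max` times its initial case volume.

```python
from itertools import product

def best_assignment(quality, travel, volume, t, t_max, v_max, late_before=0):
    """Exhaustively search patient-to-hospital assignments on a toy instance.

    quality[h]   : complication ratio Q_h of remaining hospital h (lower is better)
    travel[p][h] : travel time in minutes of reassigned patient p to hospital h
    volume[h]    : initial case volume of hospital h
    """
    hospitals = list(quality)
    patients = list(travel)
    best, best_cost = None, float("inf")
    for combo in product(hospitals, repeat=len(patients)):
        # Travel time constraint (assumed form): limit patients travelling longer than t.
        late = sum(1 for p, h in zip(patients, combo) if travel[p][h] > t)
        if late > late_before + t_max * len(patients):
            continue
        # Capacity constraint (assumed form): gained cases per hospital <= v_max * initial volume.
        if any(combo.count(h) > v_max * volume[h] for h in hospitals):
            continue
        # Objective: sum of complication ratios over all reassigned patients.
        cost = sum(quality[h] for h in combo)
        if cost < best_cost:
            best, best_cost = dict(zip(patients, combo)), cost
    return best, best_cost

# Hypothetical data: hospital A has the better quality but little capacity.
quality = {"A": 0.8, "B": 1.0}
volume = {"A": 2, "B": 10}
travel = {
    "p1": {"A": 20, "B": 30},
    "p2": {"A": 50, "B": 20},
    "p3": {"A": 30, "B": 40},
}
assignment, cost = best_assignment(quality, travel, volume, t=45, t_max=0.0, v_max=1.0)
# p1 and p3 are treated at A; p2 is treated at B because A exceeds the travel time threshold.
```

Exhaustive search only works for a handful of patients; it is used here because it makes the constraints explicit without depending on a solver library.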

Data
For the objective function, data from the quality assurance with routine data program from the largest German health insurance fund (Allgemeine Ortskrankenkasse, AOK) from 2015 were used (a recent German study provides a description of this data source [36]). If no outcome quality data were available for a hospital, the average quality of the respective hospital case volume quintile was assigned to that hospital (see Tables 5 and 6 in the "Appendix"). In addition, the model constraints use the following data input (see Fig. 1).

Exclusion from supply
Case volume data from the publicly available structured quality reports of the external inpatient quality assurance program from 2015 and 2014 were used (a recent German study provides a description of this data source [37]). To obtain patient zip codes for each case, the non-AOK cases were equally assigned to AOK patients. AOK patients, therefore, received a weight greater than 1 as they represented multiple cases. If there were no patient data available for a hospital, patients were assumed to live in the hospital's zip code area. This approximation was necessary for roughly 9% of total case volume, ranging from 5% (CRR) to 13% (RPE). See Table 5 in the "Appendix" for case volume shares of all procedures.

4. Hospital capacity: Case volume per hospital was sourced from the structured quality reports.
Data from 2015 were used for all constraints (and additionally from 2014 for the exclusion from supply constraint as indicated above). For a detailed description of the data-matching process and model computation, see the "Appendix".
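The weighting of AOK patients described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: it assumes the weighting is done per hospital, and the function name and the zip codes are hypothetical.

```python
def case_weights(total_cases, aok_patient_zips, hospital_zip):
    """Return (zip_code, weight) pairs that together represent all cases of one hospital.

    Assumption: all cases of a hospital are spread evenly over its observed AOK patients,
    so each AOK patient stands for total_cases / number_of_AOK_patients cases.
    """
    if not aok_patient_zips:
        # No patient data available: all cases are placed in the hospital's zip code area.
        return [(hospital_zip, float(total_cases))]
    weight = total_cases / len(aok_patient_zips)
    return [(zip_code, weight) for zip_code in aok_patient_zips]

# Hypothetical hospital with 12 cases, three of which are observed AOK patients.
weights = case_weights(12, ["10115", "10117", "10119"], hospital_zip="10115")
# Hypothetical hospital without any patient level data.
fallback = case_weights(7, [], hospital_zip="80331")
```

Each observed patient here carries a weight of 4, which matches the statement that AOK patients represent more than one case.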

Results
Fig. 1 Overview of data input, levels, sources, and matching variables. Annotations: (1) If a hospital in A could not be found in B, the average quality of the respective hospital case volume quintile was assigned to that hospital. (2) Each AOK-patient represents >1 case volume. (3) If for a patient in C no matching hospital could be found in A, the patient was excluded from the sample. Moreover, if no patient could be found in C for a hospital in A, all patients from that hospital in A were assumed to live in the same zip code area as the hospital's location
In essence, our model evaluates the tradeoff between patient disutility in terms of travel time, hospital capacity, and potential system level outcome quality gain to define effective MVTs. To unveil this tradeoff, two sets of six calculations each were conducted per procedure. Each set follows the same feasibility criteria t, t_max and v_max and only the value for the MVT S is changed. This way, the marginal potential system level outcome quality gain of each MVT increase can be compared within one calculation set. Moreover, the relationship of different feasibility criteria can be derived by comparing the two calculation sets. Besides, four sets had to be calculated for RPE as the MVR for calculation RPE I_5 proved infeasible. Setting of all parameter values was derived from existing MVR in selected European countries [27], other simulations found in the literature [18] and/or loosely based on requirements from German legislation [38]. Table 1 summarizes used parameter values and the main model output per procedure and calculation and reads as follows: in the first two columns, values set for the MVT (S) and the travel time threshold (t) are shown. The third and fourth columns indicate the set and observed values for the additional share of reassigned patients that are allowed to travel longer than the travel time threshold (t_max) and the maximum relative case volume gain per hospital (v_max).
For calculation RPE I_2, for instance, an additional 10% of reassigned patients were allowed to travel longer than 45 min. The observed share of reassigned patients was also equal to 10%. Similarly, hospitals were allowed to gain 100% of their initial case volume at the most and three hospitals actually did gain 100% of their initial case volume. Ultimately, the observed values for t_max and v_max answer research questions 1 and 2, i.e., indicate MVR feasibility. Accordingly, if observed values for these parameters are indicated with (−), the given MVR is infeasible.
If an MVR proved to be feasible, the resulting level of centralization in the form of the number and share of excluded hospitals and the number and share of reassigned patients are given in the fifth and sixth columns. Regarding RPE I_2, 155 hospitals did not meet the MVT S and were thus excluded from supply, representing 37.3% of the RPE hospital sample. Consequently, 1392 patients (6.1% of total) had to be reassigned.
Moreover, the resulting potential system level outcome quality gain is given first as an absolute and relative delta of the QI ratio and second as the number of avoidable complications in the last two columns. The number of avoidable complications was calculated by applying the relative outcome quality gain to the observed number of complications (see Table 3 in the "Appendix"). With respect to RPE I_2, the complication ratio was 0.08 points or 7.1% lower after strict MVR application, representing 174 avoidable complications, i.e., reoperations in the case of RPE. In essence, these last two columns answer research question 3, i.e., determine the potential system level outcome quality gain of feasible MVR. For a detailed overview of all model outputs, see Table 7 in the "Appendix".
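The conversion from a relative outcome quality gain to a number of avoidable complications is a single multiplication, sketched below. The observed number of complications is listed in Table 3 of the "Appendix" and is not reproduced in this section, so the value of 2450 used here is a hypothetical placeholder chosen only for illustration; the function name is likewise an assumption.

```python
def avoidable_complications(relative_gain, observed_complications):
    """Apply the relative outcome quality gain to the observed number of complications."""
    return round(relative_gain * observed_complications)

# Relative gain of 7.1% as reported for RPE I_2; the observed count is a placeholder.
example = avoidable_complications(0.071, 2450)
```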
To be effective, an MVT needs to balance feasibility and patient disutility with potential system level outcome quality gains. To this end, Fig. 2 visualizes relationships between marginal system level outcome quality gains (y-axis) and set MVTs (x-axis). For all procedures except RPE, the first three calculations I_1, I_2 and I_3 yield high relative marginal outcome quality gains between 99% (CR I_3) and 364% (TKA I_1). For the fourth and fifth calculations, the slope starts to flatten. Moreover, disutility reaches a high level and realizing feasibility is more difficult (see Table 1 and Table 7 in the "Appendix"). Accordingly, the average absolute case volume gain per hospital becomes rather large and the additional number of minutes reassigned patients have to travel increases strongly (see Table 7 in the "Appendix"). Thus, in the case of CR, the MVT for I_5 and I_6 was increased less strongly than for previous calculations. Overall, MVT increases for calculations I_5 and I_6 are linked to higher patient disutility, more difficult realization of feasibility, and smaller marginal utility increases. Consequently, it can be deduced that the most effective MVR for all procedures except RPE requires an MVT close to that of I_5. The lighter gray line in Fig. 2 depicts the second calculation set (I_7-I_12), for which at least one feasibility constraint per procedure was set more strictly. For all procedures, the potential system level outcome quality gain is only slightly lower for the second calculation set, especially for the first four calculations. This is another indication that the MVT should not be increased further than that of I_5. Besides, as the second calculation set yields significant system level outcome quality gains with stricter feasibility criteria, it can be deduced that strict MVR application can create high patient utility even when feasibility is more constrained.
Table 1 annotations: All observed absolute numbers are rounded to full values. All observed percentages are rounded to one-digit decimals. I_n means iteration/calculation number n, S denotes the investigated MVT, t denotes the travel time threshold, t_max means the additional share of reassigned patients allowed to travel longer than t, v_max means allowed relative case volume gain, QI means quality indicator.
With respect to RPE, the MVR tested in I_5 proved infeasible (see Table 1). Therefore, feasibility criteria had to be relaxed for I_6-I_8, increasing patient disutility and making feasibility more difficult (see Table 7 in the "Appendix"). Counterintuitively, the potential system level outcome quality gain for I_8 was lower than for previous calculations of the same calculation set. The same occurred for I_12. This negative relationship starting after an MVT of S = 40 is likely to be due to the distribution of QI values across our hospital sample (see Table 6 in the "Appendix") and the limited availability of outcome quality data (see "Discussion"). Overall, an MVT close to 32 treatments (I_4) seems to be most effective for RPE as patient disutility is higher and the realization of feasibility is more difficult for higher MVTs.

Discussion
The results of our model show that strict MVR application can effectively increase quality of care while controlling for patient travel time increases and the availability of hospital capacity. The flexibility of the model allows for the integration of different quality measures, the application to different procedures as well as the consideration of different geographical, political and infrastructural circumstances. Our model can, therefore, be used in diverse settings and different countries supporting health policy makers to define effective MVTs based on empirical evidence.
To put our results in perspective to prior research, we discuss assumptions and results regarding patient travel time, hospital capacity, and outcome quality. Lastly, limitations are discussed and concluding remarks are given.

Patient travel time
A common assumption is that reassigned patients base their hospital choice on proximity rather than outcome quality. Two previous studies on strict MVR application [16,17], for instance, compare patient travel time changes by estimating the average travel time of both all and only reassigned patients had they chosen the closest hospital to their home, respectively, the next closest hospital providing care.
In contrast, we assume that reassigned patients do not maximize their utility solely by minimizing incurred patient travel time. In our model, patients balance utility increases due to higher quality treatment and disutility increases due to longer travel times. Patient travel time is considered as a constraint: most patients (equal to P_rt × (1 − t_max)) take the disutility of hospital choice into account, yet only if it exceeds a certain level (t), while some patients do not value travel time at all (equal to P_rt × t_max). This design of the expected utility function of patients was chosen as it is in line with empirical evidence concerning patients' preferences [23,39-44]. Besides, the authors of previous studies also acknowledge that their data show that a considerable share of patients initially do not choose the closest hospital [16,17]. We found this fact confirmed in our patient level data (6-16% of reassigned patients initially travel longer than t, see Table 7 in the "Appendix").
Regarding results from patient travel time simulations, one study also investigates the procedure total knee arthroplasty, among others [17]. Not considering the above-discussed difference of the authors' travel time approach compared to our approach, patient travel time increases were moderate after strict MVR application. Accessibility was thus not restricted by the investigated MVT equal to 50. Both of these findings are in line with our results.

Hospital capacity
The use of case volume might seem unjustified and the level of allowed case volume gain unrealistic. Supporting arguments for our approach are as follows.
• Relationship between case volume and hospital beds: Using the average length of stay, hospital beds required for additional case volume can be calculated. For RPE I_2, for instance, average case volume gain per hospital amounts to 13 cases (see Table 7 in the "Appendix"). Assuming an average length of stay for a patient undergoing RPE, e.g., to treat prostate carcinoma, of 9.14 days and 80% bed utilization, hospitals on average need an additional bed capacity of 0.42 beds per year to treat 13 additional RPE cases. Presuming an average hospital ward size of 35 beds, this amounts to an additional relative capacity need of 1.2%.
• Hospital specialization: It can be expected that to gain cases of complex procedures and to strengthen specialization, hospitals will forgo treatment of less complex procedures. After strict MVR application and patient reassignment, these hospitals might, therefore, clear capacity to treat more cases of the regulated, complex procedure.
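The bed capacity argument above can be retraced with a short calculation. The sketch below assumes that required beds equal additional bed days divided by the available bed days per bed at the given utilization; the function name and the 365-day year are assumptions. With the rounded value of 13 cases it yields roughly 0.41 beds, close to the 0.42 beds stated above; the small difference presumably stems from rounding of the average case volume gain.

```python
def additional_beds(extra_cases, length_of_stay_days, utilization, days_per_year=365):
    """Beds needed per year for extra cases, given average length of stay and bed utilization."""
    bed_days_needed = extra_cases * length_of_stay_days
    bed_days_per_bed = days_per_year * utilization
    return bed_days_needed / bed_days_per_bed

beds = additional_beds(extra_cases=13, length_of_stay_days=9.14, utilization=0.80)
relative_need = beds / 35  # share of an assumed 35-bed ward
```

Expressed as a percentage and rounded to one decimal, `relative_need` gives the 1.2% additional relative capacity need mentioned in the text.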

Outcome quality
The number of avoidable complications per procedure might seem rather high. The interpretation of the simulation results should be clear, however: due to the assumptions for patient choice, the number of avoidable complications represents the maximum number of complications that can be avoided by strict MVR application, hence the term potential system level outcome quality gain. In this context, our results show that strict MVR application with the MVTs of calculation I_5 can potentially reduce complications by between 11.9% (colon resection) and 20.1% (total knee arthroplasty). Moreover, feasibility in terms of patient travel time and hospital capacity was given for 58 out of 60 calculations. Therefore, it can be concluded that strict MVR application is feasible and can yield high outcome quality gains. In comparison, the authors of a previous study derived system level outcome quality changes in terms of population impact numbers, i.e., the number of patients for which one complication could have been avoided [20]. For the observed period from 2009 to 2014, MVTs for the procedures colorectal resection for carcinoma and colorectal resection for diverticulosis were set to 82 and 44, respectively. Strict application of these MVTs would yield one avoidable death in 197 and one in 364 patients, respectively. Linking these population impact numbers with the total case volumes of 331,000 and 179,000, 280 lives per year could have been saved for colorectal resection for carcinoma and 82 lives per year for colorectal resection for diverticulosis.
These two procedures together resemble more or less the procedure CRR investigated in this study. However, outcome quality changes for only one MVT were estimated, six data years were investigated jointly and a different QI (inpatient mortality) not considering inadvertent events after hospital discharge was used. For these reasons, the affected patient samples differ between the authors' study and our study. Moreover, the authors did not test MVR feasibility but implicitly assumed that there would be no conflicts in terms of patient travel time and hospital capacity. Thus, the question of feasibility and (political) practicability remains and, not surprisingly, the used MVTs are rather high compared to the MVTs used in our study.
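The link between population impact numbers and lives saved per year in the comparison above can be checked with simple arithmetic. The sketch below assumes that the total case volumes refer to the whole six-year period from 2009 to 2014; the function name is hypothetical.

```python
def lives_saved_per_year(total_cases, population_impact_number, years):
    """Avoidable deaths per year: one avoidable death per `population_impact_number` patients."""
    return total_cases / population_impact_number / years

carcinoma = lives_saved_per_year(331_000, 197, years=6)
diverticulosis = lives_saved_per_year(179_000, 364, years=6)
```

Rounded to whole numbers, these values reproduce the 280 and 82 lives per year cited from the previous study.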

Limitations
Our approach is limited by data availability and model assumptions. With respect to patient travel time, patient level data were available for merely a subset of cases. Thus, to be able to calculate patient travel time for all cases, patients were reassigned in groups. In addition, for some reassigned patients per calculation, no observed patient zip codes were available and these patients' hospital zip codes had to be used. As no complete set of patient level data is available to research in Germany, this approach had to be chosen. In addition, as centroids of zip codes were used and as the share of patients that zip codes were available for per procedure was comparatively high (see Table 4 in the "Appendix"), this approach should deliver acceptable results. Still, we might underestimate the actual average travel time per reassigned patient before strict MVR application (see Table 7 in the "Appendix"), as a certain number of reassigned patients is assumed to live very close to the treating hospital. Regarding our patient travel time constraint, this is a rather prudent assumption, however, as the constraint compares the number of reassigned patients traveling longer than t before and after strict MVR application and all patients that are assumed to live in the same zip code area as their respective hospital's location naturally do not travel longer than t.
Regarding outcome quality, the quality rating per hospital and procedure should ideally be based on several QI values and multiple data years. This way, outcome quality could be captured holistically and statistical chance would be reduced. Due to the limited data available for this study, outcome quality is based on merely one QI per procedure reported in 1 year, however.
Concerning model assumptions, a high degree of quality transparency was assumed which indubitably is an ideal state. Still, studies suggest that the degree of quality transparency is increasing [36,37,43]. In Germany, this is at least partially due to various quality transparency initiatives such as the external inpatient quality assurance program, the so-called White List (Weisse Liste) online platform and the quality assurance with routine data program brought forth by the AOK, to name just a few. In addition, it can be argued that outpatient physicians treating patients prior to their inpatient treatment recommend hospitals with higher outcome quality [45], adding to the degree of quality transparency. Besides, the quality transparency assumption was made deliberately as model results are to show the maximum outcome quality gain that can possibly be attained.
Apart from outcome quality, other parameters relevant for patients' hospital choice such as patient travel time, structural quality indicators, general hospital characteristics (e.g., number of beds), waiting time, other outcome quality indicators (e.g., patient reported), service quality, certifications, etc. could be added to the objective function. Gutacker et al. [46] and Kuklinski et al. [47] evaluate the influence of such parameters on patients' hospital choice. Their work could serve as a basis for developing an objective function apt to simulate patients' decision processes more realistically. This way, the change in outcome quality for a certain MVT could be assessed more precisely.
Moreover, immediate exclusion in case of MVR noncompliance was assumed. The authors of a previous simulation study, on the other hand, employ two different scenarios [17]: immediate and successive exclusion from supply. The successive exclusion scenario assumes that first the group of hospitals with the lowest number of treatments is excluded and patients of those hospitals are reassigned, subsequently the group with the second lowest and so forth. This approach gives hospitals that initially do not comply with the MVR a chance to gain enough case volume from reassigned patients to eventually comply with the MVR. Consequently, those hospitals are not excluded from supply even though they would have been excluded in the immediate exclusion scenario. The MVR issued in Germany by the Joint Federal Committee does grant a 2-year transition period when MVR is introduced for a new procedure [38], which might justify a successive exclusion scenario. Evidence suggests, however, that during a transition period, only a relatively small share of hospitals "voluntarily" refrains from performing treatments [28,30]. Therefore, the immediate exclusion scenario was chosen.
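The difference between the two exclusion scenarios can be made concrete with a small sketch. It is not taken from the cited study [17]: the rule that patients of an excluded hospital move to the nearest remaining hospital, the `neighbours` structure, and the example hospitals are all assumptions made for illustration.

```python
def immediate_exclusion(volumes, threshold):
    """Exclude every hospital whose case volume is below the MVT at once."""
    return {h for h, v in volumes.items() if v < threshold}

def successive_exclusion(volumes, threshold, neighbours):
    """Exclude the lowest-volume group first, reassign its patients, then re-check.

    neighbours[h] lists the other hospitals ordered by proximity to h (assumed input).
    """
    volumes = dict(volumes)  # work on a copy
    excluded = set()
    while True:
        below = [h for h in volumes if h not in excluded and volumes[h] < threshold]
        if not below:
            return excluded
        lowest = min(volumes[h] for h in below)
        for h in below:
            if volumes[h] != lowest:
                continue  # may have gained cases in this round; re-checked next round
            excluded.add(h)
            # Assumed reassignment rule: nearest hospital that is still supplying.
            target = next((n for n in neighbours[h] if n not in excluded), None)
            if target is not None:
                volumes[target] += volumes[h]
            volumes[h] = 0

# Hypothetical example with an MVT of 40 treatments.
volumes = {"A": 10, "B": 35, "C": 60}
neighbours = {"A": ["B", "C"], "B": ["C", "A"], "C": ["B", "A"]}
at_once = immediate_exclusion(volumes, 40)
step_by_step = successive_exclusion(volumes, 40, neighbours)
```

In this example hospital B is excluded in the immediate scenario but survives in the successive scenario, because it first receives the ten cases of hospital A and thereby reaches the threshold.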

Conclusion
There are studies simulating the effect of strict MVR application on the centralization of hospital services, outcome quality and/or patient travel time. No study exists that simultaneously simulates the effect of strict MVR application on all dimensions relevant for patient care, however. On the one hand, patient travel time and hospital capacity need to be considered to ensure feasibility and (political) practicability of strict MVR application. The degree of potential system level outcome quality gain, on the other hand, represents the utility of strict MVR application. Effective MVR balance feasibility and potential outcome quality gain.
Our model objectifies the discussion concerning the definition of MVTs. Health policy makers can use our model to demonstrate first that strict application for a certain MVR is feasible and second how strongly it can increase the quality of care. Health insurance funds as well as hospital networks can use our model to evaluate how MVR is likely to affect their insurees or hospital sites. Lastly, with our model results, patients are able to comprehend tradeoffs associated with MVR.
For Germany, we demonstrated that effective MVR can be designed and should thus be introduced (CHE, CRR, RPE) or adapted (TKA) and strictly applied for all investigated procedures. The same can be tested for any procedure and country given the availability of hospital and patient level data. We deliver our model and results at an opportune moment to enable health policy makers to leverage MVR's intended benefit: concentrating care at high-quality centers to improve the quality of care.

Characteristics of MVR
In essence, an MVR for a given procedure can be described by the following five characteristics [26,27,38].
1. Definition of procedure: Procedure codes are most commonly used to determine the scope of a procedure. In Germany, codes from the Operation and Procedure Catalogue (Operationen- und Prozedurenschlüssel, OPS) are used, for instance. Some organizations additionally use diagnosis codes, e.g., from the International Classification of Diseases (ICD), for their definition. Even if the names of procedures are similar, the codes used for the procedure definition might differ in number and level of detail, changing the effect of the MVR by enlarging or reducing the affected patient group. In our model, we define procedures solely by procedure codes.
2. Definition of treatment types within a procedure: Procedures always consist of at least one treatment type and in many cases of more than one. Each treatment type is in turn defined by a subset of the (procedure) codes that were used for the procedure definition.
In practice, the specification of the above characteristics depends on the issuing organizations' differing regulation goals and methods. The Joint Federal Committee (Gemeinsamer Bundesausschuss) in Germany, an example of a legislative organization, uses MVR to prevent hospitals from performing opportunistic, occasional (surgical) treatment by withholding reimbursement for these treatments in cooperation with health insurance funds (Table 2).

Data matching and cleaning
As can be seen in Fig. 1 in "Methods", we used three data sources to obtain all necessary data for modeling. To create valid data input, these three data sources were matched. To this end, the structured quality report hospital level data (A) served as the master data source, i.e., all hospital (B) and patient level data (C) were matched with A. In each data source, hospitals could be identified by a unique hospital identifier that describes one hospital or one hospital group. In A, single hospitals, i.e., hospital sites, could also be identified because site numbers were given. As MVR in Germany is applied to hospital sites and not to hospitals or hospital groups, it is vital to match data at hospital site level. With the term "hospital", we therefore always mean a single hospital site. Using the hospital site addresses and hospital names as additional matching variables, B could be matched with A at hospital site level. In a second step, as B and C were supplied by the same data source and shared the same hospital identifiers, patients in C could be matched to hospitals in B and thus to A.
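The two matching steps can be illustrated with a minimal Python sketch. It is not the authors' code: the field names (`hospital_id`, `site_no`, `name`, `address`, `source_id`), the exact-match rule on normalized strings, and the sample records are all assumptions made for illustration, and `source_id` stands in for whatever identifier B and C share.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial spelling variants compare equal."""
    return " ".join(text.lower().split())


def match_b_to_a(sites_a, hospitals_b):
    """Step 1: map each B record to an A site via identifier, name and address."""
    index_a = {
        (s["hospital_id"], normalize(s["name"]), normalize(s["address"])):
            (s["hospital_id"], s["site_no"])
        for s in sites_a
    }
    b_to_a = {}
    for b in hospitals_b:
        key = (b["hospital_id"], normalize(b["name"]), normalize(b["address"]))
        if key in index_a:
            b_to_a[b["source_id"]] = index_a[key]
    return b_to_a


def match_c_to_a(patients_c, b_to_a):
    """Step 2: attach patients to A sites through the identifier shared by B and C."""
    matched, unmatched = [], []
    for p in patients_c:
        site = b_to_a.get(p["source_id"])
        if site is not None:
            matched.append({**p, "site": site})
        else:
            unmatched.append(p)
    return matched, unmatched


# Made-up records: one hospital group (H1) with two sites in A.
sites_a = [
    {"hospital_id": "H1", "site_no": 1, "name": "Klinikum Nord", "address": "Hauptstr. 1"},
    {"hospital_id": "H1", "site_no": 2, "name": "Klinikum Nord", "address": "Parkweg 7"},
]
hospitals_b = [
    {"source_id": "B-17", "hospital_id": "H1", "name": "klinikum  nord", "address": "Parkweg 7"},
]
patients_c = [
    {"patient": "p1", "source_id": "B-17", "zip": "12345"},
    {"patient": "p2", "source_id": "B-99", "zip": "54321"},  # no B record matched to A
]

b_to_a = match_b_to_a(sites_a, hospitals_b)
matched, unmatched = match_c_to_a(patients_c, b_to_a)
```

In this sketch, patients left in `unmatched` correspond to the patients in C that cannot be assigned to a hospital in A. Real address data would likely need a more tolerant comparison than exact equality of normalized strings.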
As the model was designed to simulate the effect of strict MVR application on the entire German hospital sector, no hospitals in A could be excluded from the sample. Thus, missing data points had to be estimated if for a hospital in A no information could be found in either B or C (see Table 5 below and Fig. 1 in the main text).
• Hospital in A cannot be found in B
To make our model operational, each hospital needs a QI for each procedure. In some cases (17% of hospitals and 10% of cases overall, ranging from 16 to 18% of hospitals and from 8 to 15% of cases per procedure), a hospital in A could not be found in B. In this case, the average value of the QI of the hospital's respective case volume quintile was assigned to that hospital. The quintiles were formed as follows. In a first step, the total case volume of a procedure was divided into five equal parts (i.e., quintiles). In a second step, hospitals were ranked according to their case volume, smallest to largest. In a third step, the first case volume quintile was "filled" with the smallest case volume hospitals until the needed total case volume of the first quintile was reached. This was repeated for each quintile and for each procedure. For the distribution of hospitals per quintile and procedure, see Table 5 below.
• Hospital in A cannot be found in C
To calculate patient travel time, each patient's location, i.e., the geographical center of the patient's zip code, is needed. If no patient reference could be found in C for the case volume of a hospital in A, all patients of that hospital in A were assumed to live in the same zip code as the hospital's location. This occurred for 9% of the overall case volume, ranging from 5 to 13% per procedure.
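The quintile-based QI imputation for hospitals missing in B can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and sample values are hypothetical, the hospital that crosses a quintile boundary is assumed to stay in the quintile being filled, and the quintile average is assumed to be an unweighted mean over hospitals with a known QI.

```python
from statistics import mean


def assign_volume_quintiles(volumes):
    """Map hospital -> quintile (1..5), each quintile holding about 20% of total case volume."""
    total = sum(volumes.values())
    quintile_of, cumulative, quintile = {}, 0, 1
    # Rank hospitals from smallest to largest case volume and fill quintiles in order.
    for hospital, volume in sorted(volumes.items(), key=lambda kv: kv[1]):
        quintile_of[hospital] = quintile
        cumulative += volume
        # Move on once the cumulative volume reaches the current quintile's share.
        while quintile < 5 and cumulative * 5 >= total * quintile:
            quintile += 1
    return quintile_of


def impute_missing_qi(volumes, qi):
    """Give each hospital without a QI the mean QI of its case volume quintile."""
    quintile_of = assign_volume_quintiles(volumes)
    known = {}
    for hospital, value in qi.items():
        if hospital in quintile_of:  # QI values without case volume data are ignored
            known.setdefault(quintile_of[hospital], []).append(value)
    completed = {h: v for h, v in qi.items() if h in quintile_of}
    for hospital in volumes:
        if hospital not in completed:
            # Raises KeyError if no hospital in this quintile has a known QI.
            completed[hospital] = mean(known[quintile_of[hospital]])
    return completed


# Made-up case volumes for one procedure; h2 has no QI (not found in B).
volumes = {"h1": 5, "h2": 5, "h3": 10, "h4": 20, "h5": 20, "h6": 20, "h7": 20}
qi = {"h1": 0.10, "h3": 0.06, "h4": 0.05, "h5": 0.04, "h6": 0.04, "h7": 0.03}

quintiles = assign_volume_quintiles(volumes)
completed = impute_missing_qi(volumes, qi)
```

With these sample values, h1, h2 and h3 together make up the first 20% of the total case volume, so h2 receives the mean of the QI values of h1 and h3. In the setting described above, the procedure would be run once per procedure.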
Lastly, in two cases, data were excluded from modeling (see Table 4).
• Hospital in B could not be found in A
QI values of these hospitals were disregarded as they could not be matched with case volume data in A. As case volume and hospital data in A were unaffected by this, model validity was not harmed by the exclusion of these data. Moreover, this occurred for only 1% of hospitals in B.
• Patients in C that could not be assigned to a hospital in A
These patients were excluded from modeling without compromising model validity, as this did not affect the overall case volume in A. This, too, occurred for only 1% of patients in C.
The model algorithm was coded and run in a Python environment which was operated on a Linux terminal. Zip codes were geocoded using Google Maps APIs and average travel times were calculated using the OpenMaps tool backend with a local Open Source Routing Machine server. For the optimization problem, the Gurobi solver was used (Table 6).