There is a growing need for quality indicators which reliably and objectively measure the outcome quality of surgical performance and clearly identify high- and low-performing hospitals. Composite outcome measures combine different quality indicators in one score to increase the reliability of hospital performance assessment. There is already some valuable evidence from the literature that composite measures may be a more effective approach for capturing a hospital’s surgical quality [1,2,3,4,5,6].

Currently, most quality improvement projects focus on single quality indicators, which may not sufficiently reflect the overall quality of surgical performance in the context of colorectal cancer care [2].

To test the recently introduced composite outcome measure MTL with regard to its suitability for profiling hospitals on surgical performance, we analyzed the complete dataset from the colorectal cancer registers of the Study, Documentation, and Quality Center (StuDoQ) of the German Society of General and Visceral Surgery (DGAV). The main objective of the present analysis was to determine which time interval covered by MTL has the highest accuracy regarding severe complications, defined as Clavien-Dindo grade ≥ 3. The time interval of 30 days was originally chosen because a follow-up of 30 days is traditionally used in the literature when short-term outcomes after surgery are assessed and reported. Therefore, the 30-day interval was established as a common threshold value and trigger point for the three components of MTL.


All patients undergoing colorectal resection for adenocarcinoma of the colon or rectum between January 2010 and February 2017, registered in the StuDoQ|ColonCancer Register or the StuDoQ|RectalCancer Register of the DGAV, were included in a database retrieval. After exclusion of resections without restoration of bowel continuity (no anastomosis), palliative surgery, surgery for recurrent colorectal cancer, and cases with incomplete datasets, 14,978 patients with complete datasets for the follow-up duration of 30 days were analyzed further. In accordance with data protection laws, the excluded patients were not analyzed and the retrieved datasets were reduced to the necessary outcome measurements. These included patient and tumor characteristics (e.g., height, weight, stage), procedure characteristics (e.g., surgical technique, duration of the surgical procedure), and postoperative outcomes (e.g., surgical complications graded according to the Clavien-Dindo classification, mortality, surgical site infection). Data quality was monitored on-site and verified by the administration, particularly with regard to M, T, and L, for all hospitals that submitted data to the StuDoQ registers in order to apply for certification as a colorectal center (45% of the included hospitals). These independent control mechanisms minimized the risk of detection and attrition bias. The remaining hospitals entered data into the register voluntarily as part of their institutional quality control. MTL30, a recently described composite outcome measure [7], is defined as (a) mortality and/or (b) transfer to another hospital (other than a rehabilitation clinic) within 30 days after the index operation, and/or (c) postoperative length of hospital stay ≥ 30 days. Currently, MTL is only used as a quality outcome measure within the StuDoQ registers and within the framework of the voluntary colorectal cancer center certification process.
MTL rates were calculated using the register data and compared to well-established single-outcome measures, including mortality, morbidity, and length of stay. For each postoperative day from 15 to 30, contingency tables were used to calculate the accuracy of the corresponding MTL interval in detecting complications of Clavien-Dindo grade ≥ 3. The day with the highest accuracy was then used for an alternative MTL classification.
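
The MTL definition and the day-by-day accuracy scan described above can be illustrated with a minimal Python sketch. It is not the authors' SAS code. The record layout (`Case`), the function names, and the choice to apply the same threshold x to all three components are assumptions made for illustration; the register's actual variable names and coding rules are not given in the text.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Case:
    """Hypothetical patient record; field names are illustrative only."""
    death_day: Optional[int]     # postoperative day of death, None if alive
    transfer_day: Optional[int]  # day of transfer to another hospital
                                 # (not a rehabilitation clinic), None if none
    los_days: int                # postoperative length of stay in days
    clavien_dindo: int           # highest complication grade (0 = none)


def mtl_positive(case: Case, x: int = 30) -> bool:
    """MTLx: death or transfer within x days, or length of stay >= x days.

    Assumption: x is used as the common threshold for all three components.
    """
    died = case.death_day is not None and case.death_day <= x
    transferred = case.transfer_day is not None and case.transfer_day <= x
    long_stay = case.los_days >= x
    return died or transferred or long_stay


def diagnostic_metrics(cases: List[Case], x: int) -> Dict[str, float]:
    """2x2 contingency table of MTLx versus Clavien-Dindo >= 3."""
    tp = fp = fn = tn = 0
    for c in cases:
        test = mtl_positive(c, x)
        reference = c.clavien_dindo >= 3
        if test and reference:
            tp += 1
        elif test:
            fp += 1
        elif reference:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "accuracy": (tp + tn) / len(cases) if cases else nan,
    }


def best_interval(cases: List[Case], days: Iterable[int] = range(15, 31)) -> int:
    """Return the day x with the highest accuracy (earliest day on ties)."""
    return max(days, key=lambda x: diagnostic_metrics(cases, x)["accuracy"])
```

Calling `best_interval(cases)` on a list of such records scans postoperative days 15 to 30 and returns the threshold with the highest accuracy. Ties are resolved in favor of the earliest day, which is a design choice of this sketch rather than something stated in the study.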

The study was approved by the institutional ethics committee.


Patient characteristics were summarized in tables. Length of stay was analyzed using product-limit survival estimates. Associations between the outcome measurements and the potential effects (surgical complications, stroke, myocardial infarction) were analyzed using full model fitted multiple regression analysis. Because of its composite character, Clavien-Dindo was not included in the multiple regression analysis. MTL is intended to measure the surgical quality of a hospital. Therefore, the funnel plots were drawn at the hospital level, whereas the remaining figures and the contingency tables (Tables 1 and 2) show results at the patient level. Funnel plots and control limits were constructed according to Spiegelhalter et al. [8]. Control limits indicate the range in which the values of a quality indicator would statistically be expected to fall. All statistical analyses were performed with SAS version 9.3 using SAS Enterprise Guide 4.2 (SAS, Cary, NC, USA).
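
To make the idea of funnel-plot control limits concrete, the following Python sketch computes approximate 95% and 99.8% limits for a hospital's MTL rate as a function of its caseload. It uses the normal approximation to the binomial distribution, which is one common way to draw such limits. Whether the study's SAS analysis used this approximation or exact binomial limits is not stated, and the function names, the use of a pooled rate as the target, and the labels are assumptions of this sketch.

```python
import math
from typing import Tuple

Z_95 = 1.96   # two-sided 95% control limits
Z_998 = 3.09  # two-sided 99.8% control limits


def control_limits(n: int, target: float, z: float) -> Tuple[float, float]:
    """Approximate lower/upper limits for a proportion with caseload n.

    `target` is the expected event rate, assumed here to be the overall
    (pooled) MTL rate across all hospitals.
    """
    se = math.sqrt(target * (1.0 - target) / n)
    return max(0.0, target - z * se), min(1.0, target + z * se)


def classify_hospital(events: int, n: int, target: float) -> str:
    """Place one hospital relative to the 95% and 99.8% control limits."""
    rate = events / n
    lo95, hi95 = control_limits(n, target, Z_95)
    lo998, hi998 = control_limits(n, target, Z_998)
    if rate > hi998:
        return "above 99.8% limit"
    if rate > hi95:
        return "above 95% limit"
    if rate < lo998:
        return "below 99.8% limit"
    if rate < lo95:
        return "below 95% limit"
    return "within limits"
```

Because the standard error shrinks with the square root of the caseload, the limits narrow as hospital volume increases, which produces the characteristic funnel shape.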

Table 1 Contingency table for MTL30 versus Clavien-Dindo ≥ 3
Table 2 Contingency table for MTL22 versus Clavien-Dindo ≥ 3


The time interval of 22 days demonstrated the highest accuracy (87.4%) regarding surgical major morbidity (Clavien-Dindo ≥ 3) compared to all other tested time intervals, e.g., 20, 23, or 30 days, as shown in Fig. 1. A subgroup analysis revealed that the accuracy of the 22-day interval did not differ meaningfully between colonic and rectal resections (colon: accuracy 87.8%; rectum: accuracy 86.7%). The differences between MTL30 and MTL22 in accuracy, specificity, and sensitivity are shown in Tables 1 and 2. As demonstrated in these contingency tables, MTL22 has a sensitivity of 58.8% (specificity 94.3%) regarding the depiction of severe surgical complications (Clavien-Dindo ≥ 3), whereas the sensitivity of MTL30 is only 40.3% (specificity 97.2%). Hence, MTL22 performs better than MTL30 in depicting surgical major morbidity. Figure 2 illustrates that the odds ratios for almost all complications, especially surgical ones, are higher for MTL22 than for MTL30. Moreover, Fig. 2 and the corresponding data in the appendix show that MTL22 is not only better than MTL30 but, above all, better than each of its single components. MTL22 thus captures all investigated complications (Appendix Table 4 (a–f)).

Fig. 1 Accuracy for MTLx regarding surgical major morbidity (Clavien-Dindo ≥ 3) depending on the time interval for length of hospital stay (x). The y-axis shows the accuracy, the x-axis the postoperative day.

Fig. 2 a–f Full model fitted multiple regression analysis of the outcome measurements. If the 95% confidence limits cross 1 in either direction, the result is not significant. Otherwise, findings are significant. SSI: surgical site infection; p/o: postoperative. p-values can be found in the appendix.

Data from 14,978 patients in 144 hospitals registered in the two colorectal cancer StuDoQ registers were analyzed. Baseline patient and procedure characteristics are presented in Table 3. Length of stay was significantly prolonged if postoperative complications occurred (p < 0.0001): 35% of patients who experienced at least one postoperative complication were still in hospital on postoperative day (POD) 22, compared to only 4% of patients with an uneventful postoperative course (Fig. 3). Thirty days after the index operation, the percentage of inpatients among those with a complicated course was still as high as 24%, compared to 1% in the group without complications. Among the investigated complications, postoperative myocardial infarction (p = 0.0001), postoperative ventilation (p = 0.0001), postoperative renal failure (p = 0.0001), postoperative pulmonary embolism (p = 0.0001), postoperative pneumonia (p = 0.0001), ileus (p = 0.03), and anastomotic leakage (p = 0.03) were significantly associated with 30-day mortality (Fig. 2). Likewise, transfer to another hospital (other than a rehabilitation clinic) mainly resulted from cardiovascular or pulmonary complications: postoperative myocardial infarction (p = 0.0001), postoperative pneumonia (p = 0.0001), postoperative stroke (p = 0.0001), and overall medical complications (p = 0.0003) were significantly associated with transferring a patient to another hospital. In contrast, there was no significant association between surgical morbidity caused by anastomotic dehiscence, ileus, bleeding, or surgical site infection and transfer to another hospital. Of the 288 patients who were transferred to another hospital, 107 (37%) had no complication documented before transfer.
Prolonged length of stay was triggered by both surgical and medical morbidity. The exceptions were stroke, pulmonary embolism, and renal failure for the 22-day threshold (still an inpatient at POD 22, i.e., MTL22 becomes positive) and stroke, renal failure, and myocardial infarction for the 30-day threshold (still an inpatient at POD 30, i.e., MTL30 becomes positive). Both MTL30 and MTL22 were significantly associated with all investigated complications. These findings are shown in Fig. 2.

Table 3 Characteristics of 14,978 patients undergoing surgery for colorectal cancer registered in the StuDoQ|ColonCancer and StuDoQ|RectalCancer Registers between 2010 and 2017
Fig. 3 Influence of postoperative complications on length of stay: rates of hospitalized patients (p < 0.0001).

The two funnel plots (Fig. 4a, b) illustrate the hospital variation by hospital volume in percentage of patients with MTL. For MTL22, 10 hospitals (7%) were located above the upper 99.8% control limits and 17 hospitals (12%) lay above the upper 95% control limits (Fig. 4b) compared to 10 hospitals (7%) exceeding the upper 99.8% control limits and 14 hospitals (10%) the upper 95% control limits for MTL30 (Fig. 4a). Seven hospitals (5%) were located below the lower 99.8% control limits and 12 hospitals (8%) below the lower 95% control limits for MTL22 (Fig. 4b), whereas, regarding MTL30, 2 hospitals (1%) lay below the lower 99.8% control limits (Fig. 4a) and 10 hospitals (7%) below the lower 95% control limits.

Fig. 4 a, b Funnel plots: hospital variation in the percentage of patients with MTL for the MTL30 and MTL22 scenarios. Dotted lines represent the upper/lower 99.8% control limits, dashed lines the upper/lower 95% control limits. Single crosses represent individual hospitals.


The present study investigated MTL, a composite outcome measure for profiling hospitals on surgical performance, using data from 14,978 patients from the colorectal StuDoQ registers of the DGAV. Length of stay was significantly prolonged if postoperative complications occurred. Thirty-day mortality as well as the indication to transfer a patient to another hospital (other than a rehabilitation clinic) mainly resulted from cardiovascular and/or pulmonary complications. According to our analysis, MTL occurs significantly more often than any of its components or any of the other established single-outcome parameters. The time interval of 22 days demonstrated the highest accuracy (87%) regarding surgical major morbidity (Clavien-Dindo ≥ 3) compared to all other tested time intervals, e.g., 20, 23, or 30 days. Moreover, MTL22 has a sensitivity of 59% (specificity 94%) regarding the depiction of severe surgical complications (Clavien-Dindo ≥ 3), whereas the sensitivity of MTL30 is only 40% (specificity 97%), which is insufficient. Therefore, MTL22 performs better than MTL30 or any other time interval covered by MTL in depicting surgical major morbidity.

However, the results of this study should be viewed in the context of certain limitations. The present analysis was not adjusted for differences in case-mix because, as a first step, we aimed to examine the marker MTL in isolation. As Wiegering et al. recently showed using data from the StuDoQ|ColonCancer Register, MTL30 correlated with the UICC (Union Internationale Contre le Cancer) stage. Accordingly, MTL should probably be implemented as a risk-adjusted QI in order to minimize confounding of the surgical quality measurement by patient- or tumor-related factors [7].

Since the investigated sample consisted exclusively of hospitals submitting their data to the StuDoQ registers, most of them with the intention either to achieve certification as a colorectal cancer center or at least to benchmark their own performance, this sample is potentially not representative of the average German hospital, which may limit the external validity of the results presented here.

Even though the evidence regarding MTL as a composite measure is scarce and the published evidence consists only of one recent analysis of four organ-specific StuDoQ registers and the data presented here, there is broad evidence of the superiority of composite outcome measures over single parameters with regard to discriminatory power and the ability to reliably identify outliers in the context of quality measurement. Gooiker et al. investigated the internal consistency and construct validity of nine single quality indicators (QIs) for colorectal cancer surgery using data from 85 Dutch hospitals participating in the Dutch Surgical Colorectal Audit in 2010 and found that single QIs indeed provide complementary information. However, due to the lack of inter-QI correlation (indicating insufficient internal coherence), individual QIs are not suitable as a surrogate for the quality of surgical colorectal cancer care. Therefore, the authors advocate more complex QIs or composite outcome measures in order to meet the requirements regarding internal consistency and construct validity [2].

Dimick et al. developed composite outcome measures for four surgical procedures (colectomy, ventral hernia repair, abdominal aortic aneurysm repair, and lower extremity bypass surgery) by combining the established QIs morbidity, reoperation, and length of stay. Using data from patients who underwent these procedures and were registered in the American College of Surgeons-National Surgical Quality Improvement Program (ACS-NSQIP) between 2008 and 2009, they compared the standard ACS-NSQIP approach for assessing hospital rates of risk-adjusted morbidity with their new composite approach. For all four procedures, the composite measure explained a higher proportion of systematic hospital-level variation and performed better at predicting future hospital performance [1].

In summary, composite outcome measures improve the reliability of benchmarking and decrease the misclassification of hospitals by their higher discriminatory power. Hence, composite measures are also advancing in colorectal surgery [3, 4].

A specific advantage of MTL is the combination of the two single components “transfer to another hospital” (T) and “length of stay” (L). MTL is deemed fulfilled if either the T- or the L-criterion occurs. Large hospitals usually treat postoperative high-grade morbidity in-house, whereas smaller hospitals often transfer patients with postoperative major complications to another hospital with more expertise and infrastructure to manage complicated cases. MTL is fairer in this respect because in both settings, whether the patient is transferred or has a prolonged stay in the primary hospital, MTL becomes “positive”, which indicates a complicated course. Transfers to other hospitals due to overcrowding or a lack of resources in the primarily treating hospital do not occur in the German health care system.

Moreover, MTL reflects the complete spectrum of postoperative complications, whereas its single components only depict partial aspects. MTL as well as its single components were mainly triggered by medical complications, such as myocardial infarction and pulmonary embolism. Therefore, it seems rational to continue to assess the “traditional” surgical quality measures, such as anastomotic leakage, in addition to MTL. However, medical complications can quite possibly provide an indication of the surgical quality and especially reflect the indication quality.

Low hospital caseloads and low event rates usually limit the precision of single outcome measures, such as mortality, and result in little hospital variation. MTL, as a composite outcome measure of three more or less “sentinel-event” indicators, is positive significantly more often than its single components. As a result, compared to individual outcome parameters, MTL has better discriminatory power and is suitable for reliably identifying outliers and reflecting surgical outcome quality even if hospital caseloads and event rates of the classical single outcome measures are low. As visualized in the funnel plots (Fig. 4a, b), MTL detects hospitals that may have a quality problem even when caseloads are well below 100.

Another decisive advantage of MTL, e.g., compared to postoperative complications classified according to the Clavien-Dindo classification, is that MTL does not require separate documentation but can be derived from routine administrative data. Since the validity of data recorded in clinical databases is often low, MTL should in principle be more valid than outcome measures that depend on additional clinical data collection. According to a study by Dindo et al., clinical databases are managed by residents alone in the majority of European centers, which results in a high risk of underreporting of surgical complications, as residents fail to document up to 80% of complications, including ones as severe as death [9].

The data presented here support the use of MTL22 as an outcome measure for the quality of colorectal cancer surgery within the German health care system. However, the time frame covered by MTL should be adjusted to fit the specific surgical and sociocultural setting. This means that the number after “MTL” may vary according to procedure and health care system. MTL may also be used in surgical areas other than colorectal cancer surgery, such as hepatic, pancreatic, or bariatric surgery. The present study suggests a method for calculating the optimal value for a given procedure and health care environment in order to achieve the best representation of the various complications within this single measure.

Although the generalisability of the results presented here may be limited by the fact that this sample of hospitals submitting their data to the StuDoQ registers is potentially not representative of the average German hospital, the advantages of MTL compared to individual surgical outcome measures should hold true for other populations and procedures: its derivation from routine administrative data, its higher discriminatory power, and its suitability to reliably identify outliers and reflect surgical outcome quality. The next step will be the evaluation of the validity of MTL to assess whether MTL measures what it is intended to measure: the quality of surgical care.