Volume, outcomes, and quality standards in thyroid surgery: an evidence-based analysis—European Society of Endocrine Surgeons (ESES) positional statement

Introduction Continuous efforts in the surgical specialties aim to improve outcomes. Accordingly, the correlation of volume and outcome, developing subspecialization, and the identification of reliable parameters to measure quality in surgery are gaining increasing attention in the surgical community, in public health care systems, and among health care providers. ESES identified the need to investigate these correlations in endocrine surgery and chose thyroid surgery for this analysis of the prevalent literature with regard to outcome and volume.
Materials and methods A literature search, detailed below, on the correlation between volume and outcome in thyroid surgery was performed and assessed from an evidence-based perspective. Following presentation and live data discussion, a revised final positional statement was presented to and agreed upon by the ESES assembly.
Results There is a lack of prospective randomized controlled studies for all items representing quality parameters of thyroid surgery using uniform definitions. Therefore, evidence levels are low and recommendation grades are based mainly on expert and peer evaluation of the prevalent data.
Conclusion In thyroid surgery a volume and outcome relationship exists with respect to the prevalence of complications. Besides volume, cumulative experience is expected to improve outcomes. In accordance with global data, a case load of < 25 thyroidectomies per surgeon per year appears to identify a low-volume surgeon, while > 50 thyroidectomies per surgeon per year identify a high-volume surgeon. A center with a case load of > 100 thyroidectomies per year is considered high-volume. Thyroid cancer and autoimmune thyroid disease predict an increased risk of surgical morbidity and should be operated on by high-volume surgeons. Oncological results of thyroid cancer surgery are significantly better when performed by high-volume surgeons.


Introduction
Interpreting volume and outcome correlation in thyroid surgery is complex. First of all, looking at volume, the concept of "practice makes perfect" (numbers) rivals the concept of selective referral to the (naturally) superior surgeon, which in turn generates higher volume. In addition, minimum numbers and centralization are influenced by diverse and conflicting players in the health system, namely insurers, government, and politics [1]. Data provided by these players must always be suspected of serving a certain purpose. Large data sets drawn from registries or health insurance companies are often not risk stratified when providing outcome data, and the selected quality indicators may not be applicable or ideal. In the aim to improve outcome in thyroid surgery, the intent to reach perfection appears as applicable as minimizing the risk of avoidable harm. In general, the available literature on volume and outcome in thyroid surgery is compromised by low evidence and is dominated by USA-based data sources that may only partly be transferable to European health care systems. The range of surgeon volumes reported in the current literature is considerable, and the studies use a wide range of definitions, which makes comparison and meta-analysis error-prone. Also, when looking at hospital volume in thyroid surgery, it must be considered that this may, but need not, correlate strongly with surgeon volume. Keeping these premises in mind, it must be understood that any definition of minimal numbers in thyroid surgery rests on low-level evidence, will not be accepted uncontested, and may not apply to all types of health care systems.
For the present analysis, a literature search of PubMed was performed (31 December 2018) to retrieve articles published between 1 January 1990 and 31 December 2018. The following terms were used in the search text fields: thyroid surgery AND volume and outcome relationship OR benign thyroid surgery AND recurrent laryngeal nerve injury OR thyroid cancer surgery AND recurrent laryngeal nerve injury AND hypocalcemia OR hypoparathyroidism AND postoperative hemorrhage OR thyroid cancer surgery AND infection AND volume and outcome relationship AND thyroid cancer surgery.
Published observational and interventional studies that reported on volume and outcome relationship in thyroid surgery in humans were included. Reviews, letters, commentaries and editorials, articles with fewer than 100 patients, articles whose full text was not available in the English language, animal studies, irretrievable articles, and articles published before 1990 were excluded. Outcome measures included recurrent laryngeal nerve injury (transient and permanent), hypoparathyroidism (transient and permanent), bleeding, infection, completion thyroidectomy, local disease control, and recurrence rate.
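The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch of that rule, not the procedure actually used for this analysis; the `Article` record, its field names, and the `kind` labels are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record for one retrieved article (field names are assumptions)."""
    kind: str                 # e.g. "observational", "interventional", "review", "letter"
    n_patients: int           # number of patients reported
    year: int                 # year of publication
    english_full_text: bool   # full text available in English
    human_study: bool         # False for animal studies
    retrievable: bool = True  # False for irretrievable articles


# Only observational and interventional studies were eligible.
INCLUDED_KINDS = {"observational", "interventional"}


def is_eligible(article: Article) -> bool:
    """Apply the stated criteria: study type, at least 100 patients,
    published 1990 through 2018, English full text, human, retrievable."""
    return (
        article.kind in INCLUDED_KINDS
        and article.n_patients >= 100
        and 1990 <= article.year <= 2018
        and article.english_full_text
        and article.human_study
        and article.retrievable
    )
```

Title and abstract screening, and the judgment of whether a study actually reports on a volume and outcome relationship, remain manual steps that such a filter does not replace.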
The titles and/or abstracts of all retrieved articles were screened, and the full text of relevant articles reviewed for possible inclusion. Data collected included study and participants' characteristics, type of surgery, and volume-outcome relationship. Missing data were excluded from the analysis. In the event of duplicate publication, only the publication with the most recent and complete data was included. Evidence levels (EL) and recommendation grades (RG) of the analyzed literature were assessed using an adapted Sackett classification. In this classification, evidence levels descend from the highest level, randomized controlled trials (ELI), to the lowest (ELV), and the corresponding recommendation grades descend from the highest (RGA) to the lowest (RGD), the latter applying to case reports or uncontrolled studies with little confidence in the estimated effect [2,3].
The results of the analysis were presented and discussed at the 8th Conference of the European Society of Endocrine Surgeons (ESES), "Volumes, Outcomes, and Quality Standards in Endocrine Surgery" (Granada, Spain, May 16-18, 2019). This paper incorporates the results of the analysis of the literature and the outcomes of the discussions.

Results
Results are overviewed in Table 4.

Recurrent laryngeal nerve injury in relation to surgeon volume
The risk of recurrent laryngeal nerve (RLN) injury in thyroid operations for various conditions depending on individual surgical volume (high-volume vs. low-volume surgeons) in different published series is shown in Table 1.
In a statewide analysis of thyroidectomy performed in Maryland between 1991 and 1996, Sosa et al. reported that the individual surgeon's experience rather than the hospital's experience significantly correlated with complication rates. In this study, which comprised 5860 patients, surgeons were categorized by volume of thyroidectomies over the 6-year study period: A (1 to 9 cases), B (10 to 29 cases), C (30 to 100 cases), and D (> 100 cases). Multivariate regression was used to assess the relation between surgeon caseload and in-hospital complications, length of stay, and total hospital charges, adjusting for case mix and hospital volume. After this adjustment, the highest-volume surgeons had the shortest length of stay (1.4 days vs. 1.7 days for groups B and C and 1.9 days for group A) and the lowest complication rate (5.1% vs. 6.1% for groups B and C and 8.6% for group A). Interestingly, the prevalence of RLN injury was significantly reduced in operations performed by high-volume surgeons (groups C+D), who performed more than 30 operations per 6-year study period, when compared with the lowest-volume surgeons (group A), who performed fewer than 10 operations per 6-year study period (OR 0.42; 95% CI 0.22-0.80; p = 0.009) [4] (ELIII, RGD).
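The surgeon categories used by Sosa et al. are defined over the whole 6-year study period rather than per year, which matters when comparing them with the annual thresholds used in later studies. A minimal sketch of that grouping follows; the function names are hypothetical, and only the category boundaries come from the text above.

```python
def sosa_volume_group(cases_in_study_period: int) -> str:
    """Map a surgeon's thyroidectomy count over the 6-year study period
    to the groups described above: A (1-9), B (10-29), C (30-100), D (> 100)."""
    if cases_in_study_period < 1:
        raise ValueError("a surgeon needs at least one case to be categorized")
    if cases_in_study_period <= 9:
        return "A"
    if cases_in_study_period <= 29:
        return "B"
    if cases_in_study_period <= 100:
        return "C"
    return "D"


def mean_annual_cases(cases_in_study_period: int, years: int = 6) -> float:
    """Average cases per year implied by a count over the study period."""
    return cases_in_study_period / years
```

For orientation, the lower bound of group D (more than 100 cases in 6 years) corresponds to an average of only about 17 cases per year, which is below the annual thresholds for a high-volume surgeon discussed elsewhere in this paper.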
Gourin et al. analyzed 21,270 thyroid surgical procedures performed by 1034 surgeons at 51 hospitals in Maryland between 1990 and 2009. High-volume surgeons (25 or more thyroid operations per year) were more likely to perform total thyroidectomy (OR 2.50; p < 0.001) and neck dissection (OR 1.86; p < 0.001), had a shorter length of hospitalization (OR 0.44; p < 0.001), and had a lower incidence of recurrent laryngeal nerve injury (OR 0.46; p = 0.002), hypocalcemia (OR 0.62; p < 0.001), and thyroid cancer surgery (OR 0.89; p = 0.01). In addition, compared with intermediate- and low-volume surgeons, high-volume surgeons were significantly more likely to perform total thyroidectomy procedures and neck dissection. High-volume surgeons were associated with increased case complexity scores, reflecting the presence of advanced comorbid disease, intensive care unit utilization, and decreased length of hospitalization [6] (ELIII, RGC).
Morbidity following thyroid surgery (permanent RLN palsy) was also less frequent among patients operated on by endocrine-dedicated surgeons vs. general surgeons in a prospective cohort study published by Gonzalez-Sanchez et al.
The exact threshold number of cases defining a "high-volume" thyroid surgeon remains inconsistent in published series. Adam et al. used a multivariate logistic regression model with restricted cubic splines to estimate the association between annual surgeon volume and the incidence of complications. This statistical model fits a flexible function with robust behavior at the tails of the predictor distribution. The model was used to identify the range of annual surgeon volumes corresponding to a change in the relative log odds of experiencing any postoperative complication. This is the first study in which a statistical method has been used to establish an activity volume threshold. Adam et al. addressed this issue in a study comprising 16,954 patients undergoing total thyroidectomy, of whom 47% had thyroid cancer and 53% had benign thyroid disease [11] (ELIII, RGC).
Nouraei et al. recently analyzed national trends, outcomes, and volume-outcome relationships in thyroid surgery in a large cohort of 72,594 patients who underwent elective thyroidectomy in the UK between 2004 and 2012. Most patients in this study underwent hemithyroidectomy (51%) or total thyroidectomy (32%). Patients underwent surgery for benign (52.5%), benign inflammatory (21%), and malignant (17%) thyroid diseases. Increased surgeon volume significantly reduced lengths of stay: the proportion of length of stay outliers fell from 11.8% for patients of occasional thyroidectomists (< 5 per year) to 2.8% for patients of high-volume surgeons (> 50 thyroidectomies a year). High-volume surgeons had a reduced incidence of vocal cord palsy (OR 0.68; 95% CI 0.52-0.88; p = 0.01), and volumes > 30 were consistently protective.
Hence, based on the outcomes of this study, a high-volume surgeon was defined as a surgeon who performs 50 or more thyroidectomies per year and achieves lower complication rates and shorter lengths of stay [13] (ELIII, RGD).
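To illustrate why restricted cubic splines, as used by Adam et al., behave robustly at the tails, the sketch below builds the spline terms in one common truncated-power formulation: cubic between the knots and constrained to be linear beyond the outermost knots. It is a minimal illustration only; the knot positions in the usage note are hypothetical, and the knots and software actually used by Adam et al. are not given here.

```python
from typing import List, Sequence


def rcs_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Restricted cubic spline terms for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. Each s_j is a
    combination of truncated cubics whose cubic and quadratic parts cancel
    beyond the last knot, so the fitted curve is linear in both tails.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least three knots are required")

    def pos_cubed(u: float) -> float:
        # Truncated cubic: (u)+^3
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps the spline terms on a scale similar to x
    terms = [float(x)]
    for j in range(k - 2):
        s = (
            pos_cubed(x - t[j])
            - pos_cubed(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cubed(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        terms.append(s / scale)
    return terms
```

In use, each surgeon's annual volume would be expanded with, for example, `rcs_basis(volume, [5, 25, 50, 100])` (assumed knots), and the resulting columns entered into a logistic regression in place of the raw volume. A volume threshold can then be read from where the fitted log odds of complication stop changing.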
Similar conclusions were recently drawn by Melfa et al., who published a systematic review of surgeon volume and hospital volume in endocrine neck surgery with special emphasis on how many procedures are needed to reach a satisfactory level of safety and acceptable costs. Some differences in outcomes between these categories were underlined: the advantages of the high-volume surgeon were evident especially in terms of complications, whereas the advantages of a high-volume center were mainly economic, such as shorter hospital stay and lower general costs of the procedures. A cut-off of 35-40 thyroidectomies per year for a single surgeon appears reasonable for identifying an adequate high-volume practice [16] (ELIII, RGD).

Recurrent laryngeal nerve injury in relation to hospital volume
Minimum hospital volume needed for reaching a satisfying level of safety is less clear and ranges from 20 to 200 thyroid operations per year in published series. However, a cut-off of 90-100 thyroidectomies for a single center appears reasonable for identifying an adequate high-volume practice [16] (ELIII, RGD).
Mitchell et al. retrospectively analyzed 395 reoperative thyroid and parathyroid surgeries at a tertiary care hospital from 1999 to 2007. In this study, public discharge data were used to classify hospitals as low-volume hospitals (below 20 cases per year) or high-volume hospitals (20 or more cases per year). Operations performed at low-volume hospitals resulted in RLN injury in 9% of patients which was significantly higher than seen after operations performed at high-volume hospitals, where 3% of patients suffered RLN injuries (p < 0.05) [15] (ELIV, RGD).
Lifante et al. retrospectively analyzed data of 20,140 patients who underwent thyroid surgery in hospitals in the Rhône-Alpes area of France between 1999 and 2004, including 4006 procedures for thyroid cancer. Compared with hospitals performing a high volume of procedures for all thyroid diseases (100 or more operations per year), the risk of a unilateral procedure for thyroid cancer increased by 2.46 (95% CI 1.63-3.71) in low-volume hospitals (less than 10 operations per year) and by 1.56 (95% CI 1.27-1.92) in medium-volume centers (10-99 operations per year) [17] (ELIV, RGD).
Thomusch et al. analyzed risk factors for postoperative complications of benign goiter surgery in a prospective multicenter study comprising 7266 patients operated on in 45 East German hospitals from January 1 through December 31, 1998. In this study, outcomes were stratified by institutional annual volume: high-volume providers (> 150 operations per year) performed 69% (5042/7266), intermediate-volume providers (50-150 operations per year) 27%, and low-volume providers (< 50 operations per year) 4% (258/7266) of operations. In logistic regression analysis, hospitals with an operative volume of fewer than 50 operations per year had a 1.3-fold increased risk for transient RLN palsy (p < 0.05), even though low-volume providers tended to perform more operations for uninodular goiter and high-volume providers treated more patients with Graves' disease and recurrent goiter [5] (ELIII, RGC).
Liang et al. retrospectively analyzed a cohort of 125,037 thyroidectomy patients treated at Taiwan hospitals from 1996 to 2010, with a special focus on relationships between hospital/surgeon volume and patient outcomes using propensity score matching. In this study, both high-volume hospitals and high-volume surgeons were associated with significantly shorter length of stay and lower costs compared with their low-volume counterparts (p < 0.001). However, different volume groups had similar in-hospital mortality rates. Unfortunately, morbidity was outside the scope of this study [12] (ELIII, RGC).

Recurrent laryngeal nerve injury in operations for thyroid cancer vs. benign thyroid disease
The risk of RLN injury in thyroid surgery for cancer vs. benign conditions in different published series is shown in Table 2. Surgery for thyroid cancer was a predictor of increased risk of RLN injury in many published large cohort series, and this increased risk was estimated to vary between 1.8-fold and 2.9-fold (p < 0.001) [5] (ELIII, RGC), [6] (ELIII, RGD), [11] (ELIII, RGD), [17] (ELIII, RGC).
Kandil et al. used the Nationwide Inpatient Sample to identify all patients who underwent total thyroidectomy between 2000 and 2009 and identified 46,261 procedures. In this study, the effects of surgeon volume and hospital characteristics on predicting patient outcomes were analyzed. In addition, univariable and multivariable analyses were used to examine the effects of the indication for surgical care on postoperative outcomes. Patients with Graves' disease had the highest rate of postoperative complications (17.5%) compared with patients undergoing total thyroidectomy for other benign (13.9%) and malignant (13.2%) thyroid disease (p < 0.001). After stratification by surgeon volume, Graves' disease was found to be a significant predictor of postoperative complications in surgeries performed by low-volume and intermediate-volume surgeons (OR 1.39; 95% CI 1.08-1.79; p = 0.01 and OR 1.34; 95% CI 1.06-1.69; p = 0.02, respectively). However, Graves' disease was not a significant predictor of postoperative complications when surgery was performed by high-volume surgeons (OR 1.07; 95% CI 0.62-1.83; p = 0.81). Hospital volume had an inconsistent and marginal protective effect on postoperative outcomes [9] (ELIII, RGC). Unfortunately, RLN injury was not stratified by surgical indication in this study. However, the prevalence of postoperative vocal fold paralysis was significantly associated with surgical volume, and both low-volume and intermediate-volume surgeons had a higher risk of vocal fold paralysis than high-volume surgeons (1.5% vs. 1.2% vs. 0.8%, respectively; p = 0.009) [9] (ELIII, RGC). Hence, the authors concluded that surgery for Graves' disease is associated with a higher risk of complications when performed by less experienced surgeons. This finding should prompt recommendations for increasing surgical specialization and referrals to high-volume surgeons in the management of Graves' disease.
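The odds ratios quoted above can be checked for internal consistency: a 95% confidence interval that excludes 1 corresponds to p < 0.05, and an approximate p-value can be back-calculated from the interval on the log scale. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the source does not state; the function name is hypothetical.

```python
import math


def p_value_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                       z_crit: float = 1.959964) -> float:
    """Approximate two-sided p-value from an odds ratio and its 95% CI.

    Assumes the CI was built as exp(log(OR) +/- z_crit * SE), so that
    SE = (log(ci_high) - log(ci_low)) / (2 * z_crit).
    """
    if not (0 < ci_low < ci_high):
        raise ValueError("confidence limits must satisfy 0 < ci_low < ci_high")
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Odds ratios for Graves' disease as a predictor of complications,
# by surgeon volume, as quoted above from Kandil et al.
p_low = p_value_from_or_ci(1.39, 1.08, 1.79)           # low-volume surgeons
p_intermediate = p_value_from_or_ci(1.34, 1.06, 1.69)  # intermediate-volume
p_high = p_value_from_or_ci(1.07, 0.62, 1.83)          # high-volume surgeons
```

Because the published limits are rounded to two decimals, the back-calculated values need not match the reported p-values exactly, but they should fall on the same side of 0.05.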
Thomusch et al. recently published a prospective multicenter European study focused on the prevalence of thyroid surgery-specific complications among patients with autoimmune thyroid disease (AITD; n = 2488) vs. multinodular goiter (n = 16,467), using logistic regression analysis to evaluate risk factors for transient and permanent RLN palsy and hypoparathyroidism. The rates of temporary and permanent vocal cord palsy ranged from 2.7 to 6.7% (p = 0.623) and from 0.0 to 1.4% (p = 0.600) among the institutions involved, respectively. In logistic regression analysis of transient and permanent vocal cord palsy, AITD was not an independent risk factor for RLN injury (data shown in Table 3). Hence, there is evidence that surgery for AITD is safe in comparison with surgery for multinodular goiter in terms of general complications and RLN palsy [19] (ELIII, RGC). However, it should be underlined that the surgeons who contributed to this study were recruited among endocrine surgeons practicing in 68 hospitals in 6 European countries, and all of them were high-volume thyroid surgeons (100 or more thyroid operations per year). Hence, the outcomes reached in this study cannot be universally extrapolated to thyroid surgery in the hands of low-volume and intermediate-volume surgeons.
Propositional summary recurrent laryngeal nerve injury and volume
1. Surgeon volume and outcome relationship exists in thyroid surgery with respect to prevalence of RLN injury.

Hypocalcaemia/hypoparathyroidism in relation to surgeon volume and experience and hospital volume
In a statewide analysis of thyroidectomy performed in Maryland between 1991 and 1996, the hypoparathyroidism rate was low, 0.3% on average [4] (ELIII, RGC). The incidence of post-thyroidectomy hypocalcaemia was not significantly different across volume groups [4]. The authors attributed this unexpected finding to the fact that calcium levels were not measured during hospitalization in all patients, and most cases of hypocalcaemia were diagnosed after discharge and were not included in the analysis. In a prospective multi-institutional study including 45 hospitals and 7266 patients operated on for benign goiter, Thomusch et al. (ELIII, RGC) found that transient and permanent hypoparathyroidism occurred in 6.4% and 1.5% of the cases, respectively. There were no significant differences among hospital groups with different operative volumes [5] (ELIII, RGC).
In the study of Kandil et al. (ELIII, RGC) of 46,261 procedures, the effects of surgeon volume and hospital characteristics on predicting patient outcomes were analyzed. High-volume surgeons (> 100 operations per year) had a lower rate of post-thyroidectomy hypocalcaemia (4.7%) than low-volume surgeons (< 30 procedures per year; hypocalcaemia rate 12.1%) and intermediate-volume surgeons (9.4%) (p < 0.01) [9].
Al-Qurayshi et al. (ELIII, RGC) performed a cross-sectional analysis of adult inpatients who underwent thyroidectomy in US community hospitals using the Nationwide Inpatient Sample for the years 2003 through 2009 and identified 77,863 individuals who served as the study group. Thyroidectomy performed by a high-volume surgeon was a predictor of diminished risk of hypocalcaemia when [...]
Overall, 6% of the patients experienced complications. The likelihood of complication decreased with increasing surgical experience up to 26 cases per year (p < 0.01), while 81% of the patients were operated on by low-volume surgeons (< 25 thyroidectomies/year) [11]. Patients operated on by low- vs. high-volume surgeons were more likely to experience endocrine-related complications (namely hypocalcaemia) (2.3 vs. 1.6%; p = 0.01) [11].
Besides surgeon volume, the CATHY study group found that surgeon's experience, length of practice, and age play a fundamental role in post-thyroidectomy complication rate [14] (ELIII, RGD). In a prospective cross-sectional multicentric study from five academic French hospitals, including 28 surgeons and 3574 thyroid procedures, 20 years or more of practice was associated with increased probability of both recurrent laryngeal nerve palsy (OR 3.06 (1.07 to 8.80); p = 0.04) and permanent hypoparathyroidism (OR 7.56 (1.79 to 31.99); p = 0.01). Surgeons' performance had a concave association with their length of experience (p = 0.036) and age (p = 0.035); surgeons aged 35-50 years had better outcomes than their younger and older colleagues [14].
Hallgrimsson et al. (ELIV, RGD) retrospectively compared a series of 128 patients with Graves' disease and 81 patients with multinodular goiter (MNG) over a 10-year period (1999-2009). Symptoms of hypocalcaemia were more common in patients with Graves' disease (OR 3.26; 95% CI 1.48-7.14; p < 0.001), but the frequency of biochemical hypocalcaemia, postoperative levels of parathyroid hormone (PTH), and treatment with calcium and vitamin D did not differ between the groups [22].
Kandil et al. (ELIII, RGC) used the Nationwide Inpatient Sample to identify all patients who underwent total thyroidectomy between 2000 and 2009 and identified 46,261 procedures. In this study, the effects of surgeon volume and hospital characteristics on predicting patient outcomes were analyzed. In addition, univariable and multivariable analyses were used to examine the effects of the indication for surgical care (benign disease vs. thyroid malignancy vs. Graves' disease) on postoperative outcomes. Patients with Graves' disease had the highest rate of postoperative complications (17.5%) compared with patients undergoing total thyroidectomy for either benign (13.9%) or malignant (13.2%) thyroid disease (p < 0.001). After stratification by surgeon volume, Graves' disease was found to be a significant predictor of postoperative complications in surgeries performed by low-volume and intermediate-volume surgeons (OR 1.39; 95% CI 1.08-1.79; p = 0.01 and OR 1.34; 95% CI 1.06-1.69; p = 0.02, respectively). However, Graves' disease was not a significant predictor of postoperative complications when surgery was performed by high-volume surgeons (OR 1.07; 95% CI 0.62-1.83; p = 0.81), although hypocalcaemia was not specifically stratified by surgical indication in this study [9].
In a systematic review and meta-analysis of predictors of post-thyroidectomy hypocalcemia, Edafe et al. (ELIII, RGB) [...] [18]. Surprisingly, Riedel's thyroiditis had the second highest prevalence of hypocalcemia and no hypoparathyroidism. However, the incidence of Riedel's thyroiditis is so low that no conclusion can be drawn from this finding. In logistic regression analysis, surgery for Graves' disease and Hashimoto's thyroiditis were found to be independent risk factors for transient hypocalcemia (hazard ratio 2.76 and 8.19, respectively; p < 0.001). Graves' disease was found to be an independent risk factor for definitive hypoparathyroidism (hazard ratio 1.57; p < 0.001) [19].
Nouraei et al. (ELIII, RGC) recently analyzed national trends, outcomes, and volume-outcome relationships in thyroid surgery in a large cohort of 72,594 patients who underwent elective thyroidectomy in the UK between 2004 and 2012, with 51% hemithyroidectomy and 32% total thyroidectomy for benign (52.5%), benign inflammatory (21%), and malignant (17%) thyroid diseases. In this series, [...]
[...] RGD) recently analyzed the rate of hypocalcaemia after total bilateral thyroidectomy in the BAETS registry, comprising over 90,000 endocrine procedures from July 2010 to June 2015. It seems apparent that following surgery for Graves' disease, there is a higher early requirement for calcium supplementation than after a similar extent of thyroid resection for benign disease. By 6 months, however, much of this effect is lost. In contrast, surgery for cancer is significantly more likely to lead to hypocalcaemia, both early and at 6 months, and much of this effect seems to be related to any concomitant central nodal dissection [26].
In a nationwide study based on the data obtained from the South Korea Health Insurance Review and Assessment Review database, data of 192,333 patients who underwent thyroidectomy over a 6-year period were reviewed, 52,707 who underwent thyroidectomy alone and 139,626 who underwent total thyroidectomy plus central neck dissection [27] (ELIII, RGC). The incidence of permanent hypocalcemia was greater in the group of patients who underwent concomitant central neck dissection (5.4% vs. 4.6%, p < 0.001). Central neck dissection did not raise the rates of permanent hypocalcaemia in institutes with a low volume of annual cases (< 200), whereas permanent hypocalcaemia was more common in the total thyroidectomy plus central node dissection group than in the total thyroidectomy alone group (3.5% vs 2.9%; p = 0.002) in institutes with a large volume of annual cases (≥ 800) [27]. However, overall, complication rate in high-volume centers was significantly lower with respect to low-volume centers, even in the case of the association of central neck dissection [27].
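The overall comparison reported above (5.4% vs. 4.6% permanent hypocalcemia) can be checked roughly with an unadjusted two-proportion z-test. This is a sketch under stated assumptions: event counts are reconstructed from the rounded percentages and the two group sizes, and the statistical test actually used in the cited study is not given here.

```python
import math
from typing import Tuple


def two_proportion_z_test(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Unadjusted two-sided z-test for the difference of two proportions.

    p1, p2 are observed proportions (0-1) and n1, n2 the group sizes.
    Returns (z, p_value) using the pooled standard error.
    """
    events = p1 * n1 + p2 * n2
    pooled = events / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Total thyroidectomy plus central neck dissection vs. thyroidectomy alone,
# using the group sizes and rates quoted above.
z, p = two_proportion_z_test(0.054, 139_626, 0.046, 52_707)
```

With groups this large, even a difference of less than one percentage point is statistically significant, which is consistent with the reported p < 0.001. Statistical significance alone does not establish that the difference is clinically meaningful.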
Propositional summary postoperative hypoparathyroidism and volume
1. Surgeon volume and outcome relationship exists in thyroid surgery with respect to prevalence of hypocalcemia/hypoparathyroidism. A cut-off value of 50 thyroidectomies for a single surgeon per year appears reasonable for identifying a high-volume surgeon.
2. Hospital volume and outcome relationship is less clear than surgeon volume in thyroid surgery with respect to the prevalence of hypocalcemia/hypoparathyroidism. However, a cut-off value of 100 thyroidectomies for a single center per year appears reasonable for identifying an adequate high-volume unit.
3. Surgery for thyroid cancer is a predictor of increased risk of hypocalcemia/hypoparathyroidism in the pooled cohort of surgical volume, but the rate is lower for high-volume surgeons.
4. Surgery for AITD is a predictor of increased risk of hypocalcemia/hypoparathyroidism in the pooled cohort of hospital volume, but the rate is lower for high-volume units.
5. Thyroid surgery for thyroid malignancy performed by low-volume surgeons is associated with an increased risk of recurrence. Reoperative surgery is associated with an increased risk of hypocalcemia/hypoparathyroidism.
6. Current evidence suggests that total thyroidectomy for thyroid cancer or AITD should be undertaken by high-volume surgeons who have lower rates of hypocalcemia/hypoparathyroidism than low-volume surgeons.

Postoperative hemorrhage
Postoperative hemorrhage represents the one emergency event in thyroid surgery. It is potentially life-threatening, can lead to irrevocable hypoxic brain damage, and may further result in recurrent nerve palsy, hypoparathyroidism, or wound infection. Despite identification of risk factors, from the surgeon's perspective, the event of bleeding remains largely unpredictable. Moreover, continuous efforts to refine the technique of thyroid dissection, advances in technology, and the application of coagulation-assisted devices as well as local hemostatic agents have failed to improve rates of postoperative hemorrhage. Only indirectly can a minor improvement be assumed, considering that the extent of resection has increased considerably in the past decades while rates of bleeding have remained stable. Logistic structures of postoperative care are of paramount importance for the outcome in the event of hemorrhage following thyroid surgery. Thus, because some risks cannot be influenced by the surgeon, the bleeding rate, while representing a quality indicator, must also be assessed in terms of the outcome quality of the bleeding event itself. The incidence of postoperative hemorrhage over all types of thyroid disease and surgery spans broadly from 0.3 to 3.5%. Moreover, the boundaries for defining surgeon and hospital volume categories are mixed. Low volume is defined variously as 1 to 9, < 50, or < 100 procedures per annum, or by overall experience, while high volume may be defined as anywhere from 20 to > 300 thyroid surgeries per annum, or by overall experience. Thus, comparison of already heterogeneous studies is compromised. Publications specifically addressing postoperative hemorrhage and outcome volume are scarce, and in some instances are included in studies that look at overall outcome-volume aspects. These are predominantly of retrospective design and generally exhibit low evidence level III and recommendation grades C-D.
Although all studies refrain from stating a minimum number of procedures, Promberger et al. describe a specifically surgeon-dependent 7-fold increase in bleeding rate in an exclusively high-volume surgeon group with > 300 thyroid procedures in total [41]. The majority of studies report a significantly to clearly improved outcome (p < 0.0001 to p < 0.58) in favor of the high-volume category as respectively defined, or an inverse correlation of complications with surgeon volume, while only Adam et al. [11] identify improved outcome at ≥ 25 procedures/year/surgeon (Table 6).

Postoperative hemorrhage in thyroid surgery and hospital volume
A smaller number of studies describe the correlation of bleeding rates with hospital volume, directly or indirectly in association with overall complications. Hospital volume categories range from 1-99, < 25, and < 50 thyroid procedures/year to > 76, > 200, and > 300 thyroid procedures/year. Bleeding rates in most of these studies range from 1.4 to 4.7% [4,7,9,12,19,34,43,48,49] (ELIII, RGD) [50] (ELII, RGC) (Table 6). Godballe et al. [45] describe a wide range of bleeding rates between centers, from 1.9 to 14.3%, with a median rate of 4.2%. Few of the available studies clearly identified hospital volume to be positively associated with postoperative hemorrhage, while overall complications differ between the respective hospital categories. Hauch et al. [39] show a significantly improved postoperative hemorrhage rate for high-volume centers of 1.24% vs. 1.54% (p = 0.0027). Weiss et al. [45] (ELIII, RGD) show an improved bleeding rate for high-volume hospitals (OR 0.71) at an overall bleeding rate of 1.25%, while patients who incur postoperative hemorrhage have higher mortality than patients without (1.34% vs. 0.32%; p < 0.001). In some studies, postoperative hemorrhage is included in the outcome analyses. High-volume centers performing same-day discharge for thyroid procedures are likely to bias results regarding postoperative hemorrhage due to preselection of patients. Studies that demonstrate improved outcomes for high-volume vs. low-volume centers are Sosa et al. [37] (7.7% vs. 11.1%; p = 0.05), Wang et al. [29] (5.6% vs. 8.7%), and Loyo et al. [6] (OR 0.72; p = 0.031); Mitchell et al. [15] state that hospital volume is inversely correlated with perioperative complications (bleeding rate 0.597%). Pieracci et al. [51] describe a linear association of increasing volume with decreasing overall complications (bleeding p = 0.01). Other studies show inconsistent or marginal effects of hospital volume on outcome [12,19,38,40,45,47,48,50] (see Table 6).
HT hemithyroidectomy, LOS length of hospital stay, TT total and near-total thyroidectomy, proc. procedure thyroid/±parathyroid resection, NA not assessed or not provided
*Overall postoperative complications without specifically detailing hemorrhage
§ Meta-analysis data additionally provided

Propositional summary postoperative hemorrhage and volume
1. Evidence on the correlation of surgeon volume with bleeding rate is conflicting, although high-volume surgeons generally perform better. No evidence exists for a minimal number of procedures. A bleeding rate < 2% at 50 annual procedures performed appears reasonable. Other studies showed inconsistent or marginal effects of hospital volume on outcome [5, 18, 37-39, 43, 47, 48] (see Table 6).
2. The relationship between hospital volume and bleeding rate is less clear than that for surgeon volume in thyroid surgery. However, a cut-off value of 100 thyroidectomies per year for a single center appears reasonable for identifying an adequate high-volume unit.
3. Outcome relationship for the event of bleeding in regard to the specific complications thereof and with surgeon/hospital volume is severely underreported.
4. Outcome quality of bleeding is expected to correlate more with hospital volume than with surgeon volume by structural logistic impact.
5. Patients' risk factors for bleeding are mainly uninfluenced by surgeon and hospital volume.

Volume and wound complications
Few authors have specifically investigated the influence of volume on the prevalence of wound infection, hematoma, and seroma (Table 8). Neck hematoma has been found to be inversely related to surgeon volume. However, most of the authors agree that wound seroma is not influenced by the number of thyroid procedures performed. It is not clear whether the prevalence of wound infection is related to surgeon volume.
The prevalence of wound complications after thyroid surgery is low, ranging from 0.3 to 1.7%. In an Italian retrospective multicenter review of 14,000 patients, Rosato et al. [52] (ELIII, RGD) found that wound infection occurred after 0.3% of all operations, accounting for 2.0% of all complications. Half of the surgeons applied antibiotic prophylaxis and 17% antibiotic therapy; 33% did not apply any prophylaxis or therapy. Despite these differences, the incidence of infections did not differ between these groups.
In a Swedish multicenter study of 1157 patients operated on for Graves' disease, Hallgrimson et al. [22] found a 1% rate of wound infection, which was a risk factor for vitamin D treatment at discharge.
A cross-sectional analysis performed by Hauch et al. [39] using the American Nationwide Inpatient Sample (NIS) database identified 62,722 thyroid procedures for benign disease (57.9% were total thyroidectomies). Although low-volume surgeons were more likely to have overall postoperative complications, wound complication (not specified) was not different between low- (< 10 thyroid surgeries performed), intermediate-, and high-volume (> 99 thyroid procedures) surgeons. Neck hematoma was inversely related to surgeon volume: low volume (1.72%), intermediate (1.15%), and high (0.55%); p < 0.0001. No differences were seen between groups regarding neck seroma.
Using the same American database, Adam et al. [11] found a higher prevalence of non-specified wound complications for low-volume (1.1%) vs. high-volume (0.7%) surgeons, but without reaching statistical significance (p = 0.05). Bleeding, however, was more frequent if the patient was operated on by a low-volume surgeon than by a more experienced one (1.6% vs. 1%; p = 0.006). Similarly, in a German prospective multicenter study assessing complications after thyroid surgery for benign goiter, Thomusch et al. [5] did not find differences in wound infection and wound dehiscence between high- (> 150), intermediate-, and low-volume (< 50) surgeons.
On the other hand, other authors did find an inverse relationship between surgeon volume and wound infection: Meltzer et al. [33] assessed complications including surgical site infection, deep incisional surgical site infection, wound disruption, neck swelling, and seroma. After propensity score matching, a lower rate of surgical site infection was found for high-volume surgeons (0.3%) than for low-volume surgeons (1%). Likewise, Kandil et al. [9] found a significantly higher prevalence of wound infection and neck hematoma for low-volume surgeons than for intermediate- and high-volume surgeons, but no substantial differences regarding neck seroma.
Propositional summary postoperative wound complications and volume
1. The prevalence of postoperative neck hematoma has been found to be inversely correlated with surgeon volume. A cut-

Surgeon volume, extension of surgery, recurrence, and mortality in papillary thyroid cancer
Surgery is the most important treatment modality for papillary thyroid cancer, and the relationship between type of operation and clinical outcomes has been studied extensively. Surgeon-related factors (annual operation volume, years of experience) also play an important role in short- and long-term clinical outcomes of papillary thyroid cancer (PTC) (Table 9). There is a compelling amount of data about the inverse relationship between surgeon volume and postoperative complications after thyroid surgery. Most studies, however, focus on short-term complications rather than the more important long-term oncological outcomes such as structural recurrence, subsequent distant metastasis, and cancer-specific mortality of PTC. As a result, there is less research on these long-term outcomes, and most of the published data cover only patient populations in the USA and Europe [12].
On the other hand, consistent and robust studies have found associations between surgeon factors and cancer-specific mortality or tumor recurrence for many other cancer types. The relationship between surgeon-related factors and long-term oncological outcomes in PTC has received less attention. This issue is critically important, given that adequately performed thyroidectomy is the cornerstone of long-term disease-free survival and that patients with advanced PTC have a high likelihood of poor prognosis.

Volume, complexity, and cancer
Several works by Sosa et al. [4,34,38], a pioneer in thyroidectomy outcome research, emphasize that surgeon volume is much more important than hospital volume. Interestingly, in a cross-sectional analysis of 21,270 thyroid surgical procedures performed in Maryland (USA), thyroid cancer surgery was less likely to be performed by high-volume surgeons despite an increase in surgical cases. Further investigation is needed to identify factors contributing to this trend [4,6,29].

Completeness of resection
Completeness of resection after total thyroidectomy for papillary thyroid cancer is a crucial factor for biochemical and structural recurrence and, in some studies, even survival. In a pioneering study examining margin status after total thyroidectomy for PTC in a nationwide database (National Cancer Database of the USA), 31,129 patients who had undergone total thyroidectomy for cancer were analyzed, focusing on the relationship between completeness of resection and recurrence and survival. After multivariable adjustment, both microscopically and macroscopically positive margins were associated with compromised survival. Of particular interest is that receipt of surgery at a high-volume facility (OR 0.72; p = 0.01) emerged as protective against cancer-related death [56] (ELIII, RGD).
At least two studies in the US health system identified several vulnerable patient populations with a higher risk of incomplete resection after thyroidectomy for PTC: low-income patients treated at low-volume and community hospitals are less likely to undergo total thyroidectomy. These differences in practice patterns likely reflect disparities in access to health care, medication, and comprehensive cancer centers [56,60] (ELIII, RGD) (ELIII, RGC).
Radioiodine (RAI) remnant uptake has been used as a surrogate parameter to assess the completeness of a total thyroidectomy. Remnant uptake is a useful postoperative oncologic quality indicator that can predict a patient's risk of disease recurrence and indicate the completeness of resection [59] (ELIII, RGC). Oltmann et al. [57] (ELIII, RGD) observed that completion thyroidectomy carried out by high-volume surgeons (≥ 30 procedures/year) was associated with much lower remnant uptake at first RAI treatment (0.06 vs. 0.22% for low-volume surgeons; p = 0.04). Yap et al. [58] (ELIII, RGD) reported a significant relationship between the number of thyroidectomies and total body uptake of 131I in pre-ablative scans. They used RAI uptake to demonstrate incompleteness of tumor resection and concluded that low surgeon volume was associated with incomplete tumor resection, although there was no analysis of tumor recurrence [58].
A single study combined several indicators, namely the initial extent of the operation together with three markers of complete resection (uptake on 123I prescan, thyrotropin-stimulated thyroglobulin levels, and 131I dose administered), to assess the adequacy of thyroidectomy for differentiated thyroid cancer and relate them to surgeon volume. The authors conclude that surgeons who perform ≥ 30 thyroidectomies/year are more likely to undertake the appropriate initial operation and achieve a more complete initial resection for differentiated thyroid cancer patients. In this study, surgeon volume appeared to be an essential consideration in optimizing outcomes for differentiated thyroid cancer patients, and even higher thresholds (≥ 50 thyroidectomies/year) may be necessary for patients with advanced disease [44].

Recurrence
In a recent retrospective study, Kim et al. [54] (ELIII, RGD) analyzed 1103 patients operated on for papillary thyroid cancer with N1b extension. They compared the results of low-volume surgeons (< 100 procedures/year) with those of high-volume surgeons and stratified them further into more and less experienced subgroups. After 150 months of follow-up, the recurrence-free survival achieved by the high-volume and more experienced surgeons was significantly superior to that of the remaining groups of surgeons with high or low volume and less or more experience. Treatment by a high-volume surgeon was associated with roughly half the structural recurrence rate (14.7% vs. 27%) in patients with N1b PTC. This association was maintained after adjustment for age, sex, and conventional risk factors for recurrence of thyroid cancer. High-volume surgeons had significantly lower rates of resection margin positivity and of stimulated thyroglobulin (Tg) levels above 10 ng/ml at the time of first radioiodine treatment, both of which reflect the completeness of surgical resection. Neither distant metastasis nor cancer-specific mortality was modified by surgeon volume. The study concluded that by far the best combination was provided by experienced surgeons with high volume. This imaginative study also emphasizes that low volume seems to cancel the potential advantages of more experience and vice versa [54].
Several other studies have shown the importance of surgical volume in reducing rates of recurrence and complications for thyroid cancer patients undergoing total thyroidectomy [36,60]. Surgeons who perform > 20 thyroid surgeries a year have lower permanent complication rates and leave less remnant thyroid tissue behind to treat with RAI [59]. The analysis of data from the National Cancer Database showed that having total thyroidectomy performed at an institution that performed fewer than 12 thyroidectomies a year was associated with compromised survival [56].
Also, several studies have shown that patients operated on by specialized endocrine surgeons with > 10 years of experience and/or at institutions with a focus on thyroid cancer have, unsurprisingly, less remnant thyroid tissue to ablate.

Reoperations
Mitchell et al. [15] analyzed the relationship between hospital volume and avoidable endocrine operations and demonstrated that operations for thyroid cancer led to avoidable reoperations more frequently if performed at low-volume centers, mainly caused by a mix of "judgement errors" and "technical errors". In this retrospective analysis of reoperated patients, it appears obvious that the avoidable operations are a result not only of technical errors, easily attributable to limited exposure and practice, but also of "judgement errors" stemming from a lack of surgical knowledge in low-volume centers [15].
Propositional summary extension of surgery, recurrence, and mortality in papillary thyroid cancer and volume
1. Low-volume surgeons perform less complete operations for thyroid cancer.
2. Patients with PTC operated on by low-volume surgeons or at low-volume centers suffer more recurrences and have to be reoperated on more often.
3. No direct relationship between volume and survival of patients with PTC/DTC has been clearly demonstrated.
4. A surgeon performing fewer than 25 total thyroidectomies per year can be considered a low-volume thyroid surgeon.

Discussion
The working group on volume and outcome in thyroid surgery aimed to evaluate the impact of surgeon and hospital volumes on the outcome of thyroid surgery, based on a review of the existing literature. The analysis of the literature confirmed that surgeon volume and experience play a significant role in the rate of post-thyroidectomy complications. The role of hospital volume, especially when adjusted for surgeon volume, is more limited or inconsistent [5,9,36]. This underlines once more, if still necessary, the need for appropriate training and subspecialization of surgeons dealing with thyroid surgery. Surgical skill and experience are particularly necessary when dealing with difficult cases, such as autoimmune thyroid disorders (including Graves' disease and Hashimoto's thyroiditis) and thyroid cancer necessitating extended resection and neck lymph node dissection. Indeed, it has been reported that high-volume surgeons are able to reduce the increased disease-related risk of post-thyroidectomy complications [9].
Moreover, it has also been demonstrated that high-volume surgeons and high-volume hospitals are associated with a significantly shorter hospital stay and lower costs [12]. One paper showed that more extended and radical thyroidectomies (i.e., total thyroidectomy plus central neck dissection) are associated with an increased rate of complications, namely prolonged hypocalcemia, in high-volume but not in low-volume centers in South Korea. Overall, however, the complication rate in high-volume centers was significantly lower than in low-volume centers, even when central neck dissection was added [27].
Several risk factors have been described for post-thyroidectomy hypocalcemia, related to the disease (including autoimmune thyroid disease, retrosternal goiter,