Neuromonitoring of the laryngeal nerves in thyroid surgery: a critical appraisal of the literature
One of the most significant complications of thyroid surgery is injury of the recurrent laryngeal nerve. Injury of the external branch of the superior laryngeal nerve is a less obvious but occasionally significant problem. Recently, neuromonitoring during thyroidectomy has received considerable attention because of literature encouraging its use, but there is no consensus about its advantages and utility. A critical assessment of the literature on neuromonitoring was conducted in order to define its effectiveness, safety, cost-effectiveness and medical-legal impact. Available data do not show results superior to those obtained by traditional anatomical methods of nerve identification during thyroid surgery. Data about cost-effectiveness are scarce. The literature shows inconsistencies in methodology, patient selection and randomization across published studies, which may confound the conclusions of individual investigations. The current recommendation for use in “high risk” patients should be assessed because heterogeneity of definitions makes identification of these patients difficult. As routine use of neuromonitoring varies according to geography, its use should not be considered to be the standard of care.
Keywords: Thyroid gland; Laryngeal nerve injuries; Neuromuscular monitoring; Thyroidectomy; Tracheostomy
Operations on the thyroid gland are the most frequently performed endocrine procedures worldwide. Improvements in technique have decreased the risk of injury to adjacent structures to minimal levels [1, 2, 3]. One of the most significant complications of thyroid surgery is injury of the recurrent laryngeal nerve (RLN). The external branch of the superior laryngeal nerve (EBSLN) is also at risk. Most injuries are unilateral and temporary and improve within a few weeks after surgery. Nonetheless, some are permanent and, if uncompensated, may produce incapacitating dysphonia. Bilateral RLN injury, fortunately infrequent, produces airway obstruction. Tracheotomy and corrective surgery to widen the glottis may be required. Permanent RLN injury may have devastating consequences for the patient’s health and quality of life. The risk of RLN injury is approximately 1 %, but it can increase to 12 % in reoperative cases [1, 2, 3].
In order to avoid these injuries, surgeons have designed many strategies, including anatomical and electrophysiological tools for neuromonitoring. Anatomical exposure of the whole RLN significantly reduces the global rate of postoperative palsy. Hermann et al. studied the effect of nerve exposure on the incidence of RLN paralysis in 19,443 primary surgeries for benign thyroid diseases. The extent of nerve dissection varied: the rate of permanent RLN paralysis averaged 0.9, 0.3, and 0.1 % for surgeons who only localized, partially exposed, and completely dissected the RLN, respectively. On the other hand, neuromonitoring techniques have received a lot of attention because of recent literature encouraging their use. However, information about neuromonitoring has been contradictory, and there is no consensus about its advantages and utility. The aim of this review was to critically assess the literature about the use of neuromonitoring, focusing on specific questions: How effective is it? How safe is it? What are the indications? Is it cost-effective? And what is the medical-legal impact of its use?
The concept of neuromonitoring in thyroid surgery
The concept of nerve monitoring in thyroid surgery has a long history. Monitoring is defined by the Cambridge dictionary as “watching and checking a situation carefully for a period of time to discover something about it” and can be applied to maneuvers designed to avoid an injury. It is possible to include in this concept the time-honored recommendation of searching for and identifying the RLN before commencing the capsular dissection of the gland. This concept was developed and published by Lahey and Hoover in 1938 and confirmed in the 1970s by Mountain et al. and Riddell. An extension of the monitoring concept is the concept of verification, defined as “proving that something is true or correct”, which involves evaluation of nerve function before, during and after thyroid resection. Functional nerve verification (FNV) was initially accomplished by palpation of laryngeal muscle contraction during stimulation of the nerve and evolved to the currently employed techniques of electrode neurostimulation with electromyographic (EMG) and evoked potential monitoring. It is important to clarify and assess the efficacy of any procedure employed for verification, for example, whether accurate nerve monitoring has produced a decrease in the risk of nerve injury. The accuracy of any method of verification should be carefully evaluated before recommending it for routine use. According to Timon and Rafferty, a good verification method should be harmless, non-invasive, reliable, and able to accurately distinguish the nerve from surrounding tissue.
Methods of functional nerve verification
There are various methods and devices for FNV. Many older methods were based on pressure changes on an inflated balloon located at vocal fold level when the nerve was stimulated [11, 12], but were abandoned because of lack of reliability. Visualization methods have also been described, using an endoscope for direct observation of vocal fold movement before and after resection . Later, the methods of electrical stimulation came into use. These methods were based on the observation or palpation of contraction of the effector muscles after stimulating the motor nerve with electricity. In 1985, James et al.  proposed a method of palpation with stimulation. Echeverri and Flexon  and Randolph et al.  reported their experience with 70 and 449 patients using this method. Recently, EMG methods came into use. These methods can be divided in two groups: those that use needle electrodes directly inserted in the effector muscle and those that register potentials with surface electrodes located over the effector muscle. Both used a mechanism of recording of the depolarization wave and a sound that confirms the integrity of the electrical circuit. The first ones are not different than current methods of EMG used in other body parts. The electrodes are inserted directly into the vocal folds in the preoperative period with an endoscope or through the cricothyroid membrane at the beginning of the procedure . Flisberg and Lindholm  and Spahn et al.  reported the first clinical experiences using an intralaryngeal electrode and observing the movement of the needle after nerve stimulation. More recently, some authors published their experience with this device [20, 21, 22] but with the addition of electrophysiological variables and evoked potentials. Other authors [23, 24, 25] employed an electrode placed directly on the vocal folds by endoscopy. The latter is a modification of the needle method, with indirect measurement of the contraction of the effector muscle. 
A method of attaching a skin electrode to the tracheal tube [25, 26, 27] or over the posterior cricoarytenoid muscles independent of the tracheal tube has been described . Barwell et al.  introduced the use of a nerve integrity monitor (NIM)® device and Timon and Rafferty  assessed 21 consecutive patients using the Neurosign 100® device. Both devices were compared and differences in sensitivity or specificity for RLN palsy rates were not found, but Neurosign® had a lower cost . Dackiw et al.  proposed a modification to the stimulating probe attaching the cords directly to a hemostat to allow continuous verification during dissection. The most recent modification is continuous vagal verification, which puts the electrodes on the vagus nerve. The expected advantages of this method include continuous recording of the nerve signal that allows more specific control of the manipulation during all the steps of the surgery, indicating when the nerve is mobilized or touched, to avoid dangerous maneuvers [32, 33, 34].
Although there have been a lot of improvements in technical details, it is unclear which of these methods offer a greater advantage. Tschopp and Gottardo  made a comparison between intralaryngeal surface and intra and extralaryngeal needle electrodes and found that the surface electrodes had a lower reliability due to movement of the tracheal tube and displacement of the electrodes. Cavicchi et al.  compared palpation with EMG techniques in a randomized controlled trial (RCT) with 250 patients. They could not find any difference in temporary or definitive RLN injuries between both techniques. Sensitivity and specificity were 66 and 93 % for EMG versus 33 and 97 % for palpation. Friedrich et al.  and Koulouris et al.  compared intermittent and continuous monitoring in non-randomized trials and found no differences in RLN palsy between groups. There were no adverse effects on sympathetic and parasympathetic activity related to the continuous stimulation of the vagus nerve. More data are necessary about this new technique before making any recommendation. In conclusion, the most reliable method is the EMG with direct electrodes, but it is not popular. At present the most used is the surface method using a tracheal tube with embedded electrodes.
Authors who favor FNV claim that its use allows faster and easier identification of the nerve and fewer nerve injuries with consequent lower palsy and tracheostomy rates. As it is accepted that the best method to assess effectiveness of an intervention in health is an RCT, we conducted a literature search in the PubMed/Embase databases with the terms thyroid, thyroid*, nerve, laryngeal, monitor, and neuromonitoring to find actual evidence about this subject. We found 389 references with only six RCTs [36, 40, 41, 42, 43, 44] and one systematic review. Most studies regarding this subject were case series with small sample sizes or observational studies from single centers or registries. Three RCTs [40, 43, 44] assessed the effect of FNV on EBSLN injuries, while the others assessed the RLN. One of the trials compared two techniques of neuromonitoring without a control arm.
Superior laryngeal nerve injury
Few observational studies have exclusively evaluated the use of FNV for EBSLN. Jonas and Bahr assessed a device with an electrode inserted in the cricothyroid muscle and identified 97 % of RLN and 37 % of EBSLN without any nerve injury. Most studies combine the evaluation of EBSLN and RLN. Three RCTs assessed FNV in EBSLN. Lifante et al. assessed EBSLN activity with electrodes located directly on the cricothyroid muscles and compared visual identification plus electrical guidance in the intervention group with visual identification alone in the control group. The final outcome was assessed with the voice handicap index-10 and the reflux symptom index. They studied 47 patients (22 in the intervention group vs. 25 in the control group) and an analysis was made for nerves at risk (NAR). In the intervention group 65 % of nerves were clinically identified vs. 21 % in the control group, and there was more vocal impairment at 3 months in the control group, but there are no data about the rate of EBSLN palsy. An RLN injury that should be recorded as a failure was not included in the analysis, and there are no data indicating to which group this patient belonged. The study did not report the methods of randomization, concealment, and blinding. There were also differences between groups, although not statistically significant, in the number of total thyroidectomies (59 % in the intervention group vs. 32 % in the control group), as well as in surgical times and thyroid weights, that suggest weaknesses in the randomization procedure. An intention-to-treat analysis was not made, as all patients with laryngeal nerve palsy were excluded from the analysis, and the investigators performed neither a postoperative stroboscopy (a more objective assessment of EBSLN injury than the methods employed) nor a comparison between groups that would demonstrate the real difference made by the use of FNV.
Barczyński et al.  assessed EBSLN activity using a tracheal tube with integrated surface electrodes. In the intervention group the device helped identify the nerve at the upper pole, but in all cases the surgeon conducted a meticulous dissection of the vessels. Outcome was determined by the rate of EBSLN identification, but the investigators also made vocal assessment with stroboscopy and the “grade, roughness, breathiness, asthenia, strain” (GRBAS) scale. 210 patients (105 patients in each group) were included and an analysis was made for NAR. Investigators found a larger rate of nerve identification in the intervention group (83 vs. 34 %). They also found 14 % of 2A and 2B types of EBSLN according to the Cernea classification . The rate of temporary palsy was lower in the monitoring group (5 vs. 1 %) as was the percentage of abnormal stroboscopic findings at 3 weeks, but these differences disappeared after 3 months. There was no difference in permanent palsy rate. The authors noted that in all patients stimulation of the nerve was perceived in the cricothyroid muscles, but this activity was only confirmed by EMG findings in 73 % of cases. They also described three patients whose nerves were identified, but with loss of signal after mobilizing the gland, even under direct visualization and careful protection of the nerve during ligation. They did not make an “intention to treat” analysis, but the number of patients lost from the study is too small to invalidate the final results.
Khaled et al. assessed the EBSLN, but did not describe the method of monitoring. In the intervention group the device helped identify the nerve at the upper pole, but in all cases the surgeon made a meticulous dissection of the vessels. The outcome was defined as the rate of EBSLN identification and injury measured by laryngoscopy and subjective interviews. 42 patients were included in the study (21 patients in each group). Only one patient in the control group incurred injury of the EBSLN. The study was randomized, but the lack of details about concealment and blinding, the small sample size, and the limited information in the article preclude firmer conclusions.
Inabnet et al. published their series of patients randomized to thyroidectomy under locoregional anesthesia with or without nerve monitoring. In the nerve monitoring group electrodes were inserted into both cricothyroid muscles. Superior laryngeal nerves were identified and stimulated. Using the voice handicap index-10, a subjective advantage was demonstrated in the monitored group. In conclusion, available RCTs have significant weaknesses and small sample sizes that make conclusions prone to bias. FNV methods do not appear to avoid temporary or permanent EBSLN injuries, though they increase the rate of nerve identification.
Recurrent laryngeal nerve injury
Non-comparative case series
There are many case series studies with 19–499 patients that report rates of temporary (0.6–6 %) and permanent palsy (0–1 %) that are similar to those reported from most specialized centers [16, 26, 49, 50, 51, 52, 53, 54]. In one study, IRM was not only used to confirm the RLN, but also to locate and identify 23 % of the nerves before surgical exposure. Some other relevant data include a report of patients who underwent minimally invasive surgery, with a 6 % rate of lack of nerve identification or failure of the device and a 26 % rate of difficulty in identifying the nerve even after using the stimulating probe. According to the study by Chiang et al. on 113 patients, although the rates of RLN palsy are not decreased, the use of neuromonitoring provided valuable information by ascertaining where and how the RLN has been injured (transection, inadvertent clamping, overstretching at the region of Berry’s ligament, etc.). However, the design of these studies makes it difficult to draw conclusions about effectiveness and only represents local experiences with FNV. Some weaknesses are that definitions of difficulty were subjective, there were no baseline rates for comparison, and there was no measurement of other variables that would help to determine whether the use of FNV makes the RLN easier to find and decreases operative time and injury frequency.
Non-randomized comparative trials
We also found many studies comparing cases that employed FNV with cases that did not. Witt  reported on 136 patients with 190 NAR and used the monitor only at the end of the procedure to determine functional integrity of the nerve. Rates of temporary and permanent RLN injury of 3.7 and 1.7 %, respectively, were reported. However, there was no statistically significant difference between patients with or without FNV. Two cases of permanent palsy had adequate EMG response at the end of the procedure. Snyder and Hendricks  reported on 103 patients and 185 NAR, without differences in the rate of temporary or permanent RLN injury. Chan et al.  included 639 patients with 1,000 NAR (316 patients in the intervention group vs. 323 in the control group). They found a higher risk of RLN injury in patients classified as high risk, but there was no significant difference in postoperative transient and permanent paralysis rates between the neuromonitored and control patients in the overall group. In subgroup analysis, however, the postoperative RLN palsy rate was higher during reoperative thyroidectomy (19 vs 4.6 %; P = 0.019) in the control group than in the neuromonitoring group (7.8 vs 3.8 %; P > 0.05). Terris et al.  reported on a cohort of 137 patients with 176 NAR who underwent minimal invasive thyroidectomy, without nerve stimulation but with FNV used as an alert. There were no differences in the rate of temporary palsy between the monitored (4 %) and the control group (6 %). Atallah et al.  studied 261 high-risk patients with 421 NAR (112 in the intervention group vs. 149 patients in the control group) in a before and after design. There were no differences in temporary (5 vs. 5.4 %) and permanent palsy rates (3.9 vs. 3.8 %). Chiang et al.  evaluated 289 patients with 435 NAR to study the effect of standardization of the technique in a before and after design. They demonstrated a statistically significant difference in the rate of permanent palsy from 6.4 to 0.8 %. 
As a byproduct of their study, the authors determined that the method of dissection of the RLN posed a greater risk of nerve injury irrespective of whether monitoring was employed. After the initial period of the study they changed their surgical strategy from a dissection directed to Berry’s ligament to a dissection starting at the level of the inferior thyroid artery, with improvement in the rate of nerve injury. Barczyński et al. assessed the use of FNV in 302 patients who underwent thyroidectomy with central neck dissection in a before and after design. They found a statistically significant difference in the rate of temporary palsy (1.3 vs. 3.3 %) but not in the rate of permanent injury. Chiang et al. included 506 NAR and assessed the effect on patients with extensive dissection of the RLN (101 in the monitored group vs. 405 in the control group). There were no differences between groups regarding temporary (2 vs. 2.2 %) or permanent injury (0 vs. 0.2 %). Duclos et al. reported on 686 patients in a comparative trial with expert endocrine surgeons based on availability of the device. They found no difference in the rate of RLN temporary palsy (7.6 % for intervention vs. 4.7 % for control). Parmeggiani et al. studied 440 patients (120 in the intervention group vs. 320 in the control group) and found no differences in temporary (1.4 vs. 2.8 %) or definitive (0 vs. 1.4 %) injury. Thomusch et al. reviewed 4,382 patients who underwent thyroidectomy for goiter recorded in a German registry of data from 45 hospitals. The rates of transient and permanent RLN palsy based on NAR were, respectively, 1.3 and 0.3 % with intraoperative neuromonitoring. These rates were significantly lower than those for intraoperative visual RLN identification without intraoperative neuromonitoring, which resulted in rates of 2.1 and 0.8 %, respectively.
A multivariate logistic regression analysis confirmed that the use of intraoperative neuromonitoring decreases the rate of postoperative transient and permanent RLN palsies as an independent factor. The authors concluded that intraoperative neuromonitoring of the RLN in thyroid surgery is recommended because of significantly lower rates of transient and permanent RLN palsy rates in comparison with conventional RLN identification. Dralle et al.  in an extended analysis of the German registry with 16,448 patients with 29,998 NAR found rates of permanent RLN injury of 3.6 and 5 % in reinterventions for benign goiter and malignancy, respectively. An unexpected finding from this study was a higher risk of RLN injury in central neck dissection (2.61 95 % CI 1.34–2.89) and recurrent goiter surgery 4.45 (2.43–6.48) among the FNV group. The authors felt that nerve monitoring is a promising method for nerve protection, particularly in cases undergoing extended thyroid resection procedures, although they did not find statistically significant differences between groups with or without FNV with respect to RLN injury. They attributed the lack of significance to the very low incidence of RLN paralysis in the cohort.
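The argument that a very low incidence of RLN paralysis limits statistical significance can be illustrated with a rough sample-size estimate. The sketch below is not taken from any of the cited studies; it assumes a standard normal-approximation formula for comparing two independent proportions, a two-sided alpha of 0.05 and 80 % power, and it treats nerves at risk as independent observations. The function name `nerves_per_group` and the choice of input rates (the 0.8 and 0.3 % permanent palsy rates quoted for the German registry) are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def nerves_per_group(p_control, p_monitored, alpha=0.05, power=0.80):
    """Approximate nerves at risk needed per group to detect a difference
    between two palsy rates (normal approximation, two-sided test).

    Hypothetical helper for illustration only; it ignores clustering of
    two nerves within one patient.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_monitored) / 2
    pooled_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    unpooled_term = z_beta * sqrt(
        p_control * (1 - p_control) + p_monitored * (1 - p_monitored)
    )
    return ceil((pooled_term + unpooled_term) ** 2 / (p_control - p_monitored) ** 2)


# Permanent palsy: 0.8 % without vs. 0.3 % with neuromonitoring (rates quoted above)
n = nerves_per_group(0.008, 0.003)
print(f"Approximate nerves at risk needed per group: {n}")
```

Under these assumptions the required number is in the thousands of nerves at risk per group, which is consistent with the observation that only large registries, rather than single-center series, are in a position to detect differences in permanent palsy rates.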
Lang and Wong  published their series of nerve monitoring in 60 patients undergoing transaxillary thyroidectomies demonstrating its feasibility with an adapted nerve monitoring probe. A 6.6 % rate of nerve palsy in 76 NAR was demonstrated. The data are limited for neuromonitoring in transaxillary thyroidectomies with few case series and no prospective or randomized series having been published. As this technique becomes more popular and available the use of nerve monitoring and the specific challenges specific to this method will need to be studied as well.
As these studies are observational, conclusions should be accepted with caution due to the effects of confounding factors that are not measured, such as population case-mix, surgeon’s experience, and lack of standardization of the method of monitoring. Most studies could not identify differences in the rates of temporary and permanent RLN injury with or without nerve monitoring. A trend towards the greatest effect in reducing injury rates appeared in reoperative cases, but was not statistically significant.
Randomized controlled trials
Dionigi et al.  assessed RLN activity using a tracheal tube with integrated surface electrodes for video-assisted thyroidectomy. In the monitored group the FNV helped identify the nerve, but in all cases the surgeon conducted a meticulous dissection of the nerves. Outcomes were measured by operating time and rate of complications evaluated with laryngoscope and Voice Handicap Index. 72 patients were included (36 patients in each group). Analysis was made for NAR. There were no differences in temporary or permanent RLN injuries or in the rate of nerve identification. There was a difference in rate of identification of EBSLN which was higher for the FNV group (83 vs. 42 %), but there was no difference in the rate of injuries. In this study the method of randomization was not satisfactory, because time of admission was used as the basis. The authors described only one case of RLN palsy in the FNV group that caused the procedure to be terminated. In the control group there were three instances of RLN palsy, but there are no data about the types of procedures employed. Tracheostomy cases were not reported.
Barczyński et al. assessed RLN activity with electrodes inserted in the laryngeal muscles through the cricothyroid ligament. In the monitored group the FNV helped identify the nerve, but in all cases the surgeon conducted a meticulous dissection. The outcome was determined by the rate of RLN injury. Rates of identification and laryngoscopic findings were also assessed. 1,000 patients were included (500 patients in each group) and NAR analyzed. The rate of temporary RLN injury was lower for the FNV group (1.9 vs. 2.4 %), a statistically significant difference, but the permanent injury rate was similar in both groups (0.8 vs. 1.2 %). The authors conducted a subgroup analysis by risk showing that among low-risk patients there was no difference for transient RLN palsy (1.8 vs. 2.8 %) but significant differences appeared in high-risk patients (2.0 vs. 4.9 %). However, there were no differences in permanent palsy or RLN identification. Unfortunately, the study did not define what is considered low and high risk, but it is noteworthy that 50 % of patients were classified as high risk, while only 15 % of patients had retrosternal goiter and 4 % had massive goiters. Also noted in this study was the identification of 33 % more branching RLNs in the monitored group, raising the possibility that monitoring helps to better define aberrant and complex anatomic variations.
Higgins et al. conducted a systematic review and meta-analysis of 14 case series studies; nine were comparative non-randomized trials and only one was a randomized controlled trial. In multiple comparisons, the authors failed to demonstrate a difference in the rate of temporary or permanent RLN injuries between groups. However, because it includes mainly observational trials, this meta-analysis fails to meet the usually accepted criteria for meta-analytic technique.
Safety and limitations
There are few reports about complications derived from the use of FNV devices. Rare cases of tracheal tube obstruction because of overinflating  have been described. Birkholz et al.  assessed laryngeal injuries in 127 patients comparing the use of FNV devices and found no differences, using postoperative endoscopic methods. Overall, few adverse effects have been reported with the use of FNV. Thus the techniques may be considered to be safe.
While monitoring may help identify the nerve, the issues related to technical pitfalls and losses of signal are of concern. Each procedure is different and false-positive and false-negative results may be recorded. Most nerve injuries are not due to direct transection of the nerve but, in difficult cases, occur during attempts to control bleeding in the region of Berry’s ligament or in other technically difficult situations. The question whether the incidence can be reduced by routine nerve monitoring remains unresolved.
Sensitivity and specificity of functional nerve verification as a predictor of postoperative recurrent laryngeal nerve function
Otto and Cochran assessed sensitivity and specificity in 60 patients. They used the method of palpation of muscle contraction and assessed all patients with endoscopy in the 1st week. The nerve could not be identified during the procedure in five patients (8.3 %). The method predicted postoperative palsy with a sensitivity of 75 % and a specificity of 92 %. Dackiw et al., in a multicenter trial including 117 patients, reported a specificity of 92 % but sensitivity was not reported and was impossible to calculate from their data. Hermann et al. included 328 patients and 502 NAR using an electrode inserted through the cricothyroid ligament, with postoperative assessment by endoscopy. They found an overall sensitivity of 44 and 57 % for permanent and temporary palsy, respectively, with specificity of >96 %. In 14 of 21 patients with preoperative palsy an EMG response could not be elicited, even when the nerves were anatomically intact. Beldi et al. studied 288 patients and determined sensitivity of 40 % and specificity of 98 %. Barczyński et al. found specificity >97 % with sensitivity <71 % in a study of 1,000 nerves. Thomusch et al. evaluated 8,534 patients with 15,403 RLN at risk. Sensitivity was 29–33 % and specificity 97–98 % for temporary paresis, and sensitivity was 42–45 % with specificity of 96–98 % for permanent paralysis. Nerves were tested by indirect or direct stimulation. For various reasons, the authors excluded 73 of 431 (17 %) events of temporary palsy. Tomoda et al. included 1,376 patients with 2,197 NAR using FNV with the palpation method at the end of surgery. They found rates of temporary and permanent RLN palsy of 3.6 and 1 %, respectively. Sensitivity and specificity for temporary RLN palsy were 69 and 99 %, and for permanent palsy, 85 and 97 %. Unfortunately, this study did not report the differential sensitivity and specificity between goiter and cancer patients. Chan et al., in 501 NAR using electrodes attached to the tracheal tube, found a sensitivity of 52 % and a specificity of 94 %. Parmeggiani et al. [30, 65] among 880 patients found a sensitivity of 12 % and specificity of 90 %. Alesina et al. found a sensitivity of 37 % and specificity of 95 % in reoperative surgery.
In a prospective study Chan and Lo  validated the ability of intraoperative neuromonitoring to predict postoperative RLN outcomes in 171 patients with 271 NAR during thyroidectomy. There were 241 true-negative (positive signal and no cord palsy), 15 false-positive (negative signal but no cord palsy), 8 true-positive (negative signal and cord palsy), and 7 false-negative (positive signal but cord palsy) results, as correlated with the postoperative assessment. The sensitivity, specificity, and positive and negative predictive values (PPV and NPV) were 53, 94, 35, and 97 %, respectively. For the high-risk group, the sensitivity value and PPV increased to 86 and 60 %, respectively. The authors conclude that there are many pitfalls associated with this technique, precluding its routine application except for selected high-risk patients. The ability to elicit RLN responses by low-threshold stimulation after thyroid resection is not an infallible predictor of postoperative function, since it may be recorded despite lack of postoperative nerve function, and high threshold may be obtained despite good outcome. Inconsistency between low threshold and poor postoperative function is attributed to the presence of sporadic fibers that are physiologically intact and depolarized in response to the stimulus in patients in whom the majority of nerve fibers had undergone axonal injury or intraoperative events (i.e. nerve edema) subsequent to the final threshold measurements. On the other hand, the recording of a high threshold despite a good outcome may result from nonuniform injury to RLN fibers or the presence of fluid in the operative field acting to shunt the stimulating current away from the nerve fibers. Finally, Cernea et al. , in a prospective study with 447 patients and 868 NAR, found a NPV of 100 % and PPV of 40 %.
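The predictive values reported by Chan and Lo follow directly from the four counts given above, using the convention in this section that a "positive" test is a lost (negative) signal. The sketch below simply reproduces that arithmetic; the function name `diagnostic_metrics` is an illustrative choice and not part of any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV as fractions.

    tp: lost signal and cord palsy      fp: lost signal, no cord palsy
    tn: intact signal, no cord palsy    fn: intact signal but cord palsy
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts reported by Chan and Lo for 271 nerves at risk
chan_lo = diagnostic_metrics(tp=8, fp=15, tn=241, fn=7)
for name, value in chan_lo.items():
    print(f"{name}: {value:.0%}")
```

Rounded to whole percentages, these counts give the 53, 94, 35, and 97 % quoted in the text. The low PPV reflects the small number of palsies relative to false alarms: with only 15 palsies among 271 nerves, 15 false-positive signals are enough to outnumber the 8 true positives.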
In conclusion, the specificity of FNV is >90 % in most studies, but sensitivity is highly variable, from 12 to 75 %. However, these results are also influenced by the high heterogeneity of the method, the patients, and the experience of surgeons.
Discussion of cost is inadequate in most articles. Hemmerling et al. noted that the device cost US $5,000 in 2001, with an added cost of US $25 for each tube. Loch-Wilkinson et al. state a cost of AUD $40,000 for the equipment and AUD $500 for disposables used in each case, when surface electrodes and EMG recording are used. The cost can decrease by 70 % when the palpation method is used. Dionigi et al., in an economic analysis, compared the use of FNV devices with routine practice. The additional cost of using FNV was €$72–272. Globally, cost was increased by about 5–7 %.
Besides the cost of the device (capital cost) and the disposables, it is important to consider the costs associated with the use of the technology, which were not included in these studies and are necessary to make a cost analysis. Due to the design of the endotracheal tube and the necessity of exact localization of the electrodes over the vocal folds, some authors suggest that intubation should be done with optical devices (Glidescope®, fiberoptic laryngoscope) to guarantee the correct location. No study considered the costs of an unnecessary delayed reoperation in false-positive cases or the costs of treatment of a nerve palsy. Also, the duration of surgery may be shortened with the use of nerve monitoring, and this was not considered in the cost of the surgical procedure. With the current data it is reasonable to suggest that use of FNV increases costs, but this must be studied thoroughly. The combination of similar effectiveness and high cost suggests that FNV is not cost-effective (particularly for “low risk” cases), but more research is necessary to confirm this conclusion.
Patterns of use
A survey in Germany, reported by Dralle et al., included a cohort of 83,577 thyroidectomies. Nerve monitoring was employed in 99.3 % of patients, and thus is now considered the standard of care in Germany. Routine vagal stimulation, as recommended by the guidelines, was employed in 49 % in the pre-resection phase and 73 % in the post-resection phase of the surgery. EMG findings were recorded in 54 % during the pre-resection phase in comparison with 72 % in the post-resection phase. Low-volume centers used the device less than did the high-volume centers. Surgeons were questioned as to their willingness to suspend surgery of the contralateral side in planned bilateral resections if a loss of signal occurred on the first side. More than 70 % agreed with this approach. The response rate to the survey was 47 % of surgeons questioned. Horne et al. surveyed patterns of use by American surgeons. Most respondents reported performing fewer than 25 thyroidectomies each year. Only 28.6 % of respondents (159) reported using intraoperative monitoring for all cases. Respondents were 3.14 times more likely to use intraoperative monitoring if they had used it during their training. Surgeons currently using intraoperative RLN monitoring during thyroidectomy were 41 % less likely to report instances of permanent RLN injury in their experience. Only 0.7 % of respondents who experienced a nerve injury incurred a lawsuit. None of these used the monitor or changed their practices after the event. Reasons for not using the FNV device were reliance on anatomy (25 %), “not needed” (25 %), and the high rate of false positives (20 %). Cost and non-availability were reported by 21 %. In contrast, those who used monitoring believed that it improves security (34 %), is helpful in high-risk cases (33 %), and protects against medical-legal issues (22 %); 22 % use it because it is available. This study had only a 43 % rate of response.
The authors concluded that the majority of American surgeons do not use nerve monitoring, and that its use is influenced by surgical background and training.
Sturgeon et al. included 117 endocrine surgeons in their survey, most from North America. 37 % reported using FNV, mostly (23 %) in a selective manner. 14 % reported that they abandoned its use after trying it. Most surgeons who used the device were in the 35- to 44-year-old age group, worked in community hospitals classified as performing more than 100 procedures per year, and had immediate availability of the device. Monitoring was used most often for patients who requested it, as well as for reoperations and cases of malignancy. 76 % responded that FNV procedures do not improve the safety of thyroidectomy (56 % of users vs. 90 % of non-users). 45 % of users and 90 % of non-users reported that not using the device would have no effect on liability. There were no differences in use or rate of RLN injuries according to fellowship training. The response rate to the survey was 41 %. Duclos et al., in a survey of American surgeons, showed that FNV was used most often in operations for Graves’ disease, malignancy, and for bilateral procedures. Many surgeons modified their technique of nerve dissection after using the device. Singer et al. surveyed 170 American surgeons: 49 % used monitoring in all cases of thyroid surgery while 35 % of respondents never used tube-embedded electrodes, and 28 % used it selectively. There was a significant difference in use according to the background and specialty of the respondent (43 % of otolaryngologists vs. 17 % of general surgeons). The device was used most often by younger surgeons. 65 % of uses of monitoring were for reoperations. The response rate to the survey was 18 %.
In conclusion, use of intraoperative RLN monitoring is highly dependent on geographical practices. German surgeons use the method routinely much more often than Americans. Use is also related to preferences of individual surgeons and institutional characteristics. Unfortunately, response rates to these surveys were low and the results are prone to bias. In addition, these studies assess only the willingness of surgeons to use monitoring, but not the real effect of its use on the rate of RLN injury.
Considerations about effectiveness
Supporters of routine FNV claim that the use of these devices can aid in the identification and preservation of the RLN and EBSLN because it can allow detection before visualization, can provide information about function during surgery, and can show anatomical variants that increase the risk of injury. However, there are also disadvantages in its use, including technical failures that undermine confidence in reliability, such as displacement of electrodes and the high rate of false-positive results.
Any RCT designed to identify a decrease in the number of RLN injuries should recognize the fact that the number of such injuries is very low (0.5–5 % of injuries in large databases). Therefore, to demonstrate equivalence of the new intervention in this setting, enormous sample sizes are necessary. A simple sample size calculation assuming equivalence with an acceptable clinical difference of 1 % between interventions will need more than 14,000 patients. On the other hand, if superiority is to be demonstrated, assuming a rate of 2 % of RLN injury in expert hands and an expected decrease of 50 % with the new device (to 1 %), it would be necessary to include at least 4,500 patients in an RCT. Nonetheless, most studies are underpowered case series and observational studies, which are very prone to bias. The largest studies from the German registry [66, 67] have many confounding variables that make it difficult to obtain reliable results. Even those studies with sample size calculations have numbers far smaller than needed to obtain statistically significant results. Thus, demonstrating effectiveness in this scenario is very difficult. To overcome this difficulty some authors have chosen the nerve as the unit of analysis, thus immediately increasing the sample size of the studies. This approach has its problems. First, the device is used to avoid laryngeal nerve injury in a specific patient. It makes no difference to the patient whether the injury is to the left or the right nerve when both are at risk, so the unit of analysis should be the patient and not the nerve. Second, if we accept the predictive effect of the FNV result in the first nerve on risk, and the recommendation of stopping the procedure in the case of no signal for the first dissected nerve, the second nerve will not be dissected, and, therefore, the risk of injury for this second nerve will be null, making calculations unfair and inexact.
Third, the sudden increase of the sample size could artificially show a statistically significant difference that is not clinically relevant. Finally, when absolute and not relative numbers are examined, surgeons should think about the clinical relevance of decreasing the rate of injury in magnitudes of 0.5 % or less.
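The order of magnitude of the superiority figure can be checked with the standard normal-approximation formula for comparing two proportions. The sketch below is only an illustration: the two-sided alpha of 0.05 and the power of 80 % are assumptions that the text does not state, and `n_per_group` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p1 vs. p2 (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the type I error
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Assumed scenario from the text: 2 % injury rate without monitoring, 1 % with it.
n = n_per_group(0.02, 0.01)
print(f"per arm: {n}, total: {2 * n}")
```

Under these assumptions the calculation gives roughly 2,300 patients per arm, or about 4,600 in total, which is in line with the figure of at least 4,500 patients quoted above. Stricter power requirements would raise the number further.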
Populations at risk
Most authors [58, 84] conclude that the patients who obtain the greatest benefit from the use of FNV are those at high risk for RLN injury during thyroidectomy and that there is little controversy concerning the use of FNV for these patients. However, the definition of high risk varies from author to author. Chan et al. defined high-risk patients as those who underwent reoperation, or surgery for malignancy, retrosternal or toxic goiter. Atallah et al. defined high-risk patients as those with large (>70 cc by ultrasonography measurement for one lobe and/or more than 160 cc for both lobes), retrosternal and recurrent goiters, Graves’ disease, and malignant disease. Barczyński et al. considered patients who underwent central neck dissection as high risk, and Hermann et al. defined high-risk patients as those with thyroid cancer or who underwent reoperation. The concept of high risk is derived from studies that identify risk factors for RLN injuries and assumes that all patients with one risk factor automatically become high-risk patients. This generalization is inappropriate, because not all factors have the same weight, and risk factors interact with one another, making some combinations riskier than others. To date, there is no accepted risk classification for patients, so the concept of high risk should be clarified before applying it to clinical practice and patient selection. We do have some information about selected populations. Chiang et al. compared patients who needed extensive nerve dissection with those who did not and found no differences in the rate of temporary or permanent palsy [66, 67]. Alesina et al. reported on 250 patients who underwent reoperation (89 with FNV vs. 157 without) and found no difference in temporary (6.2 % for monitored vs. 2.5 % for control) or permanent palsy (0 vs. 0.6 %). The rates of temporary and permanent RLN palsy were 3.9 and 0 % for first operation benign cases and 11 and 4.9 % for malignant or reoperation cases.
For these high-risk cases, the sensitivity of FNV was 25 and 57 %, respectively, for temporary and permanent postoperative palsy. Yarbrough et al. assessed FNV in 111 patients and 151 NAR who underwent re-intervention of the thyroid region. More than 70 % of cases had disease adherent to the nerve. They found no differences in temporary (19 % for the intervention group vs. 17 % in the control group) or permanent (1.9 vs. 1.7 %) injury. Most available studies were conducted for goiter, while thyroid carcinoma and reoperations are underrepresented in the various patient cohorts. Consequently, the percentage of “high risk” patients is very low, making an extrapolation of overall conclusions to this group of patients dangerous. Another concern about the risk-group classification is its wide range of description. The definition of malignancy is very unreliable. It is obvious that the risk of a T1 tumor is not the same as that of a T3, but both are malignancies and will be classified as high risk. The same occurs when terms such as giant or retrosternal goiter and reoperation are used, because they are too subjective to define a clear category of risk.
The authors also suggest that FNV could be a good tool for low-volume surgeons, but this assumption is not supported with actual data, and most studies have been conducted among “high-volume” practices with experienced surgeons . These “high-volume” surgeons are the ones who more commonly employ nerve monitoring, so extrapolation to novice surgeons should not be made. As a rule, nerve monitoring should not be considered to be a substitute for experience. Further, an inexperienced surgeon may have a false sense of security with nerve monitoring, thus making surgery even more dangerous than without the device.
Definition of outcomes
There are other concerns related to measurement of outcomes. It is clear that conducting a double-blind trial of nerve monitoring is impossible. Other methodological approaches should be used to overcome the bias introduced by the lack of blinding but were not implemented in most studies. The use of nerve identification as the final outcome of monitoring is not an adequate end point. It may be assumed that when the surgeon has the device, he will use it insistently to identify the nerve, going beyond the usual measures employed to find it. Snyder and Hendricks demonstrated this phenomenon in their series, where surgeons indicated that the identification of the RLN became highly dependent on the use of the device (4 % before vs. 27 % after). This fact introduces a bias because the final outcome is dependent on the use of the device that is being evaluated, which is applied differently in each group. Another subject to consider is the clinical effect of identification, which really corresponds to an intermediate outcome. For studies assessing EBSLN, FNV is useful only for patients who have unfavorable anatomy. In the study by Barczyński et al., Cernea type 2A and 2B nerves were identified in 36 % of cases in the intervention group versus 22 % in the control group, but there are no data about how many of the injured nerves belonged to these high-risk groups. Yet another issue is the clinical relevance of identification for patients. EBSLN injuries are clearly disabling for people who use their voices professionally. Although the objective of surgery is to preserve all anatomical structures so as to avoid adverse sequelae, the lack of data about the utility of FNV in an outcome that has low relevance for ordinary people is a subject that should be carefully assessed. Other intermediate outcomes have also been defined.
Some authors use the anatomical distribution and localization of the nerve as an outcome that should offer an advantage for the patient. Other authors have suggested that the use of FNV decreases the rate of 131I uptake in the thyroid bed because it makes it easier to remove the Zuckerkandl tubercle. However, this type of outcome represents only a surgical finding that has not demonstrated any final modification of the risk of nerve injury or adjustments in the surgical techniques that are commonly used in other centers that do not routinely use FNV.
Functional nerve verification in intraoperative decision making
Most studies have demonstrated that sensitivity of FNV is around 60 %. It is very difficult to decide what to do when a loss of signal appears. Stopping the procedure after unilateral resection in such cases, reassessing vocal mobility and waiting until mobility recovers before operating on the contralateral side, is often recommended. Such cases are considered as successful use of FNV and this decision is easily made in cases of benign disease. However, in order to obtain a fair conclusion, comparisons should be made with cases where FNV has not been employed, the surgical procedure continues to the end, and the vocal palsy is found in the postoperative period. How many of these patients will suffer bilateral vocal fold paralysis and eventually require tracheostomy in comparison with the FNV group? Data are very scarce. Goretzki et al., in a very difficult to interpret paper, assessed this scenario. They evaluated 48 patients with total thyroidectomy who had loss of nerve signal during dissection of the first lobe. In 22 (45 %) cases, surgical strategy was changed to conduct a smaller resection. In four cases, a more experienced surgeon was called to complete the bilateral procedure as planned. The remaining patients had their operations terminated, and the contralateral side removed at a later date. There was no bilateral nerve palsy in the delayed group vs. 17 % for the non-delayed group. This rate of bilateral paralysis is exceedingly high. Data presented are very difficult to understand and it was impossible to review calculations to confirm data. Melin et al., in a study with 64 patients with intraoperative loss of signal, found a lower risk of temporary bilateral RLN palsy when surgery stopped (16 vs. 0 %) but with a higher frequency of permanent RLN palsy in the stopped surgery group (0 vs. 10 %), without differences in the number of tracheostomies and with an increased number of delayed surgeries (0 vs. 45 %). Recently, Sitges-Serra et al., in a study of 295 patients, found an intraoperative loss of signal in 16 patients with spontaneous recovery during surgery in 15, making clear the low PPV of FNV. Comparisons of time, resources, and anxiety between patients who incurred nerve palsy during a complete surgical procedure and those who underwent a second procedure are lacking.
Other authors have also discussed the usefulness of FNV in terms of predictive values [89, 90]. FNV has shown a low and heterogeneous PPV and a high NPV, and support for neuromonitoring has rested on the high and reliable NPV. At the current frequency of RLN injury of 1 %, and assuming an NPV of 99 % and a PPV of 50 %, a negative result (integrity of signal) of the neuromonitoring will decrease the risk of injury to 0.53 %, which is clinically negligible, and a positive result (lack of signal) will increase the risk to 10 %.
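How much a test result shifts the risk of injury from a 1 % baseline can be illustrated with Bayes' rule applied to sensitivity and specificity. The sketch below is hypothetical: the sensitivity of 50 % and specificity of 95 % are assumptions chosen for illustration and are not values given in the text, and `post_test_risk` is an illustrative name.

```python
def post_test_risk(prevalence, sensitivity, specificity):
    """Return (risk after loss of signal, risk after intact signal) via Bayes' rule."""
    tp = sensitivity * prevalence              # injured, signal lost
    fn = (1 - sensitivity) * prevalence        # injured, signal intact
    fp = (1 - specificity) * (1 - prevalence)  # uninjured, signal lost
    tn = specificity * (1 - prevalence)        # uninjured, signal intact
    return tp / (tp + fp), fn / (fn + tn)

# Assumed inputs: 1 % baseline injury rate, 50 % sensitivity, 95 % specificity.
risk_lost, risk_intact = post_test_risk(0.01, 0.50, 0.95)
print(f"risk after loss of signal: {risk_lost:.1%}")
print(f"risk after intact signal:  {risk_intact:.2%}")
```

With these assumed inputs, an intact signal lowers the risk only from 1 % to about 0.5 %, while a lost signal raises it to roughly 9 %, which is the same order of magnitude as the figures discussed above.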
Various questions may be raised concerning appropriate procedure in cases of malignancy: whether the operation should be terminated after unilateral thyroidectomy, with dissection of the contralateral side postponed until a future date, or whether to proceed with the planned single stage bilateral procedure may present a dilemma. However, various factors may influence this decision. In most cases of thyroid cancer the contralateral lobe is not extensively involved with gross tumor or extrathyroidal spread of disease, and a competent surgeon should be able to dissect the nerve on the contralateral side with confidence that he will not injure it, particularly if that side has not been previously dissected. On the other hand, there is little evidence as to whether performing the completion (contralateral) thyroidectomy at a later date will or will not interfere with the functional or oncologic outcome, although single stage total thyroidectomy is less expensive, less time consuming, and easier for the patient. Thus a decision can be made on the basis of the surgeon’s confidence and experience, and the actual extent of disease as found on pre- and intra-operative evaluation.
All these questions raise ethical and medical-legal issues that should be analyzed carefully before considering FNV as “standard of care”, because it places physicians who do not routinely use it in legal jeopardy . Kern  and Lydiatt  have shown that 30–40 % of lawsuits in thyroid surgery are due to RLN injury. Abadin et al.  found 33 lawsuits in a 20-year period, where 46 % corresponded to RLN injuries. In five cases a bilateral palsy occurred, but only one lawsuit favored the patient and nerve monitoring was not mentioned in any of the cases. Dralle et al.  reported 75 lawsuits in a 15-year period in Germany where 60 % concerned RLN injuries and 22 patients suffered bilateral injuries. In six cases FNV was used. Decisions in favor of the plaintiff were based on lack of vagus nerve identification, or the surgeon ignoring a lost signal and then proceeding with contralateral lobectomy and incurring bilateral vocal fold paralysis. Angelos  in a recent review expressed many ethical issues that are important when using this device, particularly in cases when its use may suggest better outcomes, although these suggestions are not supported by available evidence. He also reiterates that many medical–legal problems may be avoided by a deeper preoperative discussion of expected outcomes and possible complications with the patient, rather than with the use of technology.
The current literature on neuromonitoring, or FNV, has not proven that routine monitoring produces results superior to those obtained by traditional anatomical methods of nerve identification during thyroid surgery, although it may be helpful in difficult cases. No monitoring system can substitute for the surgical skill to locate and carefully dissect the RLNs. As the incidence of permanent clinical vocal fold paresis is low, it is hard to amass statistically significant numbers of patients for evaluation of the method. There are also inconsistencies in methodology, patient selection, and randomization in various published studies which may confound the conclusions of individual investigations. While results of various studies are inconsistent, evaluation of outcomes, as defined by the presence or absence of temporary or permanent nerve palsy, has shown that adverse outcomes could be affected only in “high risk patients.” The definition of “high risk” varies from study to study, but generally includes patients undergoing reoperation and those with massive or substernal goiter, Graves’ disease, or advanced cancer. Routine use of FNV varies according to geography. In Germany, it tends to be employed for almost all thyroid surgery, while in the United States it is used mainly for “high risk” cases. The training and experience of the surgeon also influence use of nerve monitoring. As the number of false-positive results from FNV is high, questions arise as to how to proceed in cases of planned total thyroidectomy, when loss of signal occurs during dissection of the first side. More detailed evaluation should help define the cost-effectiveness of using this method.
- 3. Karamanakos SN, Markou KB, Panagopoulos K et al (2010) Complications and risk factors related to the extent of surgery in thyroidectomy. Results from 2,043 procedures. Hormones (Athens) 9:318–325
- 25. Dimov RS, Doikov IJ, Mitov FS et al (2001) Intraoperative identification of recurrent laryngeal nerves in thyroid surgery by electrical stimulation. Folia Med (Plovdiv) 43:10–13
- 39. Durán Poveda MC, Dionigi G, Sitges-Serra A et al (2012) Intraoperative monitoring of the recurrent laryngeal nerve during thyroidectomy: a standardized approach part 2. World J Endocr Surg 4:33–40
- 87. Melin M, Schwarz K, Lammers B et al (2011) Two-stage thyroidectomy and patient satisfaction. Langenbecks Arch Surg 396:1301