With the increasing availability of videolaryngoscopes in airway management, many comparative analyses of videolaryngoscopy (VL) with direct laryngoscopy (DL) have been published. Many of these publications indicate that Macintosh-style videolaryngoscopes (Mac-VL) deliver a better view of the larynx than that achievable by traditional Macintosh DL.1,2 In a multicentre trial, Kaplan et al.3 compared the directly sighted peroral view with the indirectly sighted videoscopic view using Mac-VL in 865 patients. They reported that the videoscopic view was at least one Cormack-Lehane (C-L)4 grade better in 41.5% of subjects. Subsequent studies by Byhahn et al.5 and Raimann et al.6 have also reported a substantial improvement in the videoscopic view compared with direct peroral sighting with Mac-VL, although the effect was most pronounced in patients with difficulty artificially induced by application of a stiff cervical collar. In contrast, our clinical impression is that incremental improvement is minimal with the indirect videoscopic view compared with the direct peroral view with Mac-VL in most patients. Unfortunately, many of the available systematic reviews comparing videolaryngoscopy with DL include studies of both Mac-VL and videolaryngoscopes with hyper-angulated or hyper-curved blades.7,8 It remains unclear whether the results of these meta-analyses apply to both Mac-VL and hyper-angulated types of videolaryngoscope blades or only one.

In this human cadaver equivalence study, we assessed how the directly sighted peroral laryngeal view compares with the videoscopic view obtained with single-use Macintosh blade C-MAC®S (Karl Storz, Tuttlingen, Germany) and GlideScope® Spectrum™ (GS; Verathon, Bothell, WA, USA) videolaryngoscopes. The primary outcome of interest was the difference between concomitantly observed directly sighted peroral and indirect videoscopic views for each system. Our working hypothesis was that there would not be a clinically significant difference in the two views for either device. Subsequent tracheal intubation was not performed or studied.

Methods

This study’s protocol was approved by the Nova Scotia Health Authority’s Research Ethics Board (file no. 1022400; start date July 28, 2017) and approved by the Human Donation Program and the Skills Centre for Health Sciences at the Queen Elizabeth II Health Sciences Centre in Halifax, NS, Canada. Four human cadavers were used, procured by the Dalhousie University Human Body Donation Program. Cadavers were lightly embalmed using the Halifax clinical cadaver preparation technique, a method that leaves the non-perfused specimens mechanically similar to an anesthetized patient. Cadavers were removed from refrigeration two hours prior to use.

With their informed consent, four experienced clinicians (three anesthesiologists and one emergency physician) participated as study investigators and alternated as laryngoscopists and assessors (“scorers”) of the videoscopic view. All clinicians were experienced in Macintosh DL and the use of Mac-VL, and had over seven years of clinical experience in airway management before the study’s inception.

For the purpose of this study, a Macintosh direct laryngoscope is defined as a device with a gently curved blade designed to enable direct visualization of the larynx (i.e., without video capability). A Mac-VL refers to a device with a similarly gently curved blade designed to view the larynx directly and indirectly on a video display via a camera located on the laryngoscope blade.

Before the study, all clinicians attended a briefing to outline study procedures, ensure consistency in the grading of the laryngeal view, and provide an opportunity for questions. Investigators were educated on the use of a new visual analogue scale (VAS) to document the degree of laryngeal exposure during laryngoscopy. The VAS consisted of a vertical line superimposed on an image of a typical larynx (Fig. 1). At the point of laryngeal exposure (or immediately after, in the case of the laryngoscopist), a horizontal line was drawn by the observer across the image on the VAS at a level that corresponded to the maximum extent of glottic anatomy visualized (Fig. 2). During analysis, the horizontal line drawn on the VAS was measured from the bottom of the vertical line on the VAS as between 0 (no glottic view) and 1.0 (full glottic view).

Fig. 1

Visual analogue scale (VAS) of laryngeal view. Scorers were instructed to draw a horizontal line across the image to indicate the point below which the glottis was visible. A full view was represented by a horizontal line drawn across the top of the vertical line, and for an epiglottis-only view, a horizontal line was drawn at the bottom of the vertical line

Fig. 2

Example visual analogue scale of a laryngeal view (A) with the horizontal line representing the amount of glottic opening visible on the corresponding laryngoscopic view (B). A score of 0.33 (33%) was assigned to this view (calculated by height from base of vertical line to horizontal line, divided by height of the vertical line)
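
The score calculation described in the caption above can be sketched in a few lines of Python. This is an illustration only: the function name `vas_score` and the millimetre measurements are hypothetical and do not come from the study's scoring sheets.

```python
def vas_score(line_height_mm: float, scale_height_mm: float) -> float:
    """Height of the drawn horizontal line above the base of the vertical
    scale, divided by the full height of the vertical scale (0 to 1.0)."""
    if scale_height_mm <= 0:
        raise ValueError("scale height must be positive")
    if not 0 <= line_height_mm <= scale_height_mm:
        raise ValueError("the drawn line must lie on the vertical scale")
    return line_height_mm / scale_height_mm


# Hypothetical measurements: a 60 mm scale with the line drawn 20 mm above its base.
score = vas_score(20.0, 60.0)
print(round(score, 2))  # 0.33
```

A line drawn at the base of the scale gives 0 and a line drawn across the top gives 1.0, matching the range used in the analysis.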

Based on the percentage of glottic opening (POGO) scoring system,9,10,11 the VAS was designed to improve the inter-rater reliability of scoring laryngeal exposure compared with both the POGO and C-L scoring systems or the latter’s modifications.12,13,14 The VAS was assessed for inter-rater reliability by two scorers independently and simultaneously scoring the view visible on the videolaryngoscope’s monitor.

Two types of videolaryngoscope with single-use blades were used during the study: the C-MAC®S system with Macintosh #3 and #4 blades, and the GlideScope® Spectrum™ (GS) with the DirectView Macintosh (DVM) S3 and S4 blades. To simulate the range of possible laryngeal views encountered in clinical practice, we asked the laryngoscopists to deliberately obtain and maintain one of 16 predetermined randomized views for each blade on each of the four cadavers (data available at https://osf.io/xdraq/). Thus, for the C-MAC®S device, 16 laryngoscopies were performed with the MAC #3 blade and another 16 with the MAC #4 blade, giving 32 for each of the four cadavers, or 128 in total. Similarly, 128 trials were performed in identical fashion for the GS device with the DVM S3 and S4 blades. Together, these produced 256 laryngoscopy views for analysis. Sample size was primarily pragmatic. Given the time, personnel, and resources, a maximum of 256 views could be collected. Confidence intervals (CI) were calculated and presented to provide reasonable bounds of uncertainty to evaluate the precision. For each of the 16 laryngoscopies, the randomly assigned laryngoscopist was instructed to obtain and hold one of the following views on each of two occasions: greater than 66% of the glottic opening; 33–66% of the glottic opening; 0–33% of the glottic opening; 0–33% of the glottic opening visible via a direct epiglottic lift; only the corniculate tubercles (four trials); only the inter-arytenoid notch; and only the epiglottis. A list of trials was created on a Microsoft® Excel spreadsheet, numbered 1 to 256, with columns for designated cadaver, assigned laryngeal view, videolaryngoscope type and blade, laryngoscopist, and the two scorers. The order of the laryngoscopist, scorers, and sequence number was randomized using an internet-based random number generator (www.random.org).
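
As a check on the arithmetic of the design, the trial list can be enumerated programmatically. The Python sketch below is an assumption-laden illustration: it uses Python's seeded pseudo-random generator rather than www.random.org, it shuffles only the trial order (the assignment of laryngoscopists and scorers is omitted), and the seed, labels, and function name `build_trials` are hypothetical.

```python
import random

DEVICES = {
    "C-MAC S": ["Mac #3", "Mac #4"],
    "GlideScope Spectrum": ["DVM S3", "DVM S4"],
}
CADAVERS = ["A", "B", "C", "D"]
# Six target views attempted twice each, plus four corniculate-tubercle trials.
VIEWS = (
    [">66% of glottic opening"] * 2
    + ["33-66% of glottic opening"] * 2
    + ["0-33% of glottic opening"] * 2
    + ["0-33% via direct epiglottic lift"] * 2
    + ["corniculate tubercles only"] * 4
    + ["inter-arytenoid notch only"] * 2
    + ["epiglottis only"] * 2
)


def build_trials(seed=2017):
    """Enumerate every device/blade/cadaver/view combination, shuffle the
    order with a seeded generator (hypothetical seed), and number the trials."""
    rng = random.Random(seed)
    trials = [
        {"device": device, "blade": blade, "cadaver": cadaver, "view": view}
        for device, blades in DEVICES.items()
        for blade in blades
        for cadaver in CADAVERS
        for view in VIEWS
    ]
    rng.shuffle(trials)
    for number, trial in enumerate(trials, start=1):
        trial["trial"] = number
    return trials


trials = build_trials()
print(len(VIEWS), len(trials))  # 16 256
```

The counts reproduce the figures in the text: 16 views per blade per cadaver, 128 trials per device, and 256 trials overall.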

For each trial, the designated laryngoscopist performed laryngoscopy, tasked with obtaining the assigned directly sighted, peroral view with the designated videolaryngoscope and blade. The laryngoscopist was blinded to the videoscopic view by having the videolaryngoscope screen placed behind them. Once the assigned view was obtained, while maintaining the view, the laryngoscopist said “now”. At this point, the view on the video monitor (“videoscopic view”) was independently rated on separate scoring sheets by each of two scorers using the VAS. Each scorer was blinded to the direct view assigned and obtained by the laryngoscopist, as well as the rating of the other scorer. Once the two scorers had completed their ratings, the laryngoscopist removed the laryngoscope and, using an identical scoring sheet, also rated the view they had obtained and held during laryngoscopy. If the assigned view could not be obtained for a particular trial because of anatomic constraints, the laryngoscopist was instructed to obtain and hold a view as close as possible to the assigned view, using external laryngeal manipulation if necessary.

All data were recorded on identical numbered, standardized forms for each trial laryngoscopy: one for each of the two scorers and a third for the laryngoscopist. Four data points were gathered on the forms: the study number, the cadaver examined, the laryngoscope blade type, and the laryngeal view, drawn directly on the VAS.

The study’s primary outcome measures were the differences in the VAS ratings of the directly sighted peroral and videoscopic views obtained with the C-MAC®S and GS videolaryngoscopes, each with both blade sizes analyzed together. Laryngeal views were compared by measuring the distance from the bottom of the image to the line drawn by the observer. The score assigned to the laryngoscopist was then compared with the mean score of the two videoscopic observers. A result was considered statistically significant when the mean difference of the VAS ratings was greater than zero and the CI of the mean difference did not also include zero. The result was arbitrarily deemed to be clinically significant when the mean difference was greater than 0.2 (20% of the total distance from the anterior commissure anteriorly to inter-arytenoid notch posteriorly), and the CI did not contain values between ±0.20 (i.e., an equivalence test). The latter was felt to be the minimum improvement in view that might make intubation easier or that might result in moving from one grade to the next using Cormack-Lehane grading or its Yentis12 or Cook13,14 modifications. Secondary outcome measures included analysis of directly sighted peroral vs videoscopic view for each blade size of both devices, and the inter-rater reliability of the VAS scale.
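
The decision rules in this paragraph can be written out as a small Python function. This is a sketch of one reading of the rules, not the authors' analysis code: the function name `interpret`, the "inconclusive" label for intervals that straddle the margin, and the example values are all hypothetical.

```python
MARGIN = 0.20  # a priori clinical-significance bound stated in the text


def interpret(mean_diff, ci_low, ci_high, margin=MARGIN):
    """Return (statistically_significant, clinical_label) for a mean VAS
    difference and its confidence interval."""
    if not ci_low <= mean_diff <= ci_high:
        raise ValueError("mean difference must lie inside its interval")
    # Statistical significance: the interval excludes zero.
    statistically_significant = ci_low > 0 or ci_high < 0
    if -margin < ci_low and ci_high < margin:
        clinical = "equivalent"              # whole interval inside +/-0.20
    elif ci_low > margin or ci_high < -margin:
        clinical = "clinically significant"  # whole interval beyond the margin
    else:
        clinical = "inconclusive"            # interval straddles the margin
    return statistically_significant, clinical


# Hypothetical examples, not study results.
print(interpret(0.05, 0.01, 0.09))   # (True, 'equivalent')
print(interpret(0.30, 0.22, 0.38))   # (True, 'clinically significant')
print(interpret(0.15, -0.02, 0.32))  # (False, 'inconclusive')
```

The first example shows that a difference can be statistically significant and still clinically equivalent under these rules.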

Data were analyzed with linear mixed models fitted using the lme4 package in R15 with asymptotic standard errors, with view (direct vs videoscopic), cadaver (A, B, C, and D), and laryngoscopist (clinicians 1, 2, 3, and 4) entered as predictors with fixed slopes and random intercepts. Since there was only a single random effect (i.e., the intercept), there was no further covariance structure to describe. Only the effect of view was reported; other effects were included only as covariates. These analyses were conducted for each of the four blades separately (64 views each), both C-MAC®S blades (128 views) and both GS® blades (128 views). Each equivalence test is essentially two one-sided tests, where we compare the observed mean difference to the upper and lower bounds of our a priori bounds for clinical significance (±0.20 in our case). Thus, we corrected our familywise error rate using a Bonferroni correction assuming 12 tests with an original one-sided alpha of 0.05. This correction results in an alpha of 0.0083 or a 99.17% CI, hereafter rounded to 99%. Instead of P values, hypotheses were tested using 99% CIs; when the interval does not cross zero, the mean difference is significantly different from zero at P < 0.05 (i.e., superiority). When values in the 99% CI fell between ±0.20, the mean difference was not clinically significant (i.e., clinical equivalence). If the mean difference exceeded 0.20 and the 99% CI did not contain ±0.20, the results were considered clinically significant. Inter-rater reliability for VAS ratings between the two videoscopic view scorers was tested using intra-class correlations (absolute agreement) with 95% CIs.16 We also report Bland-Altman plots as a supplementary assessment of agreement.
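
The Bonferroni arithmetic above can be reproduced with the Python standard library. The sketch assumes one particular reading of the text: the one-sided alpha of 0.05 is divided across 12 tests, and the corrected one-sided level applies to each tail of a two-sided interval. The normal critical value is included only because asymptotic standard errors are mentioned; whether the authors used exactly this multiplier is not stated.

```python
from statistics import NormalDist

n_tests = 12
alpha_one_sided = 0.05

per_test_alpha = alpha_one_sided / n_tests  # corrected one-sided alpha per test
two_sided_alpha = 2 * per_test_alpha        # both tails of the interval
ci_level = 1 - two_sided_alpha              # coverage of the matching CI

# Normal critical value for the corrected interval (assumed, see lead-in).
z_crit = NormalDist().inv_cdf(1 - per_test_alpha)

print(round(two_sided_alpha, 4))  # 0.0083
print(round(100 * ci_level, 2))   # 99.17
print(round(z_crit, 2))
```

Under this reading, the 0.0083 and 99.17% figures quoted in the text follow directly from 0.05/12 applied to each tail.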

Results

In total, 256 laryngeal views were obtained: 64 per blade and 128 per device. One female cadaver and three male cadavers were used for the study. Two cadavers were edentulous and two had full sets of teeth. The cadavers’ airway anatomy presented a range of difficulty for DL, from easy to obtain a full view to more difficult.

Results are summarized in Fig. 3. The C-MAC®S videoscopic view (both blade sizes analyzed together) revealed approximately 0.9% more of the laryngeal inlet than direct peroral sighting (99% CI, -2.5% to 4.3%). The GS videoscopic view (both blade sizes analyzed together) revealed 6.7% more of the laryngeal inlet than the direct view (99% CI, 2.3% to 11.0%). Analyzing blade sizes individually, there were estimated differences of -1.1% for the C-MAC®S Mac #3 blade, 2.9% for the C-MAC®S Mac #4 blade, 7.3% for the GS DVM S3 blade, and 6.1% for the GS DVM S4 blade, when comparing videoscopic views to their directly sighted counterpart views. None of these results were clinically significant. Figure 4 contains the Bland-Altman plots as a supplement to our planned analyses. These plots also show considerable similarities between the direct and videoscopic views.

Fig. 3

Mean differences between laryngoscopist and scorers on a visual analogue scale (VAS), analyzed by blade size and device. Squares represent the mean difference, and lines represent the 99% confidence interval around the mean difference, controlling for laryngoscopist and cadaver. Zero represents no difference between the VAS scores obtained via the videoscopic view and the direct view. Negative mean difference values indicate a superior direct view; positive values indicate a superior videoscopic view

Fig. 4

Bland-Altman plots. The x-axis represents the mean visual analogue scale score of direct and videoscopic views. The y-axis represents the mean difference between direct and videoscopic views. Negative mean difference values indicate a superior direct view; positive values indicate a superior videoscopic view. The top and bottom lines indicate +1.96 and -1.96 standard deviations, respectively. The middle dotted line is the average mean difference, which corresponds directly to values presented in Fig. 3. The coloured bands indicate 95% confidence intervals
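
The quantities plotted in a Bland-Altman figure of this kind can be computed as in the Python sketch below. The function name `bland_altman` and the paired scores are hypothetical and are not study data; differences are taken as videoscopic minus direct so that positive values indicate a superior videoscopic view, following the caption.

```python
from statistics import mean, stdev


def bland_altman(direct, video):
    """Per-pair means and differences, the bias (average difference), and
    the limits of agreement at +/-1.96 standard deviations of the differences."""
    if len(direct) != len(video):
        raise ValueError("paired scores must have the same length")
    diffs = [v - d for d, v in zip(direct, video)]
    means = [(v + d) / 2 for d, v in zip(direct, video)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,   # x-axis values
        "diffs": diffs,   # y-axis values
        "bias": bias,     # middle line
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Hypothetical VAS scores for five paired views.
direct = [0.10, 0.30, 0.50, 0.70, 0.90]
video = [0.15, 0.30, 0.60, 0.70, 1.00]
result = bland_altman(direct, video)
print(round(result["bias"], 3), round(result["lower"], 3), round(result["upper"], 3))
```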

Inter-rater reliability of the VAS scoring between the two videoscopic view scorers was very high, with an intra-class correlation of 0.91 (95% CI, 0.89 to 0.93). Because of the very concordant ratings by both independent scorers, VAS scores for scorers 1 and 2 were averaged together prior to analysis of the laryngeal views.
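
For readers unfamiliar with absolute-agreement intra-class correlations, the Python sketch below computes one common form: the single-measure, two-way ICC, often written ICC(A,1). That this exact form matches the one used in the study is an assumption, since the text specifies only "absolute agreement", and the ratings shown are hypothetical.

```python
def icc_absolute_agreement(ratings):
    """Single-measure, two-way absolute-agreement ICC.

    ratings: one row per rated view, one column per scorer.
    """
    n = len(ratings)     # number of rated views
    k = len(ratings[0])  # number of scorers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-view mean square
    msc = ss_cols / (k - 1)               # between-scorer mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical ratings: scorer 2 is consistently 0.1 higher than scorer 1.
offset = [[0.1, 0.2], [0.5, 0.6], [0.9, 1.0]]
identical = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]
print(round(icc_absolute_agreement(offset), 3))
print(round(icc_absolute_agreement(identical), 3))
```

Unlike a consistency-type correlation, the absolute-agreement form falls below 1 when one scorer is systematically higher than the other, which is why it suits a check of whether two scorers give the same VAS value.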

Discussion

Our study indicates that the videoscopic view obtained by the Macintosh blades of the C-MAC®S and the GlideScope® Spectrum™ videolaryngoscopes was equivalent to the concomitant directly sighted peroral view. Although the GS results indicated a significantly improved videoscopic view, they did not reach our a priori threshold for a clinically significant improvement.

Other published clinical trials are incongruent with our findings. In 2006, Kaplan et al. published the results of a comparable study in patients.3 In that prospective, multicentre trial, directly sighted and videoscopic laryngeal views were compared using the Macintosh videolaryngoscope (Karl Storz Endoscopy, Culver City, CA, USA), a precursor to the modern-day C-MAC®. In that study, a single laryngoscopist first performed optimized DL, rated the view, and then optimized laryngoscopy to obtain and rate the videoscopic view. Using the Yentis and Lee modification of the Cormack-Lehane grading scale,12 they reported an improved laryngeal view of at least one grade in 41.5% of the 865 undifferentiated surgical subjects. The results may be at odds with ours for a number of reasons. First, we tested different Mac-VLs. While very similar in shape to the Mac-VL blade used in the Kaplan study, it is possible that subtle differences in blade shape, camera location, or lens features in the single-use videolaryngoscope blades we tested may have resulted in different findings. Second, we blinded the two clinicians scoring the videoscopic view to the directly sighted view obtained and rated by the laryngoscopist, which eliminated observer bias. Third, the laryngoscopist and videoscopic scoring clinicians rated the view at the same time and not during two sequential optimizations of laryngoscopy, as permitted in the Kaplan study. Fourth, we used a new rating scale that showed good inter-rater reliability. In contrast, the C-L grading scale and its derivatives have poor inter-rater reliability.9,10,11,12,13,14,17

Byhahn et al.5 also compared directly sighted and videoscopic views obtained in the same patient. In this study, using the C-MAC® VL, separate, blinded raters compared the directly sighted and videoscopic views in 43 elective surgical patients, with and without a cervical collar. Also using the Yentis12 scale, a significant improvement in view from directly sighted to videoscopic views was reported in the cervical collar group. Raimann et al.6 also studied a small population of elective surgical patients with a cervical collar fitted, sequentially performing directly sighted laryngoscopy with the C-MAC reusable Macintosh blade, then indirectly sighted VL with the same blade. With and without “backwards-upwards” rightwards pressure (BURP), the videoscopic view was better than the directly sighted view in a moderate but statistically significant number of patients (defined as a change from Cormack-Lehane Grade 3 or 4 to Grade 2a, 2b or 1): six of 32 without applied BURP; seven of 25 with applied BURP; and 14 of 18 with BURP applied only during Mac-VL.

Although perhaps not directly comparable to the present study, other trials have compared DL with Mac-VL. Piepho et al.1 studied 52 elective surgical patients who had presented C-L grade 3 or 4 views during DL. With subsequently performed C-MAC® Mac-VL, an improvement in laryngeal view by at least one C-L grade occurred in 49 patients (94%): by one grade in 16, two grades in 32, and three grades in one. In ten of the 52 cases, C-MAC® VL involved a direct lift of the epiglottis. Aziz et al.2 randomized 300 elective surgical patients to laryngoscopy and intubation using Macintosh or C-MAC® Mac-VL. The primary study outcome was first attempt tracheal intubation success; laryngeal views were reported as a secondary outcome. A C-L grade 1 or 2 view was reported in 139 of 149 (93%) of the C-MAC® Mac-VL cases and in only 119 of 147 (81%) of the analyzed DL cases.

Our results differ from these previous findings in that we found a mean improvement in the observed extent of the laryngeal inlet of only 0.9% on the video monitor of the C-MAC®S system compared with the concomitant directly sighted view, and a 6.7% improvement with the GS system. Although the GS results achieved statistical significance, this would be unlikely to improve the C-L rating, even using the Yentis modification (e.g., from 2b to 2a), and was probably not clinically significant. There are a number of potential explanations for our differing findings. First, our study used human cadaveric specimens, which may not exactly mimic conditions in live, anesthetized patients. Second, we used the single-use Macintosh blade options of both the C-MAC® and GS systems in contrast to the standard reusable blades used in the previous studies. As mentioned, it is possible that small differences in blade shape or camera positioning on the blade could result in significant differences in direct or videoscopic views. Indeed, this is likely to have accounted for the slightly different results of the C-MAC® and GS blades in this study, although this was not the primary outcome of interest. Third, our study methodology differed from some of the earlier trials by attempting to compare direct and videoscopic views at the same moment in time, with ratings done by different clinicians (the laryngoscopist and two independent videoscopic scorers), each blinded to the results recorded by the others. Fourth, only the Kaplan et al.,3 Byhahn et al.,5 and Raimann et al.6 studies compared the direct and videoscopic views obtained with the same Mac-VL; the Piepho and Aziz1,2 studies documented videoscopic views obtained using the Macintosh-style C-MAC® VL and those obtained in other patients undergoing Macintosh DL. Although unlikely, it is possible that the view obtained during regular DL may not equate exactly to direct peroral sighting using a Mac-VL, rendering a comparison of the present study with those studies invalid. Regardless, our findings at least suggest the need for further objective studies on human patients to assess the utility of Mac-VL in patients in whom DL is expected or known to be difficult.

We elected to use a new VAS to assess laryngeal views. This followed from known weaknesses in using the C-L scale to assess and record Grade 1 and 2 views,18 even if partially addressed by the Cook13,14 and Yentis12 modifications. Similarly, the POGO scale9,10,11 can be variably interpreted, as the actual glottic opening represents only about three quarters of the total distance from anterior commissure to the inter-arytenoid notch posteriorly. Our VAS simply requires a horizontal line to be drawn across the maximum extent of glottic anatomy visualized. The high inter-rater reliability of the VAS between two scorers in this study, blinded to each other’s rating, suggests this is a valid way to record laryngoscopic views of C-L grade 1 and 2 situations, or POGO views > 0. If other studies confirm its validity, it could be an option for future studies of laryngoscopy.

Our study has several limitations. First, we used four clinical-grade cadavers that were embalmed to mimic real tissue. The cadavers selected for this study were donated to the Human Body Donation Program, Dalhousie University, so could not be chosen based on desired airway anatomy. By chance, the four cadavers used in our study presented a range of laryngoscopy difficulty, from very easy to somewhat difficult to obtain a full laryngeal view. Although results varied between “easy” and “difficult” specimens, we contend that the variable conditions reflected the real-life situation. Second, human cadaver tissue can deteriorate after multiple airway instrumentations, although this should have been reflected in both the directly sighted and videoscopic views. Third, the authors were the study investigators and could not be blinded to the device used or the cadaver examined. This could have resulted in bias. Nevertheless, the two scorers were blinded to the laryngeal view both assigned to and obtained by the laryngoscopist, and to each other’s ratings. Fourth, the study was undertaken using the C-MAC®S with its Macintosh #3 and #4 blades, and the GlideScope® Spectrum™ DVM S3 and S4 blades for multiple laryngoscopies. These blades are designed for single use and may not have been tested for alterations in image quality or light intensity over multiple laryngoscopy attempts. Fifth, laryngoscopy was performed by experienced personnel; therefore, whether these findings would apply to less-experienced clinicians is unknown. Finally, our study did not analyze the success of tracheal intubation. Many prior studies have correlated the view of the larynx during Macintosh DL and the ease of tracheal intubation.8,14 The blade geometry of a Mac-VL is derivative of the Macintosh blade and is designed to be used with the same technique.19 As such, the laryngeal view obtained by a Mac-VL using either direct peroral or videoscopic sighting should be strongly correlated to the ease of tracheal intubation.8,13,14,18

Even if our findings of minimal improvement of the videoscopic view translate to real life, Mac-VL is still a useful resource. First, the enlarged image on the video display may help optimize performance of laryngoscopy and is certainly a useful teaching tool for the supervision of novice laryngoscopists. Second, the location of the camera towards the distal end of the laryngoscopy blade overcomes any potential “framing” issues caused by facial hair, lips, or teeth that may obscure a direct view. Third, there is potential benefit, particularly in the event of difficulty, for assistants and other team members to observe the intubation process. This can allow for a shared mental model to help optimize maneuvers such as external laryngeal manipulation that may improve the laryngeal view or success of tracheal intubation. Fourth, a wide range of videolaryngoscopes have recording capability. This provides the additional benefit of more accurate clinical documentation, as well as advantages with regard to quality improvement, self-improvement, research, and teaching. Fifth, regardless of whether videoscopic visualization is improved over direct sighting, higher first-attempt2 and ultimate20 success rates have been reported with use of Mac-VL compared with DL. Finally, within the emergency setting, there is some evidence of a lower recognized esophageal intubation rate with the use of Mac-VL compared with DL.8

The intention of this study is not to negate the appropriateness of Mac-VL use in airway management. Nevertheless, given that laryngoscopy attempts should always be minimized,20,21,22 the results of this study suggest that when Macintosh DL fails to provide sufficient laryngeal visualization, defaulting to a Mac-VL might not offer a clinically better view. We again acknowledge that this study addresses only laryngeal exposure and does not address subsequent ease or difficulty of tracheal intubation.

In summary, using a methodology that included blinded observers and a rating scale that showed high inter-rater reliability, this cadaver study of Mac-VL found that amongst experienced laryngoscopists, the direct peroral view was clinically equivalent to the videoscopic view. These findings differ from published findings from live patients,3,5,6 suggesting that further studies in live human patients may be warranted with this class of device, using similarly objective study conditions.