Take-home message

Direct laryngoscopy is still frequently performed in intensive care units, but the impact of Macintosh blade size on first-attempt success is unknown. In the present retrospective study of more than 2000 intubations, Macintosh blade No3 was associated with a higher first-attempt success rate than blade No4, without any difference in complication rates.

Introduction

Control of the upper airways and ventilation of intensive care unit (ICU) patients is a daily issue worldwide. Orotracheal intubation may be associated with complications [1]. The endotracheal tube is mainly introduced after direct laryngoscopy (DL). In case of anticipated difficult intubation or ineffective DL, recommendations urge the use of alternatives (stylets, bougies or videolaryngoscopes) [2, 3]. However, DL is still performed as the first-line technique in millions of cases worldwide each year, by physicians, residents, or nurses. First-attempt success is associated with lower rates of complications, either severe (hypoxemia, cardiovascular collapse, cardiac arrest, or death) or moderate (tooth or laryngeal injury, esophageal intubation, operator-reported aspiration, agitation, or arrhythmia) [4]. To perform DL, clinicians need a laryngoscope handle, which is also a light source, and a blade [2]. The choice of blade size is left to the clinician’s discretion (mainly based on habits and experience, since no data are available to date). According to a recent international randomized-controlled trial, Macintosh blade No4 is mainly used [5]. However, there is still controversy and uncertainty about the efficacy and safety of blade sizes, since large studies do not consider blade size in their design. Only two limited studies are available to date: a randomized manikin study (n = 200) and a study of edentulous patients (n = 35), both in favor of smaller blades [6, 7]. A third, experimental study investigated the theoretical basis of blade shapes using two angular measurements representing eyeline displacement and the forward space that the blade occupies at the level of the mandible [8].

To explore the impact of Macintosh blade size on DL success in ICU patients, we conducted the MacSize-ICU retrospective observational multicenter study. We hypothesized that the use of blade No3 would significantly increase the first-attempt success rate in comparison with No4. This study is intended to be hypothesis generating and to serve as a basis for estimating the sample size required for a future trial.

Materials and methods

Setting and study design

We retrospectively analyzed data on intubation procedures from three databases: two from published prospective randomized and observational studies [1, 5] and one from a prospective observational study registered at ClinicalTrials.gov (NCT05059067) (Electronic supplementary material). This last observational study was approved by the central French ethics committee for anesthesiology and intensive care (CERAR IRB 00010254, No 2021-016). Institutional Review Board approval was obtained for each of the other studies. Where applicable, patients were included after receiving clear information on the study and on their right to refuse to participate, and written consent was obtained if required by the study design [9]. Only studies and patients with information on the Macintosh blade size used for the first DL were included. Other data surrounding tracheal intubation were also collected. Data were collected between September 1st, 2011 and January 31st, 2012 in the first observational study [1], between October 1st, 2019 and March 17th, 2020 in the randomized trial [5], and between February 1st and August 31st, 2021 in the last observational study. This study was performed in accordance with the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) statement [10] (Electronic supplementary material).

Patients

The study was conducted in 48 French ICUs. Patients were eligible if they were older than 18 years, were admitted to the ICU for any medical or surgical condition, and required tracheal intubation (urgent or planned) using a Macintosh blade for the first DL. Patients were excluded if the first intubation attempt required a fiberscope, a videolaryngoscope, or another technique.

Procedure

All procedures were conducted according to the usual practice of the physicians in charge of patients, without any restriction other than the mandatory use of a Macintosh blade for first-attempt DL. Decisions regarding Macintosh blade and endotracheal tube sizes, choice of hypnotics and neuromuscular blocking agents, and preoxygenation technique were left to the discretion of attending clinicians according to local practices and clinical expertise. Cricoid pressure (Sellick maneuver) and head positioning were not standardized, but adherence to international recommendations was encouraged. In case of difficult intubation (defined as at least two failed intubation attempts with DL), rescue techniques described in international algorithms were encouraged (use of stylets or long bougies, backward upward rightward laryngeal pressure (BURP) or external laryngeal pressure, use of supra-glottic devices or videolaryngoscopes, or call for a second operator…) [3, 11].

After intubation, management of intubated patients was at the discretion of attending clinicians.

Outcomes

The primary outcome was the first-attempt intubation success rate according to the Macintosh blade size (No3 vs No4) used for DL. Prespecified secondary outcomes were complication rates, either moderate [12, 13] or severe [14] (see Electronic supplementary material for details). Patients intubated for cardiac arrest were not considered for severe complications. Patients who already met a complication criterion before the start of DL were not evaluated for that outcome, because the event was preexisting rather than peri-intubation. Exploratory clinical outcomes included glottic view (Cormack–Lehane score), number of DL attempts by operators (an attempt being defined as any entry and exit of any intubation device into the patient’s mouth), operators’ qualification (anesthesiologist or not, junior or senior physician, or nurse anesthesiologist), and the use of other techniques such as videolaryngoscope, fiberscope, long bougie, stylet, or change of Macintosh blade size. Predictive criteria of difficult intubation, evaluated by the MACOCHA score (Mallampati score, obstructive sleep apnea syndrome, reduced mobility of cervical spine, limited mouth opening, coma, severe hypoxia, and non-anesthesiologist as operator) [14], were collected. Conditions of intubation were also reported: use of hypnotics, opioids, and neuromuscular blockers, preoxygenation condition (non-invasive ventilation or not), use of the Sellick maneuver, and use of external laryngeal pressure to improve glottic view.

Power and sample size

A power calculation was performed a priori to ensure that the study would be sufficiently powered to detect clinically important differences, assuming a power of 0.80, a significance level of 0.05, and an absolute difference of 5% in the primary outcome (80 vs 75% first-attempt success rates with blades No3 and No4, respectively) [5, 15]. With a two-sided test, the estimated required sample size was 2184 patients.
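As a rough check on the order of magnitude, the sample size for comparing two proportions can be sketched with the usual normal-approximation formula. This is an illustration only: the formula variant (unpooled variance, equal allocation, no continuity correction) and the function name are assumptions, since the text does not state which method or software produced the figure of 2184. The result is therefore expected to be close to that figure, not identical to it.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Patients per group to detect p1 vs p2 with a two-sided z-test.

    Assumes equal group sizes, unpooled variance and no continuity
    correction (one textbook variant among several).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Assumptions stated in the text: 80% vs 75% first-attempt success.
per_group = sample_size_two_proportions(0.80, 0.75)
total = 2 * per_group
print(per_group, total)
```

Because the required size varies with the inverse square of the assumed difference, a larger expected difference reduces the requirement sharply.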

Statistical analysis

Statistical analysis was performed including all patients (intention-to-treat population). Tests were two-sided, with type I error set at α = 0.05. Categorical data were expressed as number of patients and associated percentages, and quantitative variables as mean ± standard deviation or median [interquartile range, 25–75%], according to statistical distribution (assumption of normality studied by the Shapiro–Wilk test). Comparisons of independent groups were performed using Chi-square or Fisher’s exact tests for categorical variables, and Student’s t test, or the Mann–Whitney U test when t test conditions were not met (normality, and homoscedasticity studied by the Fisher–Snedecor test), for quantitative variables. A multivariate analysis of the main outcome was performed using stepwise logistic regression models including variables identified in univariate analysis (with p < 0.15) or considered clinically relevant, to search for factors associated with first-attempt intubation success and to select the final model. Center and database effects were assessed using a mixed-effects logistic model, considering them as random effects. Results were expressed as odds ratios (OR) with 95% confidence intervals (CI) and intra-class correlation coefficients (ICC) for random effects. The number needed to treat was calculated from the attributable risk using the Wilson score.
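One way to read the number-needed-to-treat calculation is sketched below: the attributable risk (risk difference) is given a confidence interval built from the Wilson score intervals of each proportion (Newcombe's hybrid score method), and the bounds are then inverted. That this is the exact procedure used here is an assumption, the function names are hypothetical, and the counts are illustrative, not study data.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a single proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


def risk_difference_newcombe(x1, n1, x2, n2, level=0.95):
    """Risk difference p1 - p2 with Newcombe's hybrid score interval."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, level)
    l2, u2 = wilson_interval(x2, n2, level)
    diff = p1 - p2
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


def nnt_with_ci(x1, n1, x2, n2, level=0.95):
    """NNT = 1 / risk difference; CI bounds are the inverted bounds.

    The interval is returned as None when the risk-difference interval
    includes zero, because the NNT interval is then not a simple range.
    """
    diff, lower, upper = risk_difference_newcombe(x1, n1, x2, n2, level)
    nnt = 1 / diff
    if lower <= 0 <= upper:
        return nnt, None
    return nnt, (1 / upper, 1 / lower)


# Illustrative counts only (NOT study data): 400/500 vs 350/500 successes.
nnt, ci = nnt_with_ci(400, 500, 350, 500)
print(round(nnt, 1), ci)
```

Inverting the bounds swaps their order, which is why the upper bound of the risk difference gives the lower bound of the NNT.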

As the groups (Macintosh blade No3 versus No4) differed in patient characteristics, a propensity score (PS) was calculated to adjust for these differences [16]. The PS corresponds to the probability of first-attempt DL being conducted with Macintosh blade No3 or 4 according to the patient’s characteristics. Inverse probability of treatment weighting (IPTW) was carried out by assigning to each participant a weight equal to the inverse of this probability, as estimated by the PS. Considering baseline characteristics of subjects, the PS model included the following variables: patient’s height, sex, intubation indication, anticipated difficult intubation (MACOCHA score ≥ 3), operator characteristics, hypnotics and neuromuscular blocking agents used, stylet use, endotracheal tube size, and non-invasive preoxygenation. A sensitivity analysis was conducted on the PS analysis. Although there are different approaches to matching, one of the most common in the medical literature is nearest neighbor pair matching without replacement within specified calipers of the PS [17]. Therefore, the matchit command from the MatchIt package (R software) was used to perform this analysis with a caliper width of 0.2. The validity of the matching was then tested by analyzing the standardized differences |∂|, with |∂| > 0.2 considered to indicate imbalance. Finally, in line with several recommendations, we used a bootstrap-based approach, a well-known resampling method for estimating the standard error of estimated statistics and constructing confidence intervals [18]. After 1000 bootstrap simulations, the OR estimated for each randomly chosen group of patients were calculated and are presented with their 95% CI.
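The weighting and balance-checking steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: the propensity scores are taken as already estimated, the function names and toy values are hypothetical, and the standardized difference uses a weighted population variance, which is one of several conventions.

```python
from math import sqrt


def iptw_weights(treated, propensity):
    """Weight 1/PS for the exposed group and 1/(1 - PS) for the other."""
    return [1 / ps if t else 1 / (1 - ps) for t, ps in zip(treated, propensity)]


def _weighted_mean_var(values, weights):
    """Weighted mean and (population) variance of a list of values."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_difference(values, treated, weights=None):
    """Absolute standardized difference of one covariate between groups."""
    if weights is None:
        weights = [1.0] * len(values)
    groups = {True: ([], []), False: ([], [])}
    for v, t, w in zip(values, treated, weights):
        groups[bool(t)][0].append(v)
        groups[bool(t)][1].append(w)
    m1, v1 = _weighted_mean_var(*groups[True])
    m0, v0 = _weighted_mean_var(*groups[False])
    return abs(m1 - m0) / sqrt((v1 + v0) / 2)


# Hypothetical toy data: blade No3 (True) vs No4 (False), height in cm,
# and a made-up propensity score for receiving blade No3.
blade3 = [True, True, True, False, False, False]
height = [160, 165, 175, 168, 176, 182]
ps = [0.8, 0.6, 0.3, 0.5, 0.3, 0.2]

w = iptw_weights(blade3, ps)
before = standardized_difference(height, blade3)
after = standardized_difference(height, blade3, w)
# A value above 0.2 would be read as residual imbalance.
print(round(before, 2), round(after, 2), after > 0.2)
```

In practice the propensity scores would come from a fitted model (for example a logistic regression on the listed covariates), and the balance check would be repeated for every covariate in that model.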

All statistical analyses were performed with Stata statistical software (version 15, StataCorp, College Station, TX), R software (R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: https://cran.r-project.org/) or Prism (version 9, GraphPad, San Diego, CA). A p value of less than 0.05 was considered significant. Missing data were reported, considered missing completely at random (MCAR), and not imputed, as no data were missing for the primary outcome. Furthermore, a sensitivity analysis was carried out to evaluate their possible impact on results.

Results

Intubation procedures

A total of 2139 intubation procedures from the datasets were evaluated in 48 ICUs from September 2011 to September 2021 (Fig. 1). Of these, 629 (29.4%) were conducted with Macintosh blade No3 and 1510 (70.6%) with blade No4.

Fig. 1 Flowchart of included patients

Patient, provider, and practice characteristics

Baseline characteristics of patients are presented in Table 1 and Table S2 (Electronic supplementary material). Patients intubated with Macintosh blade No4 were more often men (1032 (68.9%) vs 343 (55.1%), p < 0.0001) and taller (170 [164–176] vs 169 [161–175] cm, p < 0.0001), without any difference in body mass index (BMI) (25.5 [22.3–29.4] vs 25.6 [22.0–29.6] kg m−2, p = 0.90). Patients were mainly intubated for urgent conditions (84.8 vs 85.7%, p = 0.64), with different indications (p < 0.0001, Table 1).

Table 1 Baseline characteristics of patients

Drugs, characteristics of the tracheal intubation and operators, and material used for tracheal intubation are presented in Table S2 through Table S6 in the Electronic supplementary material.

Primary outcome

First-attempt success rates differed significantly between blades No3 and No4 (79.5 vs 73.3%, respectively; relative risk 1.41, 95% CI 1.23–1.77; p = 0.0025; Table 2 and Fig. 2). The number needed to treat (NNT) with Macintosh blade No3 to prevent one first-attempt intubation failure was 14.6 (95% CI 9.0–42.5).

Table 2 Success of first-attempt direct laryngoscopy and glottic view according to Macintosh blade sizes in ICU

Fig. 2 Results of first-attempt direct laryngoscopy and intubation success rates, and glottic view according to Cormack–Lehane score

Secondary exploratory outcomes

Glottic views, as assessed by the Cormack–Lehane score, did not differ significantly between groups (p = 0.48, Table 2). Complication rates did not differ significantly between groups (36.4 vs 39.7%, p = 0.17, Fig. 3 and Table S7, Electronic supplementary material). Further information on intubation difficulties and rescue techniques is presented in Table S6 (Electronic supplementary material). First-attempt success rates did not differ between on-call and daytime periods (76.2 vs 73.9%, respectively, p = 0.47).

Fig. 3 Complication rates of intubation according to Macintosh blade size. Moderate complications include esophageal intubation, tooth injury, operator-reported aspiration, laryngeal injury, agitation, and cardiac arrhythmia. Severe complications include hypoxemia, cardiovascular collapse, cardiac arrest, and death related to intubation

In multivariate analysis, Macintosh blade No3 (OR 1.44 [1.14–1.84]; p = 0.0025), metal blade (OR 1.53 [1.16–1.99]; p = 0.0022), and the use of external laryngeal pressure during laryngoscopy (OR 2.72 [2.18–3.19]; p < 0.0001) were the three factors independently associated with first-attempt intubation success.

Adjustment analyses

IPTW analysis yielded similar results regarding the beneficial impact of Macintosh blade No3 on first-attempt intubation success in univariate (84.1 vs 72.1%, p < 0.0001) and multivariate (OR 2.07 [1.32–3.26], p < 0.0001) analyses. For the primary outcome, the ICC was 0.03 for center (p = 0.18) and < 0.01 for database (p = 0.63) in multivariate analysis.

Finally, when center was taken into account as a random variable in multivariate analysis, similar results were observed for Macintosh blade No3 (OR 1.86 [1.33–2.62]; p < 0.0001), metal blade (OR 6.09 [1.74–21.3]; p = 0.005), and the need for external laryngeal pressure during laryngoscopy (OR 2.29 [1.69–3.09]; p < 0.0001). In IPTW for the use of Macintosh blade, ICC effects were 0.28 and 0.52 for study databases and centers, respectively.
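To give a sense of scale for these ICC values, the sketch below uses the common latent-variable definition for a random-intercept logistic model, in which the residual variance is fixed at π²/3. Whether this definition was the one used to obtain the reported ICCs is an assumption (the text does not say), and the function names are hypothetical.

```python
from math import pi

# Variance of the standard logistic distribution (latent-scale residual).
RESIDUAL_VARIANCE = pi ** 2 / 3


def icc_from_variance(random_intercept_variance):
    """Latent-scale ICC for a random-intercept logistic model."""
    return random_intercept_variance / (random_intercept_variance + RESIDUAL_VARIANCE)


def variance_from_icc(icc):
    """Random-intercept variance (logit scale) implied by a given ICC."""
    return icc * RESIDUAL_VARIANCE / (1 - icc)


# ICC values quoted in the text; the implied variances are illustrative only.
for icc in (0.03, 0.28, 0.52):
    print(icc, round(variance_from_icc(icc), 2))
```

Under this definition, an ICC near zero means that almost none of the latent variability in the outcome is attributable to the grouping factor, whereas an ICC near 0.5 means that the between-group variance is comparable to the residual variance.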

A sensitivity analysis was performed to assess the possible impact of missing data on these results. The proportion of missing data was similar in both groups before IPTW adjustment (p = 0.88). Of note, in multivariate analysis, the primary outcome (success of first DL) was observed as often in records with missing data as in the others (74.4 vs 76.5%, p = 0.29). Among records with missing data, blade No4 was more frequent (77.0 vs 67.4%, p < 0.001). The Mallampati score accounted for the largest number of missing data (587 out of 2139).

Discussion

The present study is, to our knowledge, the first to investigate the impact of Macintosh blade size on first-attempt DL and intubation success in the ICU, as well as on associated complication rates. The main result is that Macintosh blade No3 seems to be associated with a higher rate of intubation success during first DL compared to Macintosh blade No4.

Intubation practices have been widely investigated worldwide because they are associated with potentially severe complications, whose rates depend on conditions (urgent or elective intubation), settings (operating room, ICU, emergency department or pre-hospital), and clinician experience [19,20,21]. Consistently absent from recent recommendations, the choice of Macintosh blade size is left to the clinicians in charge [22] and is mainly based on habits and previous experience, since no data are available to date in the literature.

In adult patients, Macintosh blades No3 or 4 are usually preferred. Blade No4 is about 2 cm longer and a few millimeters wider than blade No3, depending on the manufacturer. The greater length may make DL more delicate, with a stronger tendency to decrease the loading force (related to the lever arm; the force intended to lift the mandible and submandibular space contents), to load the epiglottis, or even to position the blade tip in the esophagus. The greater width may hinder introduction into the mouth in case of limited mouth opening.

However, morphometric analyses of the oropharynx of adult patients show that it should be accessible to laryngoscopy with a No3 Macintosh blade, although the literature is scarce [8, 23, 24]. The use of a No3 Macintosh blade could therefore have potential advantages, such as reducing intubation difficulty compared to laryngoscopy with a No4 blade, as suggested by the present study. In our French multicenter study, most clinicians used Macintosh blade No4 (71%).

Although DL is performed millions of times each year worldwide, clinicians might assume (in the absence of any literature on the subject) that Macintosh blade sizes have an equivalent impact on first-attempt intubation success and glottic view. The choice of Macintosh blade size is nevertheless the first step whatever the setting [25, 26]. Despite more than half a century of existence, the device has changed little [27, 28], and Macintosh blades remain the preferred tool to perform intubation worldwide on a daily basis, even in case of potentially difficult intubation (DI) [26].

Clinical data on benefits and complications associated with videolaryngoscopes (VL) and Macintosh blades are scarce [29]. Recent studies have investigated the effect of sophisticated VL on intubation success rates in different settings [15, 30, 31]. The comparator group is intubation carried out with a Macintosh blade for DL, without any recommendation on size. However, this choice might affect the results of these studies and might explain confusing results about VL (which are numerous and require specific training) [32, 33]. Certainly, some VL improve glottic view, but this feature does not directly translate into higher intubation success, fewer intubation attempts, shorter intubation time, or lower complication rates [34]. This might be related to the tube angulation required to reach the glottic plane and cannulate the trachea [35]. Of note, the use of Macintosh blades for DL might be of value for the learning curve of VL [29, 36]. Nonetheless, VL remain very valuable tools to teach larynx physiology to intubation trainees [37, 38]. Finally, VL are highly valuable alternatives [39,40,41] despite occasionally limited availability and uncertain benefits in terms of first-pass intubation [15, 30, 34], which makes the Macintosh technique essential [42].

Complication rates in our cohort are comparable to published ones, ranging from 35 to 40%. Of note, the trend toward higher severe complication rates in the blade No4 group is driven by hypoxemia and cardiovascular collapse, in accordance with the literature on urgent intubation and on intubation performed by clinicians without anesthesia training [26, 43]. Differences in first-attempt success rates did not translate into increased complication rates in our cohort.

Glottic view, as assessed by the Cormack–Lehane score [44], was equivalent between both groups, without direct translation into first-attempt intubation success. The percentage of glottic opening (POGO) score could be more informative by decreasing inter- and intra-clinician variability [45].

No database effect was observed in our cohort. In contrast, a center effect was identified, with an intra-class correlation coefficient of 0.03, and was taken into account as a random effect in the mixed model. Indeed, the non-randomized design of the studies with respect to Macintosh blade size allowed clinicians to select their “best or preferred blade size”, or at least the one they were most comfortable with. The present design does not rule out the impact of specific clinicians who could have accounted for a significant share of first-attempt failures (or successes) in each center.

Finally, tracheal intubation, especially in difficult and urgent settings, requires highly proficient skills that must be acquired first on simulated patients and during elective intubation, such as in the operating room (OR) [46]. A recent nationwide survey found that VL are mainly used in case of DI in the ICU [25], which could argue for broader teaching programs on VL use, without leaving behind fundamental DL with the classical Macintosh blade.

Limitations

Our study has several limitations. First, due to the retrospective observational design, inherent reporting biases exist. However, the large cohort and the multiple settings might provide interesting data on intubation practices in French ICUs. Second, the long period covered by the included studies could have embedded changes in practices among clinicians. These changes could have affected complication rates, for example through the implementation of intubation protocols [12,13,14, 47], although rates were equivalent to published ones [5, 14, 15, 26]. The main outcome (first-attempt intubation success) is very unlikely to have been affected, since the DL technique has not evolved. This supports good confidence in the representativeness of this cohort. Third, due to the observational design, clinicians might have spontaneously chosen their preferred blade size, the one they are more comfortable and proficient with, thus possibly overestimating success rates in comparison with randomly allocated Macintosh blades. Inter-individual variability among clinicians was not evaluated in our study due to the inherent lack of information on individuals. Some clinicians might be more efficient than others at intubation and may account for the observed differences (or absence of difference). Fourth, variability due to center effects may account for part of the present results. Local habits influence working and technical rules in many fields, with either positive or negative impacts on patients’ care. Fifth, the sample size estimation anticipated the inclusion of 2184 subjects. The present retrospective study included 2139 intubations (45 fewer than planned) and therefore lacks power. However, results from different IPTW models showed differences in first-attempt success rates ranging between 7 and 12% (Table S8, Electronic supplementary material). New calculations based on these differences would lead to estimated sample sizes of 1180 and 368 patients, respectively. Finally, records with missing data more often involved blade No4, especially in relation to the Mallampati score, which is often missing in uncooperative ICU patients. The only way to eliminate these biases is to conduct large randomized-controlled trials.

Conclusion

Endotracheal intubations are performed millions of times each year worldwide, mainly by DL. The present study is the first to date to explore the impact of Macintosh blade size on success rates of first DL and intubation in ICU patients. In the present cohort, Macintosh blade No3 was associated with a higher first-pass intubation success rate, but similar glottic views and complication rates, compared to blade No4. The present study will serve as a starting point, as a hypothesis-generating study, to estimate the number of patients needed to design a future large-scale randomized-controlled trial, and could help the interpretation of previously published trials on intubation with Macintosh blades.