Background

Predicting difficult laryngoscopic endotracheal intubation (TI) is an important concern for anesthesiologists. Anticipated difficulties offer opportunities to prepare alternative methods and use proper advanced management techniques. The Mallampati classification and other methods have been used to evaluate anatomical structures during the preoperative period for airway assessment.

Khan et al. proposed the upper lip bite test (ULBT), which evaluates the ability of a patient to cover the mucosa of the upper lip with the lower incisors [1]. This simple bedside test was shown to have good predictive value, specificity, and accuracy without the need for a light or a sitting position [1, 2].

This method was based on cephalometric measurements, which differ in the skeletal hard tissue and soft tissue profiles of Asian and Caucasian populations [3,4,5,6,7,8,9,10]. The ULBT evaluates mandibular movement, which reflects not only differences in skeletal hard tissue but also the conjoined movements of the ligaments, connective tissues, and soft tissues. In our experience and in other studies, ULBT results in Koreans differ from those reported elsewhere [11]. We supposed that these discrepancies might derive from cephalometric differences in Asians.

This study aimed to assess the differences in the ULBT in Koreans while considering ethnic differences. We sought to determine the influence of cephalometric differences on the ULBT in Koreans.

Methods

After obtaining approval from the Severance Hospital Institutional Review Board (IRB number: 4–2008-0583), the trial was performed at Yonsei University Severance Hospital.

Written informed consent was obtained from all subjects participating in the trial.

Three hundred forty-four Korean adult patients undergoing general anesthesia with orotracheal intubation were included in our registered prospective observational study (ClinicalTrials.gov identifier: NCT01908218; principal investigator: So Woon Ahn; date of registration: July 2013). This manuscript adheres to the applicable EQUATOR Network guidelines. Patients were excluded if they had facial anomalies, had a temporomandibular (TM) joint disorder, were edentulous, or required rapid sequence induction. In the pre-anesthetic care unit, we recorded each patient’s age, sex, weight, height, American Society of Anesthesiologists (ASA) classification, modified Mallampati test (MMT) class [12], ULBT rating [1], inter-incisor distance (IID; the distance between the incisors with the mouth maximally open), thyromental distance (TMD; the distance from the thyroid notch to the tip of the jaw), and sternomental distance (SMD; the distance from the sternal notch to the chin [mentum]) in the sitting, fully head-extended position. The choice of anesthesia induction technique was left to the attending anesthesiologist. After the loss of response to train-of-four or single-twitch ulnar nerve stimulation, laryngoscopy was performed by three skilled anesthesiologists (trained for at least four years, > 1000 endotracheal intubations) using a Macintosh laryngoscope with a size 3 or 4 blade. After obtaining a view of the glottis by direct laryngoscopy, the anesthesiologist assessed the Cormack–Lehane grade [13]. Grades 1 and 2 were considered easy laryngoscopies. Grades 3 and 4 were considered difficult laryngoscopies because the vocal cords could not be visualized. Grading was performed without external laryngeal pressure. When the first laryngoscopic view was Cormack–Lehane grade 3 or 4, external laryngeal pressure (the backward, upward, rightward pressure [BURP] maneuver) was applied [12]. In the case of a second failed endotracheal intubation, another intubation method, such as fiber-optic bronchoscopy or video laryngoscopy-assisted intubation, was attempted and recorded.

Based on an institutional pilot study, the areas under the receiver operating characteristic (ROC) curve (AUCs) of the MMT and ULBT were 0.61 and 0.52, respectively. We determined that 344 patients would be required to demonstrate a difference between the two predictive tools with a type 1 error (α) of 5% and a power (1-β) of 90% (two-sided) using the PASS program (NCSS, Kaysville, UT, USA).

For continuous variables, Student’s t-test was used to assess differences in means between the groups. For categorical data, a chi-square test was used to assess differences in proportions across the categories. The cut-off points for the thyromental, sternomental, and mouth opening (inter-incisor) distances were obtained from the ROC analysis as the values that maximized sensitivity and specificity [14]. Each test, individually and in various combinations, was evaluated by its calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The predictive performance of the tests was compared using AUCs. The significance of the difference between two areas was assessed using the method described by DeLong [15]. Statistical analysis was performed using SPSS, version 22.0 (SPSS, Inc., Chicago, IL, USA) and MedCalc 14.8.1 (MedCalc Software, Ostend, Belgium).
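The cut-off selection described above can be illustrated with a minimal Python sketch. It assumes that maximizing sensitivity and specificity means maximizing their sum (Youden's index), which the text does not state explicitly. The function name `best_cutoff` and the sample measurements are hypothetical and are not study data.

```python
from typing import List, Tuple


def best_cutoff(values: List[float], difficult: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity - 1.

    A measurement <= cutoff counts as a positive (predicted difficult) test,
    matching the "distance <= cut-off" convention used for IID, TMD, and SMD.
    """
    n_pos = sum(difficult)
    n_neg = len(difficult) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one difficult and one easy case")

    best = None
    for cutoff in sorted(set(values)):
        # True positives: difficult cases at or below the candidate cut-off.
        tp = sum(1 for v, d in zip(values, difficult) if d and v <= cutoff)
        # True negatives: easy cases above the candidate cut-off.
        tn = sum(1 for v, d in zip(values, difficult) if not d and v > cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        youden = sens + spec - 1
        if best is None or youden > best[0]:
            best = (youden, cutoff, sens, spec)
    return best[1], best[2], best[3]


# ASSUMPTION: hypothetical thyromental distances (cm) and outcomes, not study data.
tmd = [7.0, 7.5, 8.0, 8.3, 8.5, 9.0, 9.5, 10.0]
dl = [True, True, False, True, False, False, False, False]

cutoff, sens, spec = best_cutoff(tmd, dl)
print(f"cut-off <= {cutoff} cm: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Ties are resolved in favor of the smallest cut-off; statistical packages may apply a different rule.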

Results

Data were collected from four hundred forty-four elective surgical patients. Seventy-five patients were excluded because they were edentulous or did not undergo TI because of changes in their anesthetic plans. Data from a total of 344 patients were analyzed (Fig. 1, Table 1).

Fig. 1 Flow chart of patient participation

Table 1 Patients’ characteristics

During direct laryngoscopy, 89 patients presented with a difficult laryngoscopic view (Cormack–Lehane grade of 3 or 4) without external manipulation (Table 2). Among these patients, 86 could be intubated by applying external laryngeal pressure, and two required alternative techniques. One patient was intubated using video laryngoscopy (Glidescope®, Saturn Biomedical Systems, Burnaby, BC, Canada) and another patient was intubated with the use of fiber-optic devices. Moreover, one patient was intubated successfully after multiple laryngoscopic trials while waiting for the preparation of alternative devices.

Table 2 Relationship between pre-anesthetic assessment classifications and Cormack–Lehane grade

An MMT class > II, IID ≤ 4.5 cm, TMD ≤ 8.3 cm, and SMD ≤ 17.9 cm were defined as cut-off points for difficult intubation (Table 3).

Table 3 Receiver operating characteristic (ROC) analysis for difficult intubation

Table 4 shows the true positive, false positive, true negative, false negative, accuracy, sensitivity, specificity, PPV, and NPV; moreover, AUCs obtained from the ROC analysis are shown for the MMT and ULBT (Appendix).

Table 4 Predictive values of the upper lip bite test (ULBT) and the modified Mallampati test (MMT) for predicting a Cormack–Lehane grade of 3 or 4

The AUC, the primary endpoint of this trial, was lower for the ULBT than the MMT (the difference between the areas: 0.13, 95% confidence interval: 0.0697–0.191, p < 0.0001, Table 3, Fig. 2).

Fig. 2 ROC curves of MMT and ULBT

In our study, the accuracy of the ULBT (73.83%) was higher than that of the MMT (62.80%), and the specificity of the ULBT (98.04%) was higher than that of the MMT (61.18%). Notably, the ULBT showed significantly lower sensitivity (4.49%) than the MMT (67.42%). The prevalence of a difficult laryngoscopy (DL) was 25.87% (89 of 344). Of the 89 patients with a DL, 60 (67.4%) had an MMT class > II, whereas only nine of the 344 patients (2.6%) showed a grade III ULBT (Table 2).
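The arithmetic behind these percentages can be checked with a short Python sketch. The 2 × 2 counts below are assumptions: they are back-calculated from the reported sensitivity, specificity, and prevalence (89 DL among 344 patients), not copied from Table 4. In particular, the four ULBT true positives are the count implied by the reported 4.49% sensitivity. The class name `TwoByTwo` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating a bedside test against the laryngoscopic view."""
    tp: int  # test positive, difficult laryngoscopy
    fp: int  # test positive, easy laryngoscopy
    fn: int  # test negative, difficult laryngoscopy
    tn: int  # test negative, easy laryngoscopy

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    @property
    def sensitivity(self) -> float:
        # Share of difficult laryngoscopies that the test flagged.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of easy laryngoscopies that the test cleared.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all patients classified correctly.
        return (self.tp + self.tn) / self.total


# ASSUMPTION: counts back-calculated from the reported percentages, not from Table 4.
mmt = TwoByTwo(tp=60, fp=99, fn=29, tn=156)   # positive test: MMT class > II
ulbt = TwoByTwo(tp=4, fp=5, fn=85, tn=250)    # positive test: ULBT grade III

for name, table in (("MMT", mmt), ("ULBT", ulbt)):
    print(
        f"{name}: sensitivity {table.sensitivity:.2%}, "
        f"specificity {table.specificity:.2%}, accuracy {table.accuracy:.2%}"
    )
```

With so few positive ULBT results, nearly every easy laryngoscopy is a true negative. This is why accuracy and specificity can be high while sensitivity stays very low.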

Discussion

Our study demonstrates that the ULBT shows particularly high specificity, a low false-positive rate, and high accuracy in Koreans. These differences were due to the lower incidence of high-grade ULBT in Koreans than in other ethnicities [2, 16, 17]. According to the literature reviewed, this characteristic might be explained by the soft tissue redundancy and skeletal variance in Far East Asians.

Several methods, such as the MMT classification, IID, TMD, and SMD, have focused on one or more patient-related factors that may identify those at risk for a difficult TI before the induction of anesthesia [16, 18,19,20]. Concerning applicability, the ULBT does not require an additional light, restriction of phonation, or a sitting position. The ULBT has been used as a simple bedside test for a DL with good predictive accuracy. It could serve as a good predictor for difficult laryngoscopic intubation because the range and freedom of mandibular movement and the architecture of the teeth have pivotal roles in facilitating laryngoscopic intubation [1]. In previous studies, the ULBT showed higher accuracy (75.9–91%) than the MMT (63.7–67.7%) [1, 2, 21, 22] and good predictability with a high AUC value (0.604–0.85) [1, 2, 22]. The specificity of the ULBT (the percentage of easy laryngoscopies that were correctly predicted as easy) was very high (82.35–92.5%) [1, 2, 23, 24].

The nature of the soft tissue profile is affected by many factors other than the skeletal hard tissue profile, including ethnicity. The ULBT evaluates the range and freedom of mandibular movement and the architecture of the teeth. Mandibular movement is the conjoined movement of skeletal hard tissue, ligaments, and soft tissue [25]. Furthermore, the ULBT classification is based on the upper lip mucosa, which is soft tissue [1].

From our observational results comparing the ULBT with the widely used MMT, the AUC of the MMT in Koreans was 0.627, which indicates unreliable predictability similar to that found in recent meta-analyses [5, 26]. However, the AUC of the ULBT was 0.519, which implies predictability much lower than that in other studies of different ethnicities (0.604–0.826) [1, 2, 27]. The NPV of the MMT is higher than that of the ULBT. A higher AUC and a higher NPV mean that the MMT is more reliable than the ULBT and more likely to detect an easy laryngoscopic view. The ULBT has a lower PPV than the MMT. Therefore, many positive results from this procedure are false positives. However, the PPV and NPV are not intrinsic to a test and also depend on the prevalence, and positive results were very rare. In this trial, the prevalence of high-grade ULBT values was 2.6% (9 of 344). This result is similar to that of another study on Koreans (16 of 305) [11]. We speculated that differences in the soft tissue and bony structure in line with ethnic differences might explain the differences between the ULBT and MMT results. Our results also indicated that the ULBT has higher accuracy and specificity than the MMT. However, its sensitivity (4.49%) was much lower than that of the MMT in our trial (67%) and than ULBT values from previous trials (28.2–76.5%) [1, 2, 21, 22, 28]. This means that many patients with a DL will not be identified by the ULBT (a large number of patients will have false-negative tests).

$$ \mathrm{Sensitivity}=\frac{\text{Number of true positives}}{\text{Number of true positives}+\text{Number of false negatives}} $$
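
As a complementary sketch that is not part of the original analysis, the dependence of the PPV and NPV on prevalence noted above can be written out with Bayes' theorem. Here Se, Sp, and p denote the sensitivity, the specificity, and the prevalence of a DL; these symbols are introduced only for this illustration.

$$ \mathrm{PPV}=\frac{\mathrm{Se}\cdot p}{\mathrm{Se}\cdot p+\left(1-\mathrm{Sp}\right)\left(1-p\right)} $$

$$ \mathrm{NPV}=\frac{\mathrm{Sp}\left(1-p\right)}{\mathrm{Sp}\left(1-p\right)+\left(1-\mathrm{Se}\right)p} $$

Taking the reported MMT values at face value (Se = 0.6742, Sp = 0.6118, p = 0.2587), the first expression gives a PPV of approximately 0.38. The same test applied in a population where a DL is rarer would yield a lower PPV even if its sensitivity and specificity were unchanged.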

We deduced that one of the factors causing the low sensitivity was the low incidence of a grade III ULBT in Korean subjects. In our study, only nine patients of 344 could not bite their upper lip, and only three of them presented a difficult laryngoscopic view. In Korea, a grade III ULBT is rarely observed [11]. There can be several reasons for this, with ethnic cephalometric differences possibly being one. Several studies comparing soft tissues between different ethnic groups have been published in orthodontics and maxillofacial surgery [3, 4, 6, 7, 10, 29,30,31,32].

In a comparison of Southern Chinese and British Caucasian cephalometric standards, Chinese upper lips were longer with a more acute angulation than those of Caucasians [7]. Moreover, Korean subjects had a lower angle of nasal inclination and a higher degree of lip protrusion compared with European-American adults, and their upper and lower lips were positioned more anteriorly [3]. Chang and colleagues conducted a morphometric analysis comparing Asian and European-American subjects [32]. Far East Asian (Chinese, Japanese, Korean, and Taiwanese) men seemed to have a significantly shorter cranium and smaller anterior cranial base angles. Compared with Caucasians, Asians with clinically acceptable occlusions tended to have a shorter midface, prominent mandibles, and an anteriorly displaced temporomandibular joint (TMJ) in the posterior cranial base [32]. It was also noted that these features resulted from the relative retrusion of the nasomaxillary complex and the relatively forward position of the mandible.

Therefore, in Asians, the scarcity of a grade III ULBT is explainable as a result of an anteriorly displaced TMJ and redundant soft lip tissues. It is clear that the ULBT shows a much lower false-positive rate, lower sensitivity, and higher PPV than other predictive methods.

Limitations

One of the limitations of a test to predict a DL is the discrepancy between a DL and difficult intubation. Patients who presented with a DL (Cormack–Lehane grade 3 or 4) could easily be classified as a grade II or better with the application of external pressure to the larynx (the BURP maneuver) to move the epiglottis. These patients would have been described as having an easy TI. Therefore, it is difficult to predict a difficult TI (DTI) only by mandibular movement, and a DL does not predict a DTI.

Moreover, in this trial, the prevalence of a DL was 25.87% (89 of 344), and the DTI/DL ratio was 3.4% (3/89). These are similar to the results shown in previous studies (4.3–77.8%) [20, 33, 34]. In addition, although we used nerve stimulation to confirm muscle relaxation, a methodological limitation may exist because we did not protocolize the anesthetic induction technique. Direct laryngoscopy and endotracheal intubation may therefore not have been performed under the same conditions in all patients. We also did not collect data on the surgery that patients underwent. Such data might have allowed further comparison, but our aim was only to evaluate the predictability of the ULBT, and we had excluded patients with considerable abnormalities affecting airway prediction and evaluation.

Conclusions

In Koreans, the ULBT shows very high specificity. The ULBT presented a low false-positive rate and high accuracy. Ethnic cephalometric differences present in Far East Asians may be one reason. Therefore, especially in Asians, soft tissue redundancy and skeletal variance should be considered based on ethnic differences when evaluating parameters related to soft tissue such as the ULBT.