In contemporary obstetric anesthesia practice, manual palpation of bony landmarks remains a common method of identifying a safe vertebral level for neuraxial placement. Nevertheless, it is also widely acknowledged that identifying the vertebral level using manual palpation is not a reliable method of locating a safe interspace for epidural and spinal placements, especially in high body mass index (BMI) patients with unrecognizable landmarks.1

In recent years, there has been substantial research into the use of preprocedural lumbar ultrasound (LUS) for neuraxial labour anesthesia and analgesia. Compared with manual palpation, LUS has been shown to significantly improve the first-pass success rate and accuracy in identifying the desired lumbar vertebral levels.2,3 Despite this apparent advantage, many practitioners still do not routinely employ LUS for neuraxial placements, in part because of insufficient technical proficiency to confidently acquire and interpret LUS images4 and a perceived lack of benefit for experienced anesthesia providers.5,6 Avoiding routine use because of these perceived barriers may contribute to skill decay over time, thereby reinforcing reliance on the less reliable manual palpation method.7

If LUS image acquisition and interpretation could be made simple, quick, and accurate, the rate of clinical adoption may improve. Recently, machine learning techniques have been employed to identify discriminative features of the laminae. Using deep neural networks with transfer learning, we developed a proprietary automated ultrasound software called the Spine Level Identification (SLIDE) system to automatically identify vertebral landmarks in real time while the operator slides the ultrasound transducer over the lumbar spine in the transverse plane along the midline, starting at the sacrum and moving cephalad.8 The SLIDE system can automatically detect lumbar vertebral spinous processes and intervertebral spaces by directly analyzing ultrasound images without external tracking hardware. By identifying planar transitions in the acquired LUS images, using the sacrum as a frame of reference, SLIDE predicts the vertebral level at which the transducer is positioned (Fig. 1). In this study, we sought to determine how well SLIDE agrees with traditional LUS in identifying the primary target, the L3–4 intervertebral space, in healthy term pregnant women. The primary outcome was agreement in identifying the L3–4 interspace with SLIDE compared with manual palpation, expressed as percentage agreement and Gwet’s agreement coefficient (AC1), using the traditional lumbar ultrasound method as the reference standard. The secondary outcomes included identification agreement at the other lumbar levels and the duration required to identify the L3–4 target interspace.

Fig. 1

Scan path for automatic vertebral level detection via SLIDE (blue arrow) and operator interface with corresponding transverse plane ultrasound images of the lumbar vertebral anatomy.

Methods

Study population

This study was registered before patient enrolment at www.ClinicalTrials.gov (NCT02982317; principal investigator, Anthony Chau; first registered, 5 December 2016). After receiving approval from The University of British Columbia Clinical Research Ethics Board (Vancouver, BC, Canada; H16-02858) and written informed patient consent, healthy singleton term (≥ 36 weeks’ gestation) parturients scheduled for elective Cesarean delivery under spinal anesthesia at BC Women’s Hospital from January 2017 to June 2017 were recruited. We excluded patients who were under the age of 19 yr, could not give consent because of a language barrier, had a BMI ≥ 40 kg·m⁻², were in active labour, had any documented spinal abnormalities (e.g., scoliosis, previous lower back surgery), or had a known skin allergy to surgical tape or study marker pens. Withdrawal criteria included inadequate time for protocol completion prior to scheduled surgery or equipment malfunction during data acquisition.

Study protocol

After informed consent was obtained, all participants were instructed to sit with their shoulders forward and back flexed. All measurements obtained using the manual palpation, SLIDE, and LUS methods were blinded using the transparency sheet technique described previously.9 In brief, a transparency sheet was attached to the participant’s back with medical tape (3M™ Micropore™ Surgical Tape; 3M, St. Paul, MN, USA). Between methods, markings of the estimated intervertebral levels and spinous processes were transferred from the skin to the transparency sheet and then erased from the participant’s back (Fig. 2).

Fig. 2

Transparent sheet attached to the patient’s back (left). Markings are made on the corners and edges of the sheet, such that the sheet can be repeatedly placed at the same location. A fully populated transparency (middle), with the interpretation shown (right). In this example, both SLIDE and manual palpation successfully identify all five intervertebral gap locations (from L5–S1 up to L1–L2).27

Using the manual palpation method, a study anesthesiologist (J. B.) first identified and palpated the iliac crests bilaterally, and then established the level at which a transverse line connecting the superior aspects of the iliac crests intersected the lumbar spine. If the line intersected at an interspace, this was marked as the L3–4 interspace. If the line intersected at a spinous process, the immediate interspace below the spinous process was marked as the L3–4 interspace. This is consistent with a study in term pregnant patients that found the anatomical position of the intercristal line by palpation was most frequently at the L3–4 interspace.10 Once this space was located, all other lumbar interspaces (from L5–S1 to L1–2) were determined from this level by manual palpation.

Next, a nonmedical study investigator (J. H.) identified the L3–4 interspace and other lumbar interspaces using the SLIDE method. This software was developed to work with a SonixTouch® ultrasound machine and the associated C5–2/60 curvilinear transducer (Analogic Corp., Richmond, BC, Canada). The software requires scanning to start in the midline at the sacrum. Therefore, the curvilinear transducer was first placed perpendicular to the patient’s skin in the transverse plane above the gluteal folds and, with the help of the sonogram displayed on the interface, centred visually to allow the software to detect symmetry and to identify the sacrum. Once the sacrum was identified, the transducer was moved slowly cephalad. As the transducer moved, SLIDE continuously classified the image on the sonogram as the sacrum, spinous process (bone), or intervertebral space (gap) with an estimated probability of that classification ranging from 0 to 1 (0 to 100%) generated by an internal algorithm.8 As soon as an intervertebral space was identified by SLIDE, the interface would indicate a change and the study investigator would pause to mark the patient’s back with the corresponding classification and level that were displayed (see Fig. 3 and Electronic Supplementary Material, eVideo). All lumbar interspaces (from L5–S1 to L1–L2) were marked on the patient’s skin and subsequently transferred to the transparency sheet.
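The level-counting idea described above (start at the sacrum, then count bone/gap transitions while moving cephalad) can be illustrated with a small state machine. This is only a sketch for the reader and not the proprietary SLIDE algorithm: the function name `track_levels`, the `LANDMARKS` list, and the `min_prob` confidence threshold are hypothetical, and the per-frame labels are assumed to come from some upstream image classifier.

```python
from typing import Iterable, List, Tuple

# Landmarks in the order they are met when scanning cephalad from the sacrum.
LANDMARKS = [
    "L5-S1 gap", "L5 bone", "L4-5 gap", "L4 bone", "L3-4 gap",
    "L3 bone", "L2-3 gap", "L2 bone", "L1-2 gap", "L1 bone",
]


def track_levels(frames: Iterable[Tuple[str, float]],
                 min_prob: float = 0.9) -> List[str]:
    """Turn per-frame labels into an ordered list of landmarks reached.

    frames: (label, probability) pairs, label in {"sacrum", "gap", "bone"}.
    min_prob: hypothetical threshold below which a frame is ignored.
    """
    reached: List[str] = []
    seen_sacrum = False
    last = None
    for label, prob in frames:
        if prob < min_prob:
            continue  # ignore low-confidence frames (illustrative rule only)
        if label == "sacrum":
            seen_sacrum = True  # the sacrum is the frame of reference
            last = "sacrum"
            continue
        if not seen_sacrum or label == last:
            continue  # nothing counts before the sacrum; no transition yet
        # Gaps and bones must alternate, starting with the L5-S1 gap.
        expected = "gap" if len(reached) % 2 == 0 else "bone"
        if label == expected and len(reached) < len(LANDMARKS):
            reached.append(LANDMARKS[len(reached)])
            last = label
    return reached


if __name__ == "__main__":
    demo = [("sacrum", 0.99), ("gap", 0.97), ("gap", 0.98),
            ("bone", 0.50), ("bone", 0.96), ("gap", 0.98)]
    print(track_levels(demo))  # ['L5-S1 gap', 'L5 bone', 'L4-5 gap']
```

A scheme like this makes the dependence on the sacral starting point explicit: every later label is derived by counting from it, so a missed or spurious transition shifts all subsequent levels.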

Fig. 3

Software and operator interface, integrated into 3D Slicer Software,28 illustrating the interaction between SLIDE and the provider performing the scan. On the bottom left of the interface, SLIDE shows the real-time prediction of the transducer's location and type (e.g., L4–5 interspace/gap vs L4 spinous process/bone). The image is continuously classified as the sacrum, spinous process (bone), or intervertebral space (gap) with an estimated probability of that classification ranging from 0 to 1 (0 to 100%). This probability is shown in the boxes in the left lower corner. From left to right: the probability of spinous process (bone), probability of sacrum, probability of interspace (gap). In this example, the L4–5 gap is identified with approximately 98% probability. On the right-hand side, SLIDE displays the sonogram in real time.

Finally, one provider experienced with freehand LUS identified the L3–4 interspace with the ultrasound transducer in the longitudinal paramedian plane as previously described.1 This intervertebral space was then used as a reference point to identify all the lumbar spinous processes with the ultrasound transducer in the transverse plane at the midline. A second provider observed the first provider during scanning and agreed on the reference marks made on the lumbar spinous processes. The second provider did not perform an independent scan. Recognizing that a spinous process may span an area of a few centimetres, the point that both providers felt best represented the centre of this area was used as the spinous process location, and all spinous process locations were marked on the participants’ backs. The protocol required that at least two providers agree on all lumbar spinous process locations (from L5 to L1) obtained by the LUS method. If there was disagreement, a third provider would be recruited, but this was not required during the study.

A transparency sheet with three sets of markings was collected from each patient. The locations of the intervertebral spaces estimated by manual palpation and SLIDE were recorded as green and blue dots, respectively. The locations of the spinous processes identified by LUS were recorded as red dots and were used to define the boundaries of the intervertebral spaces. A manual palpation or SLIDE marking was considered to agree with the reference LUS method if it was located within the LUS boundaries of the corresponding intervertebral space.
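The agreement rule above amounts to asking which pair of reference spinous processes a marking falls between. A minimal sketch follows, assuming that positions are measured in centimetres along the spine and increase cephalad; the function names, the example coordinates, and the treatment of any mark caudal to L5 as L5–S1 (the text gives no sacral boundary) are all assumptions made for illustration.

```python
from bisect import bisect_left
from typing import Optional, Sequence

# Interspaces from caudal to cephalad.
INTERSPACES = ["L5-S1", "L4-5", "L3-4", "L2-3", "L1-2"]


def interspace_at(mark_cm: float, spinous_cm: Sequence[float]) -> Optional[str]:
    """Return the interspace that contains a mark, or None if above L1.

    spinous_cm: LUS reference positions of the L5, L4, L3, L2 and L1
    spinous processes, in that order (strictly increasing cephalad).
    Assumption: any mark caudal to L5 is counted as L5-S1.
    """
    if len(spinous_cm) != 5 or list(spinous_cm) != sorted(spinous_cm):
        raise ValueError("expected five increasing positions (L5..L1)")
    idx = bisect_left(spinous_cm, mark_cm)  # number of processes below the mark
    return INTERSPACES[idx] if idx < len(INTERSPACES) else None


def agrees(intended: str, mark_cm: float, spinous_cm: Sequence[float]) -> bool:
    """True if a mark labelled `intended` lies inside that interspace."""
    return interspace_at(mark_cm, spinous_cm) == intended


if __name__ == "__main__":
    reference = [3.0, 6.5, 10.0, 13.5, 17.0]  # hypothetical L5..L1 positions
    print(agrees("L3-4", 8.2, reference))     # True: between L4 and L3
    print(interspace_at(11.0, reference))     # L2-3: one level too high
```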

Statistical analysis

Agreement between method types was compared using the raw percentage agreement and Gwet’s AC1 for chance-corrected agreement.11 Mixed-effects logistic regression was used to compare the raw agreement between the manual palpation and SLIDE methods. Mixed-effects linear regression was used to compare the time needed to identify the target L3–4 interspace. Mixed-effects regression was also used to determine the relationship between agreement and the demographic variables (maternal age, BMI, gestational age, and parity). Any of these variables that was significant (P < 0.1) was then added to a multivariable adjusted model with method type. All analyses were completed in R (R Foundation for Statistical Computing, Vienna, Austria).12
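For readers unfamiliar with Gwet’s AC1, the two-rater form of the coefficient can be sketched as follows. The study analyses were run in R, so this Python function is purely illustrative; how the ratings were coded for the published estimates is not stated here, and the example data are invented.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def gwet_ac1(ratings_a: Sequence[Hashable],
             ratings_b: Sequence[Hashable],
             categories: Optional[Sequence[Hashable]] = None) -> float:
    """Gwet's AC1 for two raters and nominal categories.

    AC1 = (p_obs - p_chance) / (1 - p_chance), where
    p_chance = sum_k pi_k * (1 - pi_k) / (q - 1) and pi_k is the mean
    proportion of ratings that fall in category k across both raters.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    cats = list(categories) if categories is not None else sorted(
        set(ratings_a) | set(ratings_b))
    q = len(cats)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")
    n = len(ratings_a)
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(
        pi * (1 - pi)
        for pi in ((count_a[c] + count_b[c]) / (2 * n) for c in cats)
    ) / (q - 1)
    return (p_obs - p_chance) / (1 - p_chance)


if __name__ == "__main__":
    # Invented example: a test method versus a reference method, four patients.
    method = ["L3-4", "L3-4", "L2-3", "L2-3"]
    reference = ["L3-4", "L3-4", "L2-3", "L3-4"]
    print(round(gwet_ac1(method, reference), 3))  # 0.529
```

Unlike Cohen’s kappa, the chance-agreement term here shrinks as ratings concentrate in one category, which is why AC1 is often preferred when one level dominates.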

The sample size was estimated based on a prior study by Hayes et al.,13 showing that traditional manual palpation had a 63% accuracy in determining the correct lumbar vertebral level using fluoroscopy as a control. We considered a priori that a 20% improvement in agreement with SLIDE would be clinically significant. Using a type I error of 0.05 and a power of 80%, 76 participants were required to detect a 20% difference in agreement between methods. To account for 10% attrition, a total of 84 parturients were recruited.
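The arithmetic behind a sample size of this order can be sketched with a textbook two-proportion formula. Several assumptions are made that the text does not confirm: the 20% improvement is read as an absolute increase from 63% to 83%, the test is two-sided, the formula is the standard one for two independent groups (it ignores the paired design), and attrition is handled by inflating the number by 10%. Under these assumptions the first result comes out within one or two participants of the 76 reported; the authors’ exact calculation is not described.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_two_proportions(p1: float, p2: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n to detect p1 vs p2 (two-sided, independent groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


def inflate_for_attrition(n: int, rate: float) -> int:
    """Inflate n by a fractional attrition rate (assumed rule: n * (1 + rate))."""
    return ceil(n * (1 + rate))


if __name__ == "__main__":
    print(n_two_proportions(0.63, 0.83))    # same order as the reported 76
    print(inflate_for_attrition(76, 0.10))  # 84
```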

Results

A total of 84 parturients were approached and invited to participate; 77 consented and seven declined. Of the 77 individuals who consented and participated, one was withdrawn because of a protocol violation. Baseline demographics are summarized in Table 1.

Table 1 Demographic data of study patients

The raw agreement for manual palpation was 70% (Gwet’s AC1 = 0.59; 95% CI, 0.41 to 0.77) and the raw agreement for SLIDE was 84% (Gwet’s AC1 = 0.82; 95% CI, 0.70 to 0.93). In general, where manual palpation disagreed with LUS, it identified L2–3 more often than L4–5. In contrast, when SLIDE disagreed with LUS, its errors were equally distributed above and below L3–4 (Fig. 4A), except for a single outlier in which SLIDE mistook an L1–2 space for an L3–4 space. Additionally, SLIDE identified all lumbar interspaces correctly more frequently than manual palpation did. Compared with SLIDE, manual palpation had a higher tendency to disagree with LUS and to identify intervertebral levels one higher than the reference location (Fig. 4B).

Fig. 4

(A) Primary outcome: identified level by SLIDE or palpation relative to the freehand ultrasound identified L3–L4 level. The size of the circles is proportional to the number of cases in each category as indicated by the numbers in the circles. Note the single outlier in the SLIDE method, which mistook an L1–2 space for an L3–4 space. (B) Identified level by SLIDE or manual palpation relative to the freehand ultrasound identified level for all intervertebral spaces. The size of the circles is proportional to the number of cases in each category. Using freehand ultrasound as a reference, all lumbar interspaces were identified correctly with SLIDE more frequently than with manual palpation. Compared with SLIDE, manual palpation has a higher tendency to incorrectly identify intervertebral levels one higher than the true location.

In the mixed-effects logistic regressions, agreement with LUS in locating L3–4 was not significantly related to maternal age, gestational age, or parity. Nevertheless, agreement with LUS was significantly related to BMI (P = 0.001) and method type (P = 0.02). These variables were added to a multivariable regression model and, after adjusting for BMI, SLIDE had significantly higher agreement with LUS in locating L3–4 (adjusted odds ratio [aOR], 2.99; 95% CI, 1.21 to 8.7; P = 0.02) and all other lumbar interspaces (aOR, 3.28; 95% CI, 1.55 to 7.9; P = 0.001). In other words, the odds of locating L3–L4, and all other lumbar interspaces, in agreement with LUS are approximately three times higher with SLIDE than with manual palpation after controlling for BMI (Table 2).
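To make the odds-ratio scale concrete, the crude (unadjusted) odds ratio implied by the rounded raw agreement percentages of 84% and 70% can be computed directly. This sketch ignores the paired design and BMI, so it is not expected to reproduce the adjusted estimates from the mixed-effects model; the function name is hypothetical.

```python
def crude_odds_ratio(p_new: float, p_ref: float) -> float:
    """Odds ratio comparing two agreement proportions (no adjustment)."""
    odds_new = p_new / (1 - p_new)
    odds_ref = p_ref / (1 - p_ref)
    return odds_new / odds_ref


if __name__ == "__main__":
    # Raw agreement with LUS at L3-4: SLIDE 84%, manual palpation 70%.
    print(round(crude_odds_ratio(0.84, 0.70), 2))  # 2.25
```

The crude value differs from the reported aOR because the published model accounts for repeated measurements within participants and adjusts for BMI.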

Table 2 Summary of accuracy and regression analyses for all levels with raw and adjusted odds ratios controlled for BMI

The mean (standard deviation) duration required to identify the L3–4 interspace was 0.4 (0.1) min for manual palpation, 1.5 (0.8) min for SLIDE, and 1.9 (0.7) min for LUS. Compared with manual palpation, the duration was significantly longer for LUS (mean difference, 1.5 min; 95% CI, 1.3 to 1.6; P < 0.001) and SLIDE (mean difference, 1.2 min; 95% CI, 1.0 to 1.4; P < 0.001), with no significant difference between LUS and SLIDE (mean difference, 0.2 min; 95% CI, -0.03 to 0.4; P = 0.10).

Discussion

In this prospective observational study, the key finding is that the novel SLIDE automated software system can identify the target L3–4 lumbar vertebral interspace in pregnant women with greater agreement with LUS than standard manual palpation by an experienced anesthesiologist. Furthermore, SLIDE had greater agreement with LUS than manual palpation did at every lumbar intervertebral level, with significantly greater odds of locating L3–4 in agreement with LUS after controlling for BMI. Although manual palpation located the L3–4 interspace significantly faster than SLIDE or LUS, the mean differences and 95% CIs (all under two minutes) indicated no clinically important difference between the three techniques in the time needed to identify the target interspace.

The concept of automated spinal ultrasound analysis technology has been explored in both the obstetric and nonobstetric populations.14,15 Other tools have been developed for similar purposes, but they often rely on bulky or expensive external tracking hardware.16,17,18 The advantage of the SLIDE system is that it is simple and quick to use, as it automatically classifies transverse images of the spine as sacrum, intervertebral gap, or vertebral bone via a convolutional neural network, a machine learning model suitable for image classification tasks.19 This contrasts with similar systems that have been proposed, which used different computer vision techniques.14,20,21

The reported agreement between LUS and the manual palpation method varies widely in published studies, ranging from 29%22 to 70%.23 Our finding is consistent with the upper end of this range, likely because a single anesthesiologist performed manual palpation on all patients, thus minimizing the between-provider variation that might have contributed to the lower agreement in other studies. It is important to note that at the L2–3 and L4–5 interspaces, where neuraxial placements are also commonly performed, the SLIDE system had an agreement with LUS of 82% and 88%, respectively, approximately 13% to 21% greater agreement than the manual palpation method. The odds of the SLIDE system identifying the L2–3 and L4–5 levels in agreement with LUS were almost three to four times higher than with manual palpation. Also, the odds of the SLIDE system identifying the L1–2 level in agreement with LUS were almost three times higher than with manual palpation. This is an important finding because the clinical use of SLIDE in determining a single level at L3–4 would be limited unless it could also identify all other spaces with high agreement with LUS, particularly those that might be too high for a subarachnoid injection.

There are a few limitations to our study. First, it has been shown that there can be errors associated with the freehand LUS approach in accurately determining the true intervertebral levels due to lumbosacral transitional vertebrae.24 In a retrospective study of 3,855 patients with abdominal computed tomography scans with ages ranging from 18 to 100 yr (median age, 65.3 yr), lumbosacral transitional vertebrae were found in 29% of the scans.24 This means that the reference value we used may be off by one level in either direction, possibly in up to one-third of the cases. The prevalence of lumbosacral transitional vertebrae in pregnancy is unknown, and formal radiological evaluation would represent the gold standard for accurately determining the spinal levels;25 however, we used LUS as the reference modality as this was the safest and most commonly used method of spinal level assessment by anesthesiologists in an obstetric setting. Second, while we specified that at least two individuals experienced with LUS image interpretation had to agree on this reference point, the second provider observed the scan and did not independently conduct another scan. This could introduce bias in the measurements. Third, the agreement of SLIDE with LUS depends on the full image of the vertebra being visualized; the agreement will be affected if the operator deviates off midline when sliding the transducer cephalad or if abnormal anatomy is encountered that cuts part of the vertebra off from the image. Automatic centreline detection may be one solution to this issue and may help guide the nonmedical operator while using the SLIDE system. Fourth, the generalizability of our study findings is restricted to the conditions and population of this study. Evaluating SLIDE primarily as a proof-of-concept, we chose to have the manual palpation performed by a single operator to minimize interoperator variability. While this improved internal validity, it decreased external validity, so further study involving a group of obstetric anesthesiologists in routine clinical practice would be needed. Furthermore, SLIDE requires the patient to sit relatively still to locate the correct interspace. Any significant movement, which may occur if a labouring patient repositions themselves during contractions, for example, could disrupt SLIDE and force a rescan, resulting in additional time required.

Indeed, our results indicate there is room to further improve the SLIDE system. The agreement of only 84% at L3–4 means that SLIDE did not agree with the LUS reference point 16% of the time. In addition, it is important to point out that there was a single outlier where the SLIDE method mistook an L1–2 for an L3–4 interspace. Although this occurred in only one of 76 participants (1.3%), a miss of two intervertebral levels too high is clinically significant, so further refinement of SLIDE’s algorithms would be required.

When performing the scan with SLIDE, the scan sometimes needed to be repeated to validate the results of the first scan, without updating level markings on the patient. A modification of the scanning protocol could be introduced that requires the operator to perform a second scan that must match the results of the first scan, to increase the confidence of automatic findings. Moreover, the convolutional neural network used in this study was initially trained on labelled data from 20 nonpregnant, healthy individuals. The statistical representation of the spinal anatomy in this nonpregnant population was used to automatically classify transverse images of pregnant individuals in this study, so further training on pregnant individuals may improve its agreement with LUS. Future studies could expand into the thoracic spine to determine its use in placing thoracic epidural analgesia and in patients with more challenging back anatomy. Despite the potential opportunities of SLIDE, the clinical utility of this software will be limited to obstetric centres with access to an ultrasound machine with a curvilinear probe on the birthing unit when offering neuraxial placements. Nevertheless, this limitation may be mitigated by the continuing rise in the availability of portable ultrasound devices in various medical settings.26

In conclusion, we found that in healthy term pregnant women, the SLIDE automated software system can identify the target L3–4 lumbar vertebral interspace with greater agreement with LUS than standard manual palpation by an experienced clinical provider. Future studies should examine the agreement between SLIDE and formal radiological images to determine its value for clinical integration.