Abstract
Aim
Deep learning (DL) algorithms can be used for automated analysis of medical imaging. The aim of this study was to assess the accuracy of an innovative, fully automated DL algorithm for analysis of sagittal balance in adult spinal deformity (ASD).
Material and methods
Sagittal balance (sacral slope, pelvic tilt, pelvic incidence, lumbar lordosis and sagittal vertical axis) was evaluated on preoperative and postoperative radiographs of 141 patients with ASD. The landmark-based DL measurements were compared with ground truth values from validated manual measurements.
Results
The DL algorithm showed excellent consistency with the ground truth measurements. The intra-class correlation coefficient between the DL and ground truth measurements was 0.71–0.99 for preoperative and 0.72–0.96 for postoperative measurements. The DL detection rate was 91.5% and 84.8% for preoperative and postoperative images, respectively.
Conclusion
This is the first study to evaluate a fully automated DL algorithm for analysis of sagittal balance that achieved high accuracy for all evaluated parameters. The excellent accuracy in the challenging pathology of ASD with long construct instrumentation demonstrates its suitability for implementation in clinical routine.
Introduction
Adult spinal deformity (ASD) is known to severely reduce the health-related quality of life and shows an increasing prevalence in patients > 65 years (30–68%) [1, 2]. Long construct instrumentation is the surgical treatment for high degree ASD [3, 4].
The radiological evaluation of sagittal balance is fundamental for characterization, classification, and consecutive treatment planning of ASD [5,6,7]. Among the most important radiographic parameters are sacral slope (SS), pelvic tilt (PT), pelvic incidence (PI), lumbar lordosis (LL) and sagittal vertical axis (SVA) [8]. These parameters can be identified and measured on total spine radiographs including the pelvis. A precise and reproducible measurement of these radiographic parameters is therefore essential. For the most part, this evaluation is performed manually with software assistance, which is time consuming and examiner dependent [9,10,11].
Artificial intelligence (AI) technologies are employed in various fields of medicine. Machine learning with deep learning (DL) algorithms is currently being developed for precise image analysis. Only a few publications have investigated AI-based, automated analysis of the clinically relevant sagittal balance parameters by a single algorithm [12, 13]. Until now, none of these algorithms has shown sufficiently high accuracy in the automated measurement of basic sagittal balance parameters.
So far, the limiting factors in establishing a fully automated DL algorithm have been reduced image quality and stitching artefacts. In addition, high-grade spinal deformity and postoperative long construct instrumentation with implant artefacts have led to high inaccuracy of the analysis and have prevented implementation in routine clinical use [12,13,14].
We were recently able to show high accuracy of spinopelvic parameters as measured by AI in different lumbar pathologies with short instrumentations and without detection of SVA [15].
The aim of this study was to assess the accuracy of a new, fully automated DL algorithm for analysis of the essential parameters of sagittal balance in a large and challenging cohort of patients with ASD, both before and after correction with long construct instrumentation.
Material and methods
The evaluation of the DL performance was conducted retrospectively on a cohort of 141 patients with ASD. Approval by the local Ethics Committee was obtained prior to the initiation of this study (EA1/342/21). Patients with ASD who underwent corrective surgery with long construct instrumentation of more than three segments were included in the study. Exclusion criteria were prior spinal surgery with instrumentation or kyphoplasty.
Radiographic data
Of the 141 identified patients, 118 had preoperative and 125 had postoperative lateral total spine radiographs obtained by three different X-ray machines (Kodak Elite CR and Kodak DRX-Evolution X-ray scanners; Carestream Health, Rochester, NY, USA and EOS imaging; ATEC, Paris, France) at our institution. All postoperative radiographs included long construct spinal instrumentation (pedicle screws, rods, interbody cages). Screws were cement augmented in 21 patients.
Ground truth manual measurements
For comparison with the DL measurements, all preoperative and postoperative radiographs were manually measured independently by two of the authors (F.A. and J.L.) using the SurgiMap Spine software (Nemaris Inc., New York, NY, USA) as previously reported [9, 10, 15, 16]. SS, PT, PI, LL (as measured by L1-S1 lordosis) and SVA were measured and recorded. For intraobserver reliability, the measurements were repeated.
Deep learning-based measurements
The DL-based algorithm for automatic computation of sagittal balance parameters included three main steps: (1) automatic adjustment of image brightness and contrast and identification of stitching artefacts for segmentation of all relevant anatomical structures (cervical, thoracic, and lumbar vertebral bodies, sacral endplate, femoral heads and instrumentation), (2) landmark detection on the sacrum and L1, and (3) line fitting and computation of all parameters.
The segmentation model was trained using a Mask R-CNN architecture on 946 training images obtained from 22 different clinical sites in their clinical routine [17]. The training images were independent of the 118 preoperative and 125 postoperative measured radiographs. As input to the segmentation model, the DICOM images were preprocessed to enhance brightness and contrast. Furthermore, histogram equalization was applied to highlight the bony structures in the images. The segmentation model was trained on the masks around the visible anatomical structures and their corresponding categories. The training labels were generated by medical staff with background knowledge of human anatomy. The model was trained for 100 epochs on an NVIDIA GeForce 1080 GPU with a 90–10 training–validation split.
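The histogram equalization step mentioned above can be illustrated with a minimal, dependency-free sketch. The paper does not state which variant was used, so the global (non-adaptive) form, the 8-bit grey range and the function name `equalize_histogram` are assumptions made purely for illustration.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization of a 2D list of integer grey values.

    Assumption: values lie in [0, levels). This is an illustrative sketch,
    not the preprocessing code used in the study.
    """
    pixels = [value for row in image for value in row]

    # Count how often each grey level occurs.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to stretch, return a copy.
        return [row[:] for row in image]

    # Lookup table that spreads the occupied grey levels over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]
```

Applied to a low-contrast crop, the darkest occupied grey level is mapped to 0 and the brightest to 255, which is the contrast stretch that makes bony edges easier to separate from soft tissue.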
The development of the landmark detection algorithm relied on the location of the structures detected in the first step of segmentation, allowing the generation of crops of the sacrum and the vertebral bodies. Two separate models were trained to place (1) five landmarks on the sacral endplate and (2) six landmarks on L1, with three landmarks on each of the upper and lower endplates. The CNN was based on the UNet architecture and was fed with 256 × 256 square crops, with the landmarks encoded as heatmaps [18]. The output heatmaps from the model were converted to coordinates as the final prediction. Euclidean distance error and the AdamW optimizer were used for training with a learning rate of 0.001 for 60 epochs [19].
The final step compiled all the necessary predictions from the segmentation and landmark placement models to compute the relevant parameters. The vertebral bodies were labelled from sacral/caudal to cranial/cervical, counting five lumbar, twelve thoracic and seven cervical vertebrae. The spinopelvic parameters were computed using line regression on the detected landmarks on the sacrum and L1 and the midpoint of the detected femoral heads. The SVA was computed based on the midpoint of C7 and the most posterior landmark of the sacral endplate (Fig. 1).
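As a rough illustration of how a landmark heatmap can be turned into a coordinate and scored with a Euclidean distance error, consider the sketch below. The paper does not describe the conversion rule, so taking the location of the heatmap maximum is an assumption, as are the Gaussian target shape and all function names.

```python
import math


def gaussian_heatmap(size, cx, cy, sigma=2.0):
    """Square heatmap with a Gaussian peak at (cx, cy); hypothetical target encoding."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def heatmap_to_point(heatmap):
    """Return the (x, y) pixel of the heatmap maximum (assumed decoding rule)."""
    best_value, best_xy = -math.inf, (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def mean_euclidean_error(predicted, target):
    """Mean Euclidean distance between paired predicted and target landmarks."""
    distances = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(predicted, target)
    ]
    return sum(distances) / len(distances)
```

A maximum-based decoder is limited to pixel resolution; sub-pixel variants (for example a weighted centroid around the peak) are common alternatives, but the source does not say which was used.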
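The geometric step can be sketched as follows, using the standard radiographic definitions of SS, PT, PI, L1-S1 lordosis and SVA. This is not the authors' implementation: the coordinate frame (x pointing anterior, y pointing cranial, same unit for all points), the total-least-squares line fit, the use of the landmark centroid as the endplate midpoint and all names are assumptions chosen for illustration.

```python
import math


def fit_line_angle(points):
    """Orientation in degrees, within (-90, 90], of the total-least-squares line."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))


def sagittal_parameters(sacral_pts, l1_upper_pts, hip_axis, c7_centre):
    """Compute SS, PT, PI, LL (degrees) and SVA (input units).

    Assumed frame: x anterior, y cranial. hip_axis is the midpoint of the
    two femoral head centres; c7_centre is the midpoint of the C7 body.
    """
    theta_s = fit_line_angle(sacral_pts)      # sacral endplate line
    theta_l1 = fit_line_angle(l1_upper_pts)   # L1 upper endplate line

    # Endplate midpoint, approximated by the landmark centroid (assumption).
    n = len(sacral_pts)
    mid = (sum(p[0] for p in sacral_pts) / n, sum(p[1] for p in sacral_pts) / n)

    # SS: angle between the sacral endplate and the horizontal.
    ss = abs(theta_s)

    # PT: angle between the vertical and the line hip axis -> endplate midpoint.
    pt = math.degrees(math.atan2(hip_axis[0] - mid[0], mid[1] - hip_axis[1]))

    # PI: angle between the caudally directed endplate normal and the line
    # endplate midpoint -> hip axis.
    t = math.radians(theta_s)
    nx, ny = -math.sin(t), math.cos(t)
    if ny > 0:
        nx, ny = -nx, -ny
    vx, vy = hip_axis[0] - mid[0], hip_axis[1] - mid[1]
    cos_pi = (nx * vx + ny * vy) / math.hypot(vx, vy)
    pi_angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_pi))))

    # LL: angle between the L1 upper endplate and the sacral endplate.
    ll = abs(theta_l1 - theta_s)

    # SVA: horizontal offset of C7 from the most posterior sacral landmark.
    sva = c7_centre[0] - min(p[0] for p in sacral_pts)

    return {"SS": ss, "PT": pt, "PI": pi_angle, "LL": ll, "SVA": sva}
```

A useful sanity check for any such implementation is the geometric identity PI = SS + PT, which should hold up to numerical error when the hip axis lies anterior and caudal to the sacral endplate.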
Statistical analysis
The mean values, root mean square error (RMSE) and standard deviation (STD) were calculated for the parameters. The correct detection rate of the DL algorithm, defined as the percentage of radiographs in which all parameters could be computed fully automatically, was reported. The intra-class correlation coefficient (ICC), the Pearson correlation coefficient and the corresponding p values were calculated for intra- and interobserver as well as intermodal reliability. Statistical significance was defined as p < 0.05. All statistical analyses were conducted with SPSS 27 (IBM Corp., Armonk, NY, USA) and the Python 3 programming language [20].
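For readers who want to reproduce these agreement statistics, a minimal sketch follows. The paper does not specify the ICC model, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is an assumption here, as are the function names; SPSS offers several ICC variants that can give different values.

```python
import math


def rmse(predicted, truth):
    """Root mean square error between paired measurements."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, truth)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per radiograph and one column per rater.

    Assumed model: two-way random effects, absolute agreement, single measure.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Unlike the Pearson coefficient, an absolute-agreement ICC penalises a systematic offset between two measurement methods: two raters who differ by a constant still correlate perfectly but obtain an ICC below 1.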
Results
The preoperative detection rate of the DL algorithm was 91.5%. The postoperative detection rate was 84.8%. The intraobserver ICC (Pearson correlation coefficient) for the SurgiMap-assisted manual preoperative and postoperative measurement was 0.85–0.99 and 0.93–0.99, respectively. The interobserver ICC (Pearson correlation coefficient) for the SurgiMap-assisted manual preoperative measurement was 0.96 for SS, 0.99 for PT, 0.96 for PI, 0.97 for LL and 0.99 for SVA. The interobserver ICC (Pearson correlation coefficient) for the SurgiMap-assisted manual postoperative measurement was 0.99 for SS, 0.99 for PT, 0.99 for PI, 0.99 for LL and 0.99 for SVA (Table 1). The ground truth values are given in Table 1. The ICC between the manual measurements and the DL measurements was 0.71–0.99 for the preoperative and 0.72–0.96 for the postoperative analysis (Table 2). The measurement accuracy was not affected by implants or cement augmentation of screws, as no statistically significant differences of the evaluated parameters could be revealed between these groups in a subgroup analysis (p > 0.05) (Fig. 1).
Discussion
This study is the first to show high accuracy for measurement of fundamental sagittal balance parameters by a single, fully automated DL algorithm.
The main finding of this study is that the new DL algorithm is a reliable tool owing to its high precision. DL evaluation is most challenging in high-degree degeneration and spinal deformity. All patients in this cohort had ASD and were evaluated preoperatively and postoperatively with long construct instrumentation.
The highest measurement accuracy in this cohort was observed for SVA. This is of particular importance, as SVA is a fundamental radiological parameter for evaluating and classifying ASD and global balance. The assessment of sagittal balance in combination with PT, which was also investigated, allows for further consideration of compensatory mechanisms. The highest inaccuracy was observed for the detection of SS, which is consistent with previously published results and is attributable to sacral endplate irregularities and superimposition of implants in this area [21]. However, the clinical importance of SS is inferior to that of PI, for which our results compare favourably with other studies [13, 22].
Previous studies did not show sufficiently high accuracy for relevant spinopelvic parameters with one DL algorithm [12,13,14, 22, 23]. The only study investigating the four most relevant spinopelvic parameters, as investigated in our study, showed a detection accuracy of 0.69 for PI, and the authors concluded that the DL algorithm is not suitable for implementation in clinical routine [13]. The other investigated parameters showed ICCs comparably high to those in our study. Further studies with high accuracy did not investigate postoperative radiographs with implants, or showed high accuracy for spinopelvic parameters but not for SVA [21, 22, 24].
Previously, we investigated automated DL measurements of sagittal balance in short-segment spinal deformities and mono- and bisegmental instrumentations. The accuracy in the present study is comparable to these findings [15].
Until now, only three studies have investigated automated DL-based SVA measurements. The main difficulty in computing SVA is the visibility of C7, as also observed in our study. In many hospitals, clinical routine is still based on conventional total spine radiographs. DL measurements need to cope with varying radiograph quality to be suitable for clinical use. Among other aspects, the most important challenges in this study were radiographs obtained from three different X-ray machines with lower image quality (including stitching artefacts), long construct instrumentation and cement augmentation of screws. Prior studies of DL analysis of sagittal balance with high accuracy excluded up to 28% of radiographs due to poor radiograph quality [13, 22]. This may improve the measurement performance but prevents any statement on how the DL algorithm would perform on real clinical data. All postoperatively examined radiographs in this cohort included long construct instrumentation. High performance of the algorithm in both preoperative and postoperative analysis is important for clinical implementation. The implant density of postoperative measurements and cement augmentation of screws did not affect the measurement accuracy in our study.
The DL algorithm in this cohort is multimodal and involves vertebrae segmentation and separate landmark placement for each segmented vertebra. Six landmarks are placed on vertebral bodies and five landmarks on the sacrum, which is considerably more than in previously published DL approaches. This contributes to a high accuracy and a very robust workflow, given the challenging cohort presented.
The manual measurements were performed with software assistance, which has been shown to be a reliable tool [9, 10, 12]. The intra- and interobserver correlations in this cohort compare equally or favourably with previously published results [22]. As the evaluation of the DL accuracy is based on these ground truth measurements, this is a key point of all validation studies.
A limitation of this study is that we investigated only four spinopelvic parameters in addition to SVA. Previous studies were able to evaluate more parameters [22]. However, the four presented parameters are among the most relevant and challenging to detect and are sufficient for clinical use and decision-making.
Conclusion
The new DL algorithm provided high accuracy for fully automated detection of sagittal balance in ASD. For the first time, the precision and robustness of a DL algorithm allow for implementation in clinical routine. In light of the recent discussion, this study demonstrates an effective synergy of DL and human effort for improved analysis of medical imaging.
References
Meyers AJ, Wick JB, Rodnoi P, Khan A, Klineberg EO (2021) Does L5–S1 anterior lumbar interbody fusion improve sagittal alignment or fusion rates in long segment fusion for adult spinal deformity? Glob Spine J 11:697–703. https://doi.org/10.1177/2192568220921833
Schwab F, Dubey A, Gamez L, El Fegoun AB, Hwang K, Pagala M, Farcy JP (2005) Adult scoliosis: prevalence, SF-36, and nutritional parameters in an elderly volunteer population. Spine (Phila Pa 1976) 30:1082–1085. https://doi.org/10.1097/01.brs.0000160842.43482.cd
Lafage R, Schwab F, Elysee J, Smith JS, Alshabab BS, Passias P, Klineberg E, Kim HJ, Shaffrey C, Burton D, Gupta M, Mundis GM, Ames C, Bess S, Lafage V (2021) Surgical planning for adult spinal deformity: anticipated sagittal alignment corrections according to the surgical level. Glob Spine J 12:1761–1769. https://doi.org/10.1177/2192568220988504
Passias PG, Kummer N, Imbo B, Lafage V, Lafage R, Smith JS, Line B, Vira S, Schoenfeld AJ, Gum JL, Daniels AH (2023) Improvements in outcomes and cost after adult spinal deformity corrective surgery between 2008 and 2019. Spine 48(3):189–195
Le Huec JC, Charosky S, Barrey C, Rigal J, Aunoble S (2011) Sagittal imbalance cascade for simple degenerative spine and consequences: algorithm of decision for appropriate treatment. Eur Spine J 20(Suppl 5):699–703. https://doi.org/10.1007/s00586-011-1938-8
Le Huec JC, Roussouly P (2011) Sagittal spino-pelvic balance is a crucial analysis for normal and degenerative spine. Eur Spine J 20(Suppl 5):556–557. https://doi.org/10.1007/s00586-011-1943-y
Schwab F, Ungar B, Blondel B, Buchowski J, Coe J, Deinlein D, DeWald C, Mehdian H, Shaffrey C, Tribus C, Lafage V (2012) Scoliosis research society-Schwab adult spinal deformity classification: a validation study. Spine (Phila Pa 1976) 37:1077–1082. https://doi.org/10.1097/BRS.0b013e31823e15e
Le Huec JC, Thompson W, Mohsinaly Y, Barrey C, Faundez A (2019) Sagittal balance of the spine. Eur Spine J 28:1889–1905. https://doi.org/10.1007/s00586-019-06083-1
Akbar M, Terran J, Ames CP, Lafage V, Schwab F (2013) Use of Surgimap Spine in sagittal plane analysis, osteotomy planning, and correction calculation. Neurosurg Clin N Am 24:163–172. https://doi.org/10.1016/j.nec.2012.12.007
Lafage R, Ferrero E, Henry JK, Challier V, Diebo B, Liabaud B, Lafage V, Schwab F (2015) Validation of a new computer-assisted tool to measure spino–pelvic parameters. Spine J 15:2493–2502. https://doi.org/10.1016/j.spinee.2015.08.067
Maillot C, Ferrero E, Fort D, Heyberger C, Le Huec JC (2015) Reproducibility and repeatability of a new computerized software for sagittal spinopelvic and scoliosis curvature radiologic measurements: Keops(®). Eur Spine J 24:1574–1581. https://doi.org/10.1007/s00586-015-3817-1
Korez R, Putzier M, Vrtovec T (2020) A deep learning tool for fully automated measurements of sagittal spinopelvic balance from X-ray images: performance evaluation. Eur Spine J 29:2295–2305. https://doi.org/10.1007/s00586-020-06406-7
Yeh Y-C, Weng C-H, Huang Y-J, Fu C-J, Tsai T-T, Yeh C-Y (2021) Deep learning approach for automatic landmark detection and alignment analysis in whole-spine lateral radiographs. Sci Rep 11:7618. https://doi.org/10.1038/s41598-021-87141-x
Galbusera F, Niemeyer F, Wilke HJ, Bassani T, Casaroli G, Anania C, Costa F, Brayda-Bruno M, Sconfienza LM (2019) Fully automated radiological analysis of spinal disorders and deformities: a deep learning approach. Eur Spine J 28:951–960. https://doi.org/10.1007/s00586-019-05944-z
Grover P, Siebenwirth J, Caspari C, Drange S, Dreischarf M, Le Huec JC, Putzier M, Franke J (2022) Can artificial intelligence support or even replace physicians in measuring sagittal balance? A validation study on preoperative and postoperative full spine images of 170 patients. Eur Spine J 31:1943–1951. https://doi.org/10.1007/s00586-022-07309-5
Vila-Casademunt A, Pellisé F, Acaroglu E, Pérez-Grueso FJS, Martín-Buitrago MP, Sanli T, Yakici S, de Frutos AG, Matamalas A, Sánchez-Márquez JM, Obeid I, Yaman O, Bagó J, European Spine Study Group (ESSG) (2015) The reliability of sagittal pelvic parameters: the effect of lumbosacral instrumentation and measurement experience. Spine 40:E253–E258
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision. p 2961–2969
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18. Springer. p 234–241
Kingma D, Ba J (2014) Adam: a method for stochastic optimization. In: International conference on learning representations
van Rossum G (1995) Python reference manual. In: CWI
Orosz LD, Bhatt FR, Jazini E, Dreischarf M, Grover P, Grigorian J, Roy R, Schuler TC, Good CR, Haines CM (2022) Novel artificial intelligence algorithm: an accurate and independent measure of spinopelvic parameters. J Neurosurg Spine 37:893–901. https://doi.org/10.3171/2022.5.Spine22109
Vrtovec T, Ibragimov B (2022) Spinopelvic measurements of sagittal balance with deep learning: systematic review and critical evaluation. Eur Spine J 31:2031–2045. https://doi.org/10.1007/s00586-022-07155-5
Zerouali M, Parpaleix A, Benbakoura M, Rigault C, Champsaur P, Guenoun D (2023) Automatic deep learning-based assessment of spinopelvic coronal and sagittal alignment. Diagn Interv Imaging. https://doi.org/10.1016/j.diii.2023.03.003
Wu Y, Chen X, Dong F, He L, Cheng G, Zheng Y, Ma C, Yao H, Zhou S (2023) Performance evaluation of a deep learning-based cascaded HRNet model for automatic measurement of X-ray imaging parameters of lumbar sagittal curvature. Eur Spine J. https://doi.org/10.1007/s00586-023-07937-5
Funding
Open Access funding enabled and organized by Projekt DEAL. The funding was provided by Federal Ministry of Education and Research (BMBF).
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Löchel, J., Putzier, M., Dreischarf, M. et al. Deep learning algorithm for fully automated measurement of sagittal balance in adult spinal deformity. Eur Spine J (2024). https://doi.org/10.1007/s00586-023-08109-1
DOI: https://doi.org/10.1007/s00586-023-08109-1