Introduction

Musculoskeletal disorders, such as back pain and spinal deformities, have a significant impact on individuals' well-being and quality of life, and on the economy [1]. Spine muscles play a critical role in supporting the spine and transmitting forces within the musculoskeletal system [2]. Abnormalities or dysfunction in spinal muscles are often associated with musculoskeletal disorders. Accurate segmentation of spinal muscles is important for understanding the mechanisms underlying these disorders and for developing appropriate treatment strategies. While changes in muscle structure are typically a result of spine pathology rather than a cause, understanding these changes can provide valuable insights for both patients and physicians.

Medical imaging techniques, such as magnetic resonance imaging (MRI), are commonly used to acquire muscle images and analyze musculoskeletal structures [3]. These techniques provide detailed information about the morphology and composition of the spine muscles [4]. However, accurately segmenting spinal muscles in these images can be difficult because of factors such as differences between datasets and imaging protocols.

Several challenges make the segmentation of spine muscles complex. First, the complexity and variability of muscle structures, such as size, shape, and orientation, make it difficult to design a one-size-fits-all segmentation approach. Second, image artifacts, such as noise and partial volume effects, can degrade image quality and affect segmentation accuracy. Third, patient-specific variations, such as body posture and position, can introduce additional challenges in accurately delineating muscle boundaries. These challenges highlight the need for advanced and robust segmentation techniques.

Accurate segmentation of spine muscles is therefore vital for understanding musculoskeletal disorders and designing effective rehabilitation strategies [5]. Advanced imaging techniques and computational algorithms have contributed to significant advancements in this field. However, challenges related to the complexity of muscle structures, image artifacts, and patient-specific variations still exist. The purpose of this systematic review is to evaluate the state of the art of spinal muscle segmentation using AI methods, to identify the best-performing algorithms and the areas that need improvement, and thereby to support clinical evaluation and treatment planning for musculoskeletal disorders and to inform further research.

Methods

Study eligibility criteria

The inclusion criteria for this study were as follows: (1) research related to spine muscle segmentation, (2) studies written in English. The exclusion criteria were as follows: (1) studies that did not use MRI to measure muscle, (2) studies that did not meet the other criteria. See Figure 1 for more details.

Fig. 1 PRISMA flow chart

Search method to identify appropriate studies

In this study, we conducted a literature search in the PubMed/MEDLINE library following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [6]. We searched for papers published from January 1992 to August 2023 using the following search term: "segmentation spine muscle MRI". This query was used to retrieve the articles relevant to our research.
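For readers who want to repeat the search programmatically, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint with the search term and date window stated above. It is an illustration only: the text does not say whether the search was run through the PubMed web interface or an API, and the function name `build_pubmed_search_url` and the `retmax` value are assumptions introduced here. The URL is only constructed, not requested.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, min_date, max_date, retmax=500):
    """Return an ESearch URL limited to a publication-date window.

    Hypothetical helper: dates use the YYYY/MM form accepted by ESearch.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",   # filter on publication date
        "mindate": min_date,
        "maxdate": max_date,
        "retmax": retmax,     # assumed cap on returned IDs
        "retmode": "json",
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


search_url = build_pubmed_search_url(
    "segmentation spine muscle MRI", "1992/01", "2023/08"
)
print(search_url)
```

Recording the exact query string and date limits in this form makes the identification step of the PRISMA flow easier to reproduce, although hit counts can change as the database is updated.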

Data extraction

To analyze the papers relevant to our study, the following variables were extracted: (i) Author; (ii) Year; (iii) Segmentation method; (iv) Subjects; (v) Data; (vi) Performance (Table 1).

Table 1 Index of spinal muscle segmentation research results from MRI images

Ethical considerations

As this is a systematic review, ethical approval was not required. No confidential patient information was collected or used in this study.

Results

After reviewing the abstracts and screening according to the PRISMA guidelines, we excluded 189 studies that were not relevant to spine muscle segmentation. No studies had to be excluded for being written in a language other than English. A further 127 studies that did not use MRI as the measurement modality were excluded, as were 41 studies that did not evaluate indicators meeting the criteria. In total, 12 studies were included in the review [7,8,9,10,11,12,13,14,15,16,17,18]. The included studies were conducted between 1992 and 2023 and involved healthy volunteers, patients with back pain, and patients with adult spinal deformity (ASD). MRI was performed on devices from several manufacturers, including Siemens, GE, and MEDSPEC. The studies covered automatic segmentation using AI, segmentation using proton density fat fraction (PDFF), and segmentation using regions of interest (ROI). AI-based methods achieved higher segmentation performance than the other methods. The highest Dice similarity coefficient (DSC), 0.91, was reported for David Baur's U-Net (Table 2).
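The screening numbers above can be checked with simple arithmetic. The sketch below tallies the reported exclusions and derives the number of records they imply. The implied total is an assumption made here, not a figure stated in the text: it holds only if the exclusion categories do not overlap and no other exclusions (for example, duplicates) occurred.

```python
# Exclusion counts as reported in the Results text.
exclusions = {
    "not relevant to spine muscle segmentation": 189,
    "not written in English": 0,
    "did not use MRI": 127,
    "did not evaluate indicators meeting the criteria": 41,
}
included = 12

total_excluded = sum(exclusions.values())

# Assumption: categories are mutually exclusive and exhaustive.
implied_screened = total_excluded + included

# The included studies are cited as references 7 through 18.
cited_references = list(range(7, 19))

print(total_excluded, implied_screened, len(cited_references))
```

Under these assumptions the exclusions sum to 357 and the implied number of screened records is 369; the 12 citations match the stated number of included studies.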

Table 2 Index of spinal muscle AI segmentation research results from MRI images

Discussion

This systematic review provided insight into the different methods and outcomes of spinal muscle segmentation. The identified segmentation techniques, including traditional image processing methods, statistical models, machine learning approaches, and deep learning-based algorithms, have shown promise in accurately segmenting spine muscles. Each technique has its advantages and limitations, and the choice of technique depends on the specific requirements of the segmentation task, including accuracy, computational complexity, and adaptability to different types of spine muscle images. Among the segmentation methods covered in this systematic review, segmentation using AI showed the best performance. Within the AI-based studies, we compared how performance differs depending on the model and preprocessing method used (Tables 2 and 3).

Table 3 Hyperparameter of spinal muscle segmentation studies

Advances in deep learning-based algorithms, especially CNN architectures, have significantly improved spinal muscle segmentation. David Baur developed a CNN to segment lumbar spinal muscles in lower back pain patients from consecutive MRI slices and classify fatty muscle degeneration automatically. The study used 100 lumbar spine MRIs with 3650 slices for automatic image segmentation. The U-Net-based network achieved high segmentation accuracy, particularly for overall muscle segmentation, with a Dice similarity coefficient (DSC) of 0.91. These algorithms have demonstrated outstanding performance by learning complex features directly from muscle images without the need for hand-crafted features.
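Because the DSC is the main performance measure compared in this review, a minimal sketch of how it is computed may help. The function below uses the standard definition, DSC = 2|P ∩ T| / (|P| + |T|), on two flattened binary masks. The masks are toy values chosen for illustration, not data from any included study, and plain Python lists are used only to keep the example dependency-free; array libraries would normally be used for real images.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / size_sum


# Toy 8-voxel masks (illustrative only).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 1, 1, 0]

dsc = dice_coefficient(predicted, reference)
print(dsc)
```

In this toy case three voxels overlap and each mask contains four foreground voxels, giving a DSC of 0.75. A value of 1 means perfect overlap and 0 means none.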

Kenneth A. Weber and Madeline Hess both used V-Net to segment muscles on axial T1 images. Kenneth A. Weber's performance (left DSC: 0.862 ± 0.017; right DSC: 0.871 ± 0.016) was lower than Madeline Hess's (DSC: 0.88). This is attributed to differences in the components that make up each V-Net, which Table 4 compares.

We also compared the performance of 3D and 2D CNNs. E. O. Wesselink's study compared 2D convolutional neural networks (CNNs) with 3D CNNs. Whereas 2D CNNs extract features from 2-dimensional images, 3D CNNs extract them from 3-dimensional volumetric data. In this study, data augmentation was applied, and the true positive rate (TPR) for the right-sided muscles, specifically the multifidus, erector spinae, and psoas major, was compared between the two models. As indicated in Fig. 2, the 2D model identified the muscles more accurately relative to the ground truth than the 3D model.

The performance of a segmentation model also varies with the presence and severity of spine pathology [19]. Benjamin Dourthe's study compared DSC values for three regions of interest (ROIs) between healthy individuals and those with adult spinal deformity (ASD): the vertebral body, the psoas major, and the multifidus/erector spinae. The study used five different sets of data to compare how well these anatomical regions are identified in both groups. The ROIs in the lumbar region were identified more accurately in healthy individuals than in those with ASD (Fig. 3).
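The true positive rate used in the 2D versus 3D comparison above is the fraction of ground-truth muscle voxels that a model labels as muscle, TPR = TP / (TP + FN). The sketch below is a minimal illustration with toy masks for two hypothetical models, `model_a` and `model_b`; the values are not taken from the study.

```python
def true_positive_rate(pred, truth):
    """TPR (sensitivity) for two flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    if tp + fn == 0:
        raise ValueError("ground truth contains no foreground voxels")
    return tp / (tp + fn)


# Toy ground truth and predictions (illustrative only).
reference = [0, 1, 1, 0, 0, 1, 1, 0]
model_a = [0, 1, 1, 1, 0, 0, 1, 0]
model_b = [0, 1, 0, 0, 0, 0, 1, 0]
everything = [1] * len(reference)

tpr_a = true_positive_rate(model_a, reference)
tpr_b = true_positive_rate(model_b, reference)
tpr_everything = true_positive_rate(everything, reference)
print(tpr_a, tpr_b, tpr_everything)
```

Note that TPR ignores false positives: the mask that labels every voxel as muscle reaches a TPR of 1.0 here. For that reason TPR is usually read alongside an overlap measure such as the DSC.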

Table 4 Comparison of V-Net configurations used for spine muscle segmentation

Fig. 2 Comparison of True Positive Rate for Right-sided Muscles: 2D vs 3D with Data Augmentation

Fig. 3 Comparison of DSC Values for Healthy and ASD Lumbar Across Multiple Sets

Frank Niemeyer et al. [20] highlight the differences in segmentation performance between individuals with lumbar spine pathology, such as adult spinal deformity (ASD), and those without. Based on the data above and the referenced study, the difference can be attributed primarily to the higher heterogeneity and complexity of lumbar spine pathology in ASD patients. In healthy individuals, the anatomical structures are more consistent and predictable, allowing segmentation algorithms to perform better. In ASD patients, the anatomical structures are more varied because of the deformities and associated pathological changes. This variability makes it challenging for segmentation models to accurately identify regions of interest (ROI), leading to decreased performance.

The performance differences among spinal muscle segmentation algorithms can be attributed to a combination of model architecture, hyperparameters, and dataset characteristics such as dataset size. The architectures used, including U-Net, V-Net, and other CNNs, have structural characteristics that influence their performance. For instance, U-Net was designed for biomedical image segmentation and excels at capturing fine details and contextual information, whereas CNNs in general are more general-purpose and can vary significantly in complexity and depth. The choice of hyperparameters, such as learning rate, optimizer, activation function, batch size, and regularization technique (for example, dropout), significantly affects model performance, and the dataset size and the specific loss function used are equally crucial. To optimize segmentation performance, it is essential to tune these parameters carefully and to consider the specific requirements of the task at hand. Future research could focus on systematically evaluating these factors across different models to establish more standardized guidelines for optimal performance in spinal muscle segmentation.
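One simple way to make such a systematic evaluation explicit is to enumerate every combination of the factors discussed above and to train and score each one under the same protocol. The sketch below only builds the list of configurations. All candidate values are hypothetical placeholders chosen for illustration, not settings reported by the included studies, and `enumerate_configs` is a name introduced here.

```python
from itertools import product

# Hypothetical search space (placeholder values, not from the studies).
search_space = {
    "architecture": ["U-Net", "V-Net"],
    "learning_rate": [1e-3, 1e-4],
    "optimizer": ["Adam", "SGD"],
    "dropout": [0.0, 0.5],
    "batch_size": [4, 8],
}


def enumerate_configs(space):
    """Return one dict per combination of the listed factor values."""
    keys = list(space)
    value_lists = [space[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


configs = enumerate_configs(search_space)
print(len(configs))
print(configs[0])
```

With five factors of two values each, the grid contains 32 configurations. A full grid grows quickly as factors are added, so in practice a random or staged search over the same space may be the more feasible design.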

Limitations

Spine muscle segmentation is crucial because of its pivotal role in the analysis of musculoskeletal disorders and the design of effective rehabilitation strategies. The reviewed studies showcased various segmentation techniques, with deep learning-based algorithms demonstrating superior performance. However, challenges related to accuracy, robustness, and dataset availability persist. This review was restricted to MRI, but CT imaging can also support accurate automatic segmentation of spinal muscles. Examples include a study that used a Bayesian U-Net to investigate the accuracy of muscle segmentation around the spine in torso CT images [21], a method for region-wise 3D segmentation of skeletal muscles, including the paraspinal muscles, at the L3 slice of body CT images using simultaneous learning with a 2D U-Net [22], and a fully automated segmentation of paraspinal muscles in 3D trunk CT images using multi-scale iterative random forest classification [23]. Future studies should consider incorporating both MRI and CT modalities in paravertebral muscle segmentation. CT imaging can be particularly useful for evaluating patient groups in whom MRI is not feasible, such as those with pacemakers. In addition, because the comparison segmentation methods differ across the included studies, it cannot be concluded that AI-based segmentation is the best algorithm among them. In the future, a method for integrating all studies and evaluating them quantitatively will need to be developed. Addressing these challenges will lead to more accurate segmentation techniques and enhance clinical assessment and treatment planning for musculoskeletal disorders.

Conclusion

A variety of spinal muscle segmentation techniques, ranging from traditional methods to deep learning algorithms such as David Baur's U-Net, have shown promise in accurately segmenting spinal muscles. Deep learning, in particular, excels at this task by learning complex features directly from images. Spinal muscle segmentation plays an important role in musculoskeletal disease analysis and rehabilitation planning. Deep learning has shown excellent performance, but issues related to accuracy, robustness, and dataset availability remain. Addressing these challenges will further improve clinical evaluation and treatment strategies for musculoskeletal disorders.