Preliminary exploration of deep learning-assisted recognition of superior labrum anterior and posterior lesions in shoulder MR arthrography

Purpose MR arthrography (MRA) is the most accurate method for preoperatively diagnosing superior labrum anterior–posterior (SLAP) lesions, but diagnostic results can vary considerably due to factors such as experience. In this study, deep learning was used to facilitate the preliminary identification of SLAP lesions and compared with radiologists of different seniority. Methods MRA data from 636 patients were retrospectively collected, and all patients were classified as having/not having SLAP lesions according to shoulder arthroscopy. The SLAP-Net model was built and tested on 514 patients (dataset 1) and independently tested on data from two other MRI devices (122 patients, dataset 2). Manual diagnosis was performed by three radiologists with different seniority levels and compared with SLAP-Net outputs. Model performance was evaluated by the receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), etc. McNemar’s test was used to compare performance among models and between radiologists’ models. The intraclass correlation coefficient (ICC) was used to assess the radiologists’ reliability. p < 0.05 was considered statistically significant. Results SLAP-Net had AUC = 0.98 and accuracy = 0.96 for classification in dataset 1 and AUC = 0.92 and accuracy = 0.85 in dataset 2. In dataset 1, SLAP-Net had diagnostic performance similar to that of senior radiologists (p = 0.055) but higher than that of early- and mid-career radiologists (p = 0.025 and 0.011). In dataset 2, SLAP-Net had similar diagnostic performance to radiologists of all three seniority levels (p = 0.468, 0.289, and 0.495, respectively). Conclusions Deep learning can be used to identify SLAP lesions upon initial MR arthrography examination. SLAP-Net performs comparably to senior radiologists.


Introduction
Superior labrum anterior-posterior (SLAP) lesions were first proposed by Andrews et al. in 1985 and classified by Snyder et al. in 1990 [1,2].SLAP lesions are anterior-toposterior lesions of the superior glenoid labrum and may involve the attachment of the tendon of the long head of the biceps brachii [3].With the development of arthroscopic repair technology, the number of patients with SLAP lesions who undergo arthroscopic repair is also increasing [4,5].Therefore, it is crucial to accurately identify patients with SLAP lesions before surgery, which can reduce the delay caused by diagnostic errors.
The clinical diagnosis of SLAP lesions is challenging [6,7].MRI is widely used in preoperatively evaluating SLAP lesions due to its good soft tissue resolution.However, study results have shown that the accuracy, sensitivity, and specificity of conventional MRI in diagnosing SLAP lesions are not high [8,9].MR arthrography (MRA) is currently considered the most accurate method for the preoperative assessment of SLAP lesions, with a higher average sensitivity (80.4%) and average specificity (90.7%) than conventional MRI [10].Researchers have found that the diagnosis of SLAP lesions is mainly affected by the experience of radiologists [11], that overdiagnosis may occur [9], and that the false positive rate of diagnosis is high due to the existence of anatomical variation [6].
With the development of hardware and the improvement of algorithms, artificial intelligence research, especially in the area of deep learning, has brought about rapid progress in imaging [12].Deep learning aims to simulate the structure and function of the human brain's neural network to learn and extract complex features and patterns from data.The core idea is to use a multilayered neural network to build a cascaded feature extraction and transformation process, where each layer performs some transformations on the input data, gradually mapping the data to a higher-level abstract representation.Deep learning can efficiently and stably assist radiologists in diagnosing and classifying various diseases [13].Recent research results show that deep learning can assist in identifying and classifying various sports injuries, reaching or even exceeding the performance of senior musculoskeletal (MSK) radiologists [14,15].Nevertheless, deep learning research on sports injury diseases is still limited, and many common sports injuries remain to be explored.
This study aims to preliminarily explore the feasibility of deep learning in shoulder MRA for SLAP lesions to assist radiologists in identifying SLAP lesions more accurately, reduce diagnostic errors caused by differences in radiologist experience, and provide more accurate information for clinical practice.

Materials and methods
The institutional review board of our hospital and the medical science ethics committee approved this retrospective study, and the requirement for informed consent was waived (ethical approval number: IRB00006761-M2020458).

Patients
The clinical and imaging data of patients (including adults and adolescents) undergoing shoulder MRA in our hospital were collected retrospectively.The inclusion criterion was MRA examination within ten days before arthroscopic surgery.The exclusion criteria were as follows: (1) poor image quality with many motion artifacts (assessed by M.N., with eight years of experience); (2) previous history of shoulder surgery, tumor, fracture, autoimmune disease, or infection; and (3) Buford complex.From January 2013 to August 2022, 765 patients underwent an MRA examination, and 636 were ultimately included in the study according to the above criteria, as shown in Fig. 1.
All patients underwent shoulder arthroscopic surgery by specialized surgeons in the Sports Medicine Center of our hospital, and the presence or absence of SLAP lesions (class 0: normal; class 1: SLAP lesions) was determined according to the arthroscopy results.The normal patients comprised patients with other types of labral injuries (such as Bankart injuries), rotator cuff tears, and adhesive capsulitis as well as patients with suspected but later disproven SLAP injuries or anatomical variations on conventional MRI.

MR arthrography
Shoulder arthrography was performed by two trained radiologists (W.C., with 20 years of experience, and Y.Z., with 10 years of experience).The contrast medium was made by mixing 0.25 ml gadopentetic acid (Gd-DTPA), 5 ml iodine contrast medium, 5 ml lidocaine (0.02 g/ml), and 5 ml normal saline; the contrast medium was injected by an anterior approach with a 20-G syringe.For injection into the glenohumeral joint space, the routine injection dose ranged from 15 to 18 ml; the patient was supine during injection, with the upper arm slightly externally rotated, and the palm and elbow fossa were facing upward.After injection of 1-2 ml of contrast medium, X-ray imaging was performed to confirm that the contrast medium had successfully entered the joint space, and MRA examination was completed within 30 min after the injection of the contrast medium.
MRA was performed with a Discovery 750 W Silent

Region of interest
Two radiologists (W.C. and M.N.) with different levels of seniority delineated all Discovery 750 W Silent images using Python-based labeling software (https:// github.com/ tzuta lin/ label Img) and drew the region of interest (ROI) in the form of a bounding box.After the initial drawing by M.N., W.C. adjusted the result of his drawing to ensure the accuracy of the ROI.In the axial and OCOR T1-FSE-FS images, the ROI was delineated layer by layer to ensure that only the upper labrum structure was included, minimizing the inclusion of other structures.Examples of ROI delineation are shown in Fig. 2.

Deep learning workflow
The deep learning models were trained using an NVIDIA Tesla 10 × (32 GB video memory, NVIDIA, Santa Clara, California, USA) and an Intel(R) Xeon(R) Gold 5215 CPU (Intel, Santa Clara, California, USA).A flow chart of the study is shown in Fig. 3.

Preprocessing
Image preprocessing can convert images into a form more suitable for machine analysis and processing, eliminate the influence of dimensions between different image features, highlight meaningful information for machine analysis, and suppress irrelevant information to improve the use value of images without changing the image information within.In this study, image preprocessing was performed through normalization: Due to the small number of patients in this study and to improve the model's generalization, data augmentation was performed by randomly changing the image brightness and adding Gaussian noise.Changing the image brightness aims to imitate the adjustment of the image window width/ Normalization = x i − x min x max − x min (x ∶ gray value corresponding to any pixel)

Recognition of SLAP lesions
In this study, we developed SLAP-Net based on deep learning to identify SLAP lesions in MRA images.The network comprises nine layers: two convolutional layers, two batch normalization layers, two maximum pooling layers, and three linear layers.Figure 3

Radiologist evaluations
Three

Statistical analysis
The statistical and deep learning analyses were performed using Python (version 3.6.0;Python Software Foundation, Fredericksburg, VA, USA) software, and data processing was carried out with the PyTorch (version 1.1.0)framework based on dataflow programming.Model performance was assessed by receiver operating characteristic (ROC) curve analysis and evaluated in terms of accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the ROC curve (AUC).The intraclass correlation coefficient (ICC) was used to assess the reliability of the diagnoses between the MSK radiologists.
McNemar's test was used to compare performance between the radiologists and SLAP-Net models.p < 0.05 indicated a statistically significant result.

Results
Among The SLAP-Net established in this study had an AUC of 0.98 (95% CI: 0.945-1.000)for identifying classes 0 and 1 on the test set of dataset 1, with an accuracy of 0.96, and its AUC for identifying classes 0 and 1 on dataset 2 was 0.92 (95% CI: 0.857-1.000),with an accuracy of 0.85.The agreement of three MSK radiologists with different seniority levels for manual diagnosis on the test set of dataset 1 was 0.996 (95% CI: 0.971-1.000),and the agreement of manual diagnosis on dataset 2 was 0.995 (95% CI: 0.964-1.000).The accuracy levels of class 0 and class 1 identification by radiologists 1, 2, and 3 on the test set of dataset 1 were 0.85, 0.83, and 0.81, respectively.SLAP-Net had similar diagnostic performance to radiologist 1 (p = 0.055) and outperformed radiologists 2 and 3 (p = 0.025 and 0.011).On dataset 2, the accuracy levels of these three radiologists in identifying classes 0 and 1 were 0.85, 0.86, and 0.81, respectively.The diagnostic performance of SLAP-Net was similar to that of the three radiologists (p = 0.468, 0.289 and 0.495), as shown in Table 1.Examples of errors made by the model and manual diagnosis are shown in Fig. 4, and the ROC curve of the model is shown in Fig. 5.

Discussion
In this study, we aimed to develop an MRA-based SLAP-Net model and preliminarily explore the feasibility of deep learning for identifying shoulder joint SLAP lesions to reduce the overlooking or over diagnosis of lesions and provide more accurate disease information for clinical practice.The study results show that SLAP-Net has good diagnostic performance in identifying SLAP lesions in images from different devices, reaching the same level of performance as a senior MSK radiologist.
SLAP lesions are an important cause of shoulder pain and instability [16,17].Although various physical examinations can be used to detect SLAP lesions, the diagnostic value of these tests is inconsistent, the specificity is low [18][19][20], and the clinical diagnosis of SLAP lesions is difficult due to the possibility of multiple lesions in the shoulder joint.Therefore, imaging has irreplaceable value in the preoperative identification and evaluation of SLAP lesions.Currently, MRA is considered the preferred method for the preoperative diagnosis of SLAP lesions [21].Nevertheless, due to anatomical variations of the labrum, complex structures around the upper labrum, radiologists' inexperience and fatigue, and other factors, the diagnosis of SLAP lesions varies [8].There are reports that the sensitivity of SLAP lesion diagnosis by MRA is between 0.60 and 0.90, the specificity is between 0.50 and 0.98, and there is large variability between studies of different physicians and institutions [8,[22][23][24].Therefore, developing a deep learning method with high stability and high diagnostic efficiency for identifying SLAP lesions will help inexperienced radiologists diagnose SLAP lesions more effectively and reduce patient treatment delays caused by reporting errors.
The treatment of SLAP lesions is still controversial [25].However, in any case, reasonable intervention should be carried out for SLAP lesions to prevent continuous progression from causing a more substantial impact on patients' lives.Therefore, it is vital to initially identify which patients have SLAP lesions to avoid delays in diagnosis due to diagnostic errors.However, because MRA examinations are rarely carried out in some medical institutions, many radiologists may lack relevant diagnostic experience, resulting in inaccurate diagnoses.The SLAP-Net model developed in this study can assist these radiologists in diagnosis and is expected to improve diagnostic accuracy, sensitivity, and specificity.
In this study, we used the images in dataset 1 to build and test the SLAP-Net model and used images from the other two devices (dataset 2) to conduct independent tests to verify the generality of the model.The results showed that the model achieved excellent diagnostic performance on dataset 1.Even though the model's performance on dataset 2 was reduced, it still reached the level of an experienced MSK radiologist, which demonstrates that the model has good general performance.However, due to the lack of an out-of-group test set, a more accurate assessment of generality requires further research in the future.The performance of the SLAP-Net model developed in this study is similar to that of senior MSK radiologists, implying that the model has both good stability and good diagnostic performance.

Limitations
(1) In this study, we only identified SLAP lesions preliminarily and did not conduct a more detailed classification because only type II SLAP lesions were common in the dataset; the other types were relatively scarce, which may have made for insufficient sample sizes of those.The imbalance between the numbers would lead to poor model performance or even failure to converge.We will conduct further related research after collecting enough data in the future.
(2) Our study did not use conventional MRI data as input because MRA is the most accurate preoperative examination method for diagnosing SLAP lesions.In the future, we will also collect data with conventional MRI and conduct related research.(3) This study lacks an out-of-group test set and only uses different devices for verification.Because no suitable dataset has been found thus far, this work will be done in the future.(4) The training, verification, and testing of SLAP-Net were performed only on data from 3.0-T equipment, and no related research has been conducted on data from 1.5-T equipment.

Conclusion
Our study demonstrates that SLAP-Net has good diagnostic performance in identifying SLAP lesions, and the performance of this model is comparable to that of senior radiologists.Therefore, the SLAP-Net model has the potential to help radiologists and clinicians identify SLAP lesions more accurately before surgery.

Fig. 1
Fig. 1 Flowchart showing the baseline patient characteristics ("Model Training" column) shows the model structure diagram.SLAP-Net uses the Adam optimizer, the loss function is weighted crossentropy loss (the weights of classes 0 and 1 are 1 and 1.6), and the learning rate is dynamically adjusted through the cosine annealing strategy.We used all the images from the Discovery 750 W Silent (referred to as dataset 1), divided into a training set, a verification validation set and a test set at an 8:1:1 ratio, to train and test the SLAP-Net model; we then used all images from the Magnetom Trio and UMR 770

Fig. 2 Fig. 3
Fig. 2 Examples of the manual drawing of ROIs.The figure shows the axial and OCOR T1-FSE-FS images, where panels a and c are the original images, respectively, and panels b and d are the corresponding bounding boxes.All ROIs were delineated to maximize the inclusion of labral structures while minimizing other additional structures

Fig. 4 Fig. 5
Fig. 4 Examples of manually diagnosed and SLAP-Netdiagnosed errors.Panels a and b show a 25-year-old male patient diagnosed with SLAP type 2 by arthroscopic surgery but diagnosed as normal by radiologist 3; the model diagnosis was correct.The red arrows indicate SLAP lesions.Panels c and d show a 21-yearold male patient.The results of arthroscopic surgery showed no SLAP lesions, but SLAP-Net diagnosed SLAP lesions; the manual diagnosis was correct

Table 1
The diagnostic effect of the SLAP-Net model and three radiologists with different seniority levels on datasets 1 and 2 and a comparison between the model and manual diagnosis AUC area under the receiver operating characteristic curve, PPV positive predictive value, NPV negative predictive value, χ 2 the chi-square value calculated by McNemar's test *The results of comparing different radiologists with SLAP-Net on dataset 1 and dataset 2; McNemar's test was used for the statistical analysis **Dataset 1 contains 52 test patients, and dataset 2 is an independent test set that contains 122 patients