Deep Learning–Assisted Identification of Femoroacetabular Impingement (FAI) on Routine Pelvic Radiographs

Journal of Imaging Informatics in Medicine

Abstract

The purpose of this study was to use a novel deep learning system to localize the hip joints and detect findings of cam-type femoroacetabular impingement (FAI) on routine pelvic radiographs. A retrospective search of hip/pelvis radiographs obtained in patients being evaluated for FAI yielded 3050 studies. Each hip was classified separately by the original interpreting radiologist as follows: 724 hips had severe cam-type FAI morphology, 962 moderate, 846 mild, and 518 were normal. The anteroposterior (AP) view from each study was anonymized and extracted. After localization of the hip joints by a novel convolutional neural network (CNN) based on the focal loss principle, a second CNN classified each hip as cam positive or no FAI. Accuracy was 74% for distinguishing normal from abnormal cam-type FAI morphology, with aggregate sensitivity and specificity of 0.821 and 0.669, respectively, at the chosen operating point. The aggregate AUC was 0.736. A deep learning system can be applied to detect FAI-related changes on single-view pelvic radiographs. Deep learning is useful for quickly identifying and categorizing pathology on imaging, which may aid the interpreting radiologist.

Data Availability

The data that support the findings of this study are available from the corresponding author, MH, upon reasonable request.

References

  1. Ghaffari A, Davis I, Storey T, Moser M. Current concepts of femoroacetabular impingement. Radiol Clin North Am. 2018;56(6):965-982.

  2. Amanatullah DF, Antkowiak T, Pillay K, et al. Femoroacetabular impingement: current concepts in diagnosis and treatment. Orthopedics. 2015;38(3):185-199.

  3. Tannast M, Siebenrock KA, Anderson SE. Femoroacetabular impingement: radiographic diagnosis--what the radiologist should know. AJR Am J Roentgenol. 2007;188(6):1540-1552.

  4. Mascarenhas VV, Rego P, Dantas P, et al. Imaging prevalence of femoroacetabular impingement in symptomatic patients, athletes, and asymptomatic individuals: a systematic review. Eur J Radiol. 2016;85(1):73-95.

  5. Clohisy JC, Carlisle JC, Trousdale R, et al. Radiographic evaluation of the hip has limited reliability. Clin Orthop Relat Res. 2009;467(3):666-675.

  6. Egger AC, Frangiamore S, Rosneck J. Femoroacetabular impingement: a review. Sports Med Arthrosc. 2016;24(4):e53-e58.

  7. Sim Y, Chung MJ, Kotter E, et al. Deep convolutional neural network-based software improves radiologist detection of malignant lung nodules on chest radiographs. Radiology. 2020;294(1):199-209.

  8. Park VY, Han K, Seong YK, et al. Diagnosis of thyroid nodules: performance of a deep learning convolutional neural network model vs. radiologists. Sci Rep. 2019;9(1):17843.

  9. Truhn D, Schrading S, Haarburger C, Schneider H, Merhof D, Kuhl C. Radiomic versus convolutional neural networks analysis for classification of contrast-enhancing lesions at multiparametric breast MRI. Radiology. 2019;290(2):290-297.

  10. Mutasa S, Chang PD, Ruzal-Shapiro C, Ayyala R. MABAL: a novel deep-learning architecture for machine-assisted bone age labeling. J Digit Imaging. 2018;31(4):513-519.

  11. Xue Y, Zhang R, Deng Y, Chen K, Jiang T. A preliminary examination of the diagnostic value of deep learning in hip osteoarthritis. PLoS One. 2017;12(6):e0178992.

  12. Irmakci I, Anwar SM, Torigian DA, Bagci U. Deep learning for musculoskeletal image analysis. arXiv [eess.IV]. Published online March 1, 2020. http://arxiv.org/abs/2003.00541

  13. Jones RM, Sharma A, Hotchkiss R, et al. Assessment of a deep-learning system for fracture detection in musculoskeletal radiographs. NPJ Digit Med. 2020;3:144.

  14. Tomita N, Cheung YY, Hassanpour S. Deep neural networks for automatic detection of osteoporotic vertebral fractures on CT scans. Comput Biol Med. 2018;98:8-15.

  15. Urakawa T, Tanaka Y, Goto S, Matsuzawa H, Watanabe K, Endo N. Detecting intertrochanteric hip fractures with orthopedist-level accuracy using a deep convolutional neural network. Skeletal Radiol. 2019;48(2):239-244.

  16. Bien N, Rajpurkar P, Ball RL, et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 2018;15(11):e1002699.

  17. Pedoia V, Norman B, Mehany SN, Bucknor MD, Link TM, Majumdar S. 3D convolutional neural networks for detection and severity staging of meniscus and PFJ cartilage morphological degenerative changes in osteoarthritis and anterior cruciate ligament subjects. J Magn Reson Imaging. 2019;49(2):400-410.

  18. Lang N, Zhang Y, Zhang E, et al. Differentiation of spinal metastases originated from lung and other cancers using radiomics and deep learning based on DCE-MRI. Magn Reson Imaging. 2019;64:4-12.

  19. McBee MP, Awan OA, Colucci AT, et al. Deep learning in radiology. Acad Radiol. 2018;25(11):1472-1480.

  20. Raman SP, Chen Y, Schroeder JL, Huang P, Fishman EK. CT texture analysis of renal masses: pilot study using random forest classification for prediction of pathology. Acad Radiol. 2014;21(12):1587-1596.

  21. Kohli M, Prevedello LM, Filice RW, Geis JR. Implementing machine learning in radiology practice and research. AJR Am J Roentgenol. 2017;208(4):754-760.

  22. Hecht-Nielsen R. Theory of the backpropagation neural network. In: Neural Networks for Perception. Elsevier; 1992:65–93.

  23. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770–778.

  24. Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015:1–9.

  25. Lin T-Y, Goyal P, Girshick R, He K, Dollár P. Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision. 2017:2980–2988.

  26. Ng AY. Feature selection, L1 vs. L2 regularization, and rotational invariance. In: Proceedings of the Twenty-First International Conference on Machine Learning. 2004:78.

  27. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929-1958.

  28. Mascarenhas VV, Caetano A, Dantas P, Rego P. Advances in FAI imaging: a focused review. Curr Rev Musculoskelet Med. 2020;13(5):622-640.

  29. Lloyd RFW, Feeney C. Adolescent hip pain: the needle in the haystack. A case report. Physiotherapy. 2011;97(4):354-356. https://doi.org/10.1016/j.physio.2010.11.009

  30. Zeng G, Zheng G. Deep learning-based automatic segmentation of the proximal femur from MR images. In: Zheng G, Tian W, Zhuang X, eds. Intelligent Orthopaedics: Artificial Intelligence and Smart Image-Guided Technology for Orthopaedics. Springer Singapore; 2018:73–79.

  31. Hodgdon T, Thornhill RE, James ND, Beaulé PE, Speirs AD, Rakhra KS. CT texture analysis of acetabular subchondral bone can discriminate between normal and cam-positive hips. Eur Radiol. 2020;30(8):4695-4704.

  32. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683.

  33. Medina G, Buckless CG, Thomasson E, Oh LS, Torriani M. Deep learning method for segmentation of rotator cuff muscles on MR images. Skeletal Radiol. 2021;50(4):683-692.

  34. James SLJ, Ali K, Malara F, Young D, O’Donnell J, Connell DA. MRI findings of femoroacetabular impingement. AJR Am J Roentgenol. 2006;187(6):1412-1419.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by MH, SM, and RCH. The first draft of the manuscript was written by MH, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Michael K. Hoy.

Ethics declarations

Ethics Approval

This is an observational study approved by the institutional review board, which waived the requirement for informed consent, in compliance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA).

Consent to Participate

The authors affirm that the institutional review board waived the requirement for consent to participate.

Consent for Publication

The authors affirm that the institutional review board waived the requirement for consent to publish.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The localization network was a custom-built 40-layer neural network trained from random initializations. Its design was based on principles borrowed from three well-established architectures: ResNet (Microsoft), Inception (Google), and RetinaNet (Facebook). Each of these architectures contributed distinct elements to the custom network, aimed at optimal functionality and efficiency.

The localization network was trained on full-resolution original anteroposterior (AP) view radiographs of the hip. These inputs were fed to an algorithm responsible for generating anchor boxes. Anchor boxes, a concept adopted from the RetinaNet architecture, are candidate bounding boxes of various sizes and aspect ratios that could encapsulate a target object in an image, in this case the hip joint.

Three distinct anchor-box sizes were selected, ranging from 122 to 128 pixels, along with two aspect ratios ranging from 0.80 to 1.10. Each anchor box was offset by 8 pixels to ensure a comprehensive search space over the entire unscaled input image.
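
As a concrete illustration, the following sketch generates a dense grid of anchor boxes with these parameters. The middle size value, the helper name `generate_anchors`, and the loop structure are assumptions for illustration; the authors' actual implementation is not published.

```python
# Illustrative anchor-box generation (assumed helper, not the authors' code).
import numpy as np

def generate_anchors(image_height, image_width,
                     sizes=(122, 125, 128),       # three sizes between 122 and 128 px (middle value assumed)
                     aspect_ratios=(0.80, 1.10),  # two aspect ratios (height/width)
                     stride=8):                   # boxes offset by 8 px over the unscaled image
    """Return an (N, 4) array of anchor boxes as (x1, y1, x2, y2)."""
    anchors = []
    for cy in range(0, image_height, stride):
        for cx in range(0, image_width, stride):
            for size in sizes:
                for ratio in aspect_ratios:
                    h = size * np.sqrt(ratio)     # keep area near size**2 while varying shape
                    w = size / np.sqrt(ratio)
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.asarray(anchors, dtype=np.float32)

# Even a small crop produces thousands of candidates; a full-resolution
# radiograph yields hundreds of thousands of boxes per image.
print(generate_anchors(256, 256).shape)   # (6144, 4)
```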

Anchors were then classified into positive and negative classes, with positive anchor boxes signifying the presence of a hip joint. The classification was based on the degree of overlap between the anchor box and the ground truth bounding box (the region that truly encapsulates the hip joint in the image) using a metric known as intersection over union (IoU). Anchor boxes with an IoU greater than 0.6 were classified as positive, while those with an IoU less than 0.25 were deemed negative. Anchor boxes with IoU values in the range of 0.25 to 0.6 were excluded from the training set to maintain a clean separation between the positive and negative classes.

Intersection over Union (IoU) is a metric used to measure the accuracy of object detection models. It calculates the amount of overlap between two bounding boxes: a predicted bounding box and a ground truth bounding box. The greater the region of overlap, the greater the IoU.

To calculate the IoU between the predicted and ground truth bounding boxes, the area of intersection between the two boxes is divided by the area of their union, i.e., the total area covered by both boxes. IoU is a number from 0 to 1 that specifies the amount of overlap between the predicted and ground truth bounding boxes.
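
A minimal sketch of the IoU computation and the anchor-labelling rule described above (positive above 0.6, negative below 0.25, otherwise excluded) follows; the function names are illustrative.

```python
# Illustrative IoU and anchor labelling; boxes are (x1, y1, x2, y2).
def iou(box_a, box_b):
    # Area of the intersection rectangle (zero if the boxes do not overlap).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_anchor(anchor, ground_truth, pos_thresh=0.6, neg_thresh=0.25):
    """Return 1 (positive), 0 (negative), or None (excluded from training)."""
    overlap = iou(anchor, ground_truth)
    if overlap > pos_thresh:
        return 1
    if overlap < neg_thresh:
        return 0
    return None
```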

Over the course of the study, more than 30 million anchor boxes were generated in the bounding box dataset, of which just over 30 thousand were identified as positive for the presence of a hip joint. Given this large class imbalance, two strategies were incorporated during training of the localization network. Positive boxes were oversampled to make up 1% of the network inputs during training, partially addressing the imbalanced class distribution. In addition, a modified focal loss function, as outlined in the RetinaNet paper, was used. This function forces higher training gradients on misclassified positive cases, ensuring that the model pays more attention to these instances and thereby improving the performance of the localization network. The localization network head that ultimately classified the boxes was a standard Inception-ResNet hybrid network trained from random initializations.
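
For reference, a sketch of the standard binary focal loss from the RetinaNet paper (Lin et al.) follows. The authors state they modified this loss, but the modification is not described, so the alpha and gamma values shown are the published defaults rather than the values used in this study, and the use of PyTorch is an assumption.

```python
# Standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits, targets: float tensors of shape (N,); targets are 0./1. anchor labels."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = targets * p + (1 - targets) * (1 - p)            # probability assigned to the true class
    alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()     # down-weights easy, well-classified anchors
```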

Inputs to the classifier network were the five bounding boxes from the preceding localization network with the highest object scores, i.e., the regions most likely to contain the hip joint.
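
A minimal sketch of this top-five selection, assuming the localization network returns arrays of boxes and per-box object scores (variable names are assumptions):

```python
import numpy as np

def top_k_boxes(boxes, object_scores, k=5):
    """boxes: (N, 4) array; object_scores: (N,) array. Returns the k highest-scoring boxes."""
    order = np.argsort(object_scores)[::-1][:k]   # indices sorted by descending object score
    return boxes[order], object_scores[order]
```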

The structure of this classifier convolutional neural network (CNN) was designed to mirror that of the previously described localization network. However, the classifier network was initialized with the trained weights of the localization network, which essentially means it started off with pre-existing knowledge from the localization process.
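
One common way to warm-start one network from another is to copy the shared backbone weights and leave the new task-specific head randomly initialized. The toy PyTorch sketch below illustrates the idea only; the framework and module names are assumptions, not taken from the paper.

```python
import torch.nn as nn

def make_backbone():
    # Tiny stand-in for the shared convolutional backbone.
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

class LocalizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()
        self.box_head = nn.Linear(8, 4)        # box-regression head

class ClassifierNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()        # mirrors the localization backbone
        self.cls_head = nn.Linear(8, 2)        # cam-positive vs. no-FAI head

loc_net, cls_net = LocalizationNet(), ClassifierNet()
# Copy the shared backbone weights; strict=False leaves the new classification head untouched.
cls_net.load_state_dict(loc_net.state_dict(), strict=False)
```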

To ensure robustness and prevent overfitting, runtime regularization techniques were applied to both the localization and classifier networks. One such technique is L2 regularization, which penalizes large weights in the network model by adding a term proportional to the square of the magnitude of weights to the loss function. This discourages overly complex models and aids in generalization. Another technique used was dropout, a method that temporarily deactivates a random subset of neurons (units) during training to prevent them from adapting too much to the training data, further promoting model generalization.
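
In PyTorch-style code, both regularizers typically amount to one line each; the coefficients below are illustrative, as the paper does not report the values used.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),        # randomly deactivates 50% of units during training
    nn.Linear(128, 2),
)

# weight_decay applies an L2 penalty proportional to the squared weight magnitudes.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
```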

The classification network relied on a weighted cross-entropy loss function for two to four classes, used to compute the error between the network's predictions and the true labels. This loss drove the weight updates during backpropagation and subsequent iterations.
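
A minimal sketch of a two-class weighted cross-entropy loss follows; the class weights are placeholders, since the paper does not report the values used.

```python
import torch
import torch.nn as nn

class_weights = torch.tensor([1.0, 1.5])   # up-weight the under-represented class (assumed values)
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(4, 2)                 # raw network outputs (log-odds) for a batch of four hips
labels = torch.tensor([0, 1, 1, 0])        # 0 = no FAI, 1 = cam-positive
loss = criterion(logits, labels)           # error signal used to update weights via backpropagation
```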

The raw outputs of the classification network were two numerical values, interpreted as un-normalized logarithmic odds (log-odds) for each class. To make these log-odds interpretable, they were normalized using the softmax function, a common method in machine learning that transforms log-odds into probabilities. The final predictions for each patient were then determined by aggregating the softmax scores after adjusting them for the object score.
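
The paper does not specify how the softmax scores were adjusted for the object score; one plausible reading, shown here purely as an assumption, is an object-score-weighted average of the per-box class probabilities.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def patient_prediction(box_logits, object_scores):
    """box_logits: (5, 2) log-odds for the top boxes; object_scores: (5,) localization scores."""
    probs = softmax(box_logits)                        # per-box class probabilities
    weights = object_scores / object_scores.sum()      # normalize the object scores
    return (weights[:, None] * probs).sum(axis=0)      # aggregated probability per class
```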

The classifier network’s hyperparameters were tuned based on its performance on a validation set comprising 20% of the total sample. To achieve an unbiased estimate of the model’s performance, the dataset was divided into training (70%), validation (20%), and testing (10%) subsets. The testing set was sequestered, or kept separate, during training and validation, ensuring an unbiased assessment of the model’s final performance.
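
A sketch of such a 70/20/10 split using scikit-learn; the stratification and random seed are illustrative choices, and the labels are placeholders rather than study data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

indices = np.arange(3050)                        # one index per study
labels = np.random.randint(0, 2, size=3050)      # placeholder labels, used only for stratification

# First carve off 30%, then split that 30% into validation (20% of total) and test (10% of total).
train_idx, rest_idx, y_train, y_rest = train_test_split(
    indices, labels, test_size=0.30, stratify=labels, random_state=42)
val_idx, test_idx = train_test_split(
    rest_idx, test_size=1/3, stratify=y_rest, random_state=42)
```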

Once the optimal hyperparameters were identified using the validation set, the network was evaluated in a single run on the sequestered test set, and its performance was recorded. The primary performance metric was the aggregated area under the receiver operating characteristic curve (AUC). AUC measures the quality of a classifier and indicates how well the model distinguishes between the classes across threshold settings; an AUC closer to 1 is desirable, as it signifies a higher true-positive rate and a lower false-positive rate.
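
For completeness, the AUC for a binary (normal vs. cam-positive) task can be computed with scikit-learn as shown below; the label and score arrays are toy placeholders, not study data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0])                  # ground truth (1 = cam-positive)
y_score = np.array([0.2, 0.4, 0.7, 0.9, 0.6, 0.3])     # predicted probability of cam-positive
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")   # 1.0 = perfect separation, 0.5 = chance
```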

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hoy, M.K., Desai, V., Mutasa, S. et al. Deep Learning–Assisted Identification of Femoroacetabular Impingement (FAI) on Routine Pelvic Radiographs. J Imaging Inform Med 37, 339–346 (2024). https://doi.org/10.1007/s10278-023-00920-y
