You only Learn Once: Universal Anatomical Landmark Detection

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (MICCAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12905)


Detecting anatomical landmarks in medical images plays an essential role in understanding the anatomy and planning automated processing. In recent years, a variety of deep neural network methods have been developed to detect landmarks automatically. However, all of those methods are unary in the sense that a highly specialized network is trained for a single task, say, one associated with a particular anatomical region. In this work, for the first time, we investigate the idea of “You Only Learn Once (YOLO)” and develop a universal anatomical landmark detection model that handles multiple landmark detection tasks with end-to-end training on mixed datasets. The model consists of a local network and a global network. The local network builds on the idea of the universal U-Net to learn multi-domain local features. The global network is a sequence of dilated convolutions, duplicated in parallel, that extracts global features to further disambiguate the landmark locations. Notably, the new model design requires far fewer parameters to train than models with standard convolutions. We evaluate our YOLO model on three X-ray datasets totaling 1,588 images of the head, hand, and chest, which together contribute 62 landmarks. The experimental results show that our proposed universal model performs considerably better than any previous model trained on multiple datasets. It even outperforms models trained separately on each individual dataset. Our code is available at
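Two claims in the abstract can be checked with simple arithmetic: that dilated convolutions reach global context, and that a design can use far fewer parameters than standard convolutions. The sketch below is illustrative only. The dilation rates, channel counts, and the depthwise-separable layout are assumptions chosen for the example; the abstract does not state the model's actual configuration, and all function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Per-axis receptive field of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias ignored).

    Assumption: this layout stands in for a parameter-saving design;
    the abstract does not name the exact layer type.
    """
    return c_in * k * k + c_in * c_out


# Four 3 x 3 layers: plain vs. exponentially growing dilation (assumed rates).
plain_rf = receptive_field(3, [1, 1, 1, 1])
dilated_rf = receptive_field(3, [1, 2, 4, 8])

# Parameter comparison for a 64 -> 64 channel layer (assumed sizes).
standard = conv_params(64, 64, 3)
separable = separable_conv_params(64, 64, 3)

print(f"receptive field: plain={plain_rf}, dilated={dilated_rf}")
print(f"parameters: standard={standard}, separable={separable}")
```

With the same number of layers and weights per layer, the dilated stack covers a much wider region of the image, which is why dilation is a common choice for capturing global context at low cost.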



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, H., Yao, Q., Xiao, L., Zhou, S.K. (2021). You only Learn Once: Universal Anatomical Landmark Detection. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12905. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87239-7

  • Online ISBN: 978-3-030-87240-3

  • eBook Packages: Computer Science, Computer Science (R0)
