Supervised Action Classifier: Approaching Landmark Detection as Image Partitioning

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10435)


In medical imaging, landmarks have significant clinical and scientific importance. Clinical measurements derived from landmarks are used in many cases for diagnosis, therapy planning and interventional guidance. Automatic algorithms have been studied to reduce the need for manual landmark placement. Traditional machine learning techniques provide reasonable results; however, they are limited in either robustness or precision given the complexity and variability of medical images. Recently, deep learning techniques have emerged to tackle these problems. Among them, a deep reinforcement learning (DRL) approach has been shown to detect landmark locations successfully by implicitly learning the optimized path from a starting location; however, its learning process can only include subsets of the almost infinite paths across the image context, and it may lead to major failures if not trained with adequate dataset variations. Here, we propose a new landmark detection approach inspired by DRL. Instead of learning limited action paths in an image in a greedy manner, we construct a global action map across the whole image, which divides the image into four action regions (left, right, up and down) depending on the location relative to the target landmark. The action map indicates how to move to reach the target landmark from any location in the input image. This effectively translates the landmark detection problem into an image partition problem, which enables us to leverage a deep image-to-image network to train a supervised action classifier for landmark detection. We discuss experimental results on two ultrasound datasets (cardiac and obstetric) obtained by applying the proposed algorithm. The proposed approach shows consistent improvement over traditional machine learning based and deep learning based methods.
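
As a rough illustration of the action map described in the abstract, the following Python sketch builds per-pixel action labels for a single known landmark, of the kind an image-to-image network could be trained to predict. The abstract does not state how the four regions are delimited, so the rule used here (move along the axis with the larger absolute offset, with ties resolved horizontally) is an assumption, as are the names `build_action_map` and `step` and the integer label values.

```python
import numpy as np

# Hypothetical integer labels for the four actions (not taken from the paper).
LEFT, RIGHT, UP, DOWN = 0, 1, 2, 3

# Pixel offset (row, col) produced by each action; rows grow downward.
MOVES = {LEFT: (0, -1), RIGHT: (0, 1), UP: (-1, 0), DOWN: (1, 0)}


def build_action_map(height, width, landmark_row, landmark_col):
    """Label every pixel with the action that moves it toward the landmark.

    Assumed rule: move along the axis with the larger absolute offset;
    ties (including the landmark pixel itself) resolve horizontally.
    """
    rows, cols = np.mgrid[0:height, 0:width]
    d_row = landmark_row - rows  # positive: landmark lies below the pixel
    d_col = landmark_col - cols  # positive: landmark lies to the right
    horizontal = np.abs(d_col) >= np.abs(d_row)
    return np.where(
        horizontal,
        np.where(d_col >= 0, RIGHT, LEFT),
        np.where(d_row >= 0, DOWN, UP),
    )


def step(action_map, row, col):
    """Apply the action stored at (row, col) and return the new position."""
    d_row, d_col = MOVES[int(action_map[row, col])]
    return row + d_row, col + d_col


if __name__ == "__main__":
    action_map = build_action_map(5, 7, landmark_row=2, landmark_col=3)
    position = (0, 0)
    # Under the assumed rule, each step reduces the city-block distance by one.
    for _ in range(abs(2 - 0) + abs(3 - 0)):
        position = step(action_map, *position)
    print(position)
```

Under this rule, following the map from any pixel reaches the landmark in a number of steps equal to the city-block distance. How a landmark location is recovered from a predicted, possibly noisy action map is not covered by the abstract and is left out of the sketch.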


Landmark detection, Deep learning, Image partition, Machine learning, Ultrasound



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Medical Imaging Technologies, Siemens Healthineers Technology Center, Princeton, USA
  2. Department of Computer Science, University of Southern California, California, USA
  3. Department of Computer Science, Rutgers University, Piscataway, USA
