Bodypart Recognition Using Multi-stage Deep Learning

  • Zhennan Yan
  • Yiqiang Zhan
  • Zhigang Peng
  • Shu Liao
  • Yoshihisa Shinagawa
  • Dimitris N. Metaxas
  • Xiang Sean Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9123)


Automatic medical image analysis systems often start by identifying the human body part contained in the image. Specifically, given a transversal slice, it is important to know which body part it comes from, a task we call “slice-based bodypart recognition”. This problem has a unique characteristic: the body part of a slice is usually identified by local discriminative regions rather than by global image context; e.g., a cardiac slice is differentiated from an aortic arch slice by the mediastinum region. To leverage this characteristic, we design a multi-stage deep learning framework that aims to (1) discover the local regions that are discriminative for bodypart recognition, and (2) learn a bodypart identifier based on these local regions. These two tasks are achieved by the two stages of our learning scheme, respectively. In the pre-train stage, a convolutional neural network (CNN) is learned in a multi-instance learning fashion to extract the most discriminative local patches from the training slices. In the boosting stage, the learned CNN is further boosted by these local patches for bodypart recognition. By exploiting the discriminative local appearances, the learned CNN becomes more accurate than approaches based on global image context. As a key hallmark, our method does not require manual annotations of the discriminative local patches. Instead, it automatically discovers them through multi-instance deep learning. We validate our method on a synthetic dataset and a large-scale CT dataset (7000+ slices from whole-body CT scans). Our method achieves better performance than state-of-the-art approaches, including the standard CNN.
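The patch-selection idea behind the pre-train stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some classifier has already produced per-patch class probabilities for one slice, and it uses max pooling over patches, one common way to realise multi-instance learning. The function names, the array `patch_probs`, and its values are hypothetical.

```python
import numpy as np

def most_discriminative_patch(patch_probs, true_label):
    """Index of the patch with the highest probability for the slice's known label.

    patch_probs: array of shape (num_patches, num_classes), assumed to come
    from a patch-level classifier (hypothetical input).
    """
    return int(np.argmax(patch_probs[:, true_label]))

def slice_prediction(patch_probs):
    """Max-pool over patches: return (class, patch) of the strongest response."""
    patch_idx, class_idx = np.unravel_index(
        np.argmax(patch_probs), patch_probs.shape
    )
    return int(class_idx), int(patch_idx)

# Hypothetical scores for 4 patches of one slice and 3 body-part classes.
patch_probs = np.array([
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.34, 0.33, 0.33],
    [0.20, 0.30, 0.50],
])

# Training time: the slice label is known, so pick the patch that supports it best.
best_patch = most_discriminative_patch(patch_probs, true_label=1)

# Test time: no label, so the slice takes the class of its most confident patch.
pred_class, pred_patch = slice_prediction(patch_probs)
```

In this sketch only the slice-level label is needed to choose a patch, which mirrors the abstract's point that no manual annotation of discriminative patches is required.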


Image Classification, Deep Learning, Convolutional Neural Network, Local Patch, Body Section
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Zhennan Yan (1, 2)
  • Yiqiang Zhan (1)
  • Zhigang Peng (1)
  • Shu Liao (1)
  • Yoshihisa Shinagawa (1)
  • Dimitris N. Metaxas (2)
  • Xiang Sean Zhou (1)

  1. Siemens Healthcare, Malvern, USA
  2. CBIM, Rutgers University, Piscataway, USA
