Regularized Landmark Detection with CAEs for Human Pose Estimation in the Operating Room

  • Lasse Hansen
  • Jasper Diesel
  • Mattias P. Heinrich
Conference paper
Part of the Informatik aktuell book series (INFORMAT)


Robust estimation of the human pose is a critical requirement for the development of context-aware assistance and monitoring systems in clinical settings. Environments such as operating rooms or intensive care units pose distinct visual challenges for human pose estimation, including frequent occlusions, clutter and difficult lighting conditions. Moreover, privacy concerns play a major role in health care applications and make it necessary to use unidentifiable data, e.g. blurred RGB images or depth frames. Because the available data basis is therefore much smaller than for human pose estimation in common scenarios, pose priors can serve as a beneficial regularization for training robust estimation models. In this work, we investigate to what extent existing pose estimation methods are suitable for the challenges of clinical environments and propose a CAE-based regularization method to correct estimated poses that are anatomically implausible. We show that our models, trained solely on depth images, reach results on the MVOR dataset [1] similar to those of RGB-based pose estimators while being intrinsically non-identifiable. In further experiments we demonstrate that our CAE regularization can cope with several pose perturbations, e.g. missing parts or left-right flips of joints.
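The pose perturbations mentioned above (missing parts, left-right flips of joints) are the kind of corruption a denoising-style CAE regularizer must learn to undo. The following is a minimal sketch of how such (noisy, clean) training pairs could be generated; the joint layout, left-right pairing and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical joint layout: (x, y) coordinates for a small upper-body
# skeleton. The index pairs below are assumed left/right counterparts
# (e.g. shoulders, elbows, wrists), not the paper's exact format.
LEFT_RIGHT_PAIRS = [(1, 2), (3, 4), (5, 6)]

def perturb_pose(pose, rng, p_drop=0.2, p_flip=0.2):
    """Corrupt a clean pose to form a (noisy, clean) training pair
    for a denoising autoencoder: swap left/right counterparts with
    probability p_flip and zero out joints with probability p_drop."""
    noisy = pose.copy()
    for left, right in LEFT_RIGHT_PAIRS:
        if rng.random() < p_flip:
            noisy[[left, right]] = noisy[[right, left]]  # left-right flip
    for j in range(len(noisy)):
        if rng.random() < p_drop:
            noisy[j] = 0.0  # simulate a missing (undetected) joint
    return noisy

rng = np.random.default_rng(0)
pose = rng.random((7, 2))        # 7 joints, 2D image coordinates
noisy = perturb_pose(pose, rng)  # corrupted input for the CAE
```

A CAE trained to map `noisy` back to `pose` then acts as an anatomical prior: at test time, implausible estimator outputs are projected toward the learned manifold of valid poses.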



  1. Srivastav V, Issenhuth T, Kadkhodamohammadi A, et al. MVOR: a multiview RGB-D operating room dataset for 2D and 3D human pose estimation. arXiv:1808.08180. 2018.
  2. Andriluka M, Pishchulin L, Gehler P, et al. 2D human pose estimation: new benchmark and state of the art analysis. Proc CVPR. 2014; p. 3686-3693.
  3. Felzenszwalb P, McAllester D, Ramanan D. A discriminatively trained, multiscale, deformable part model. Proc CVPR. 2008; p. 1-8.
  4. Pishchulin L, Andriluka M, Gehler P, et al. Strong appearance and expressive spatial models for human pose estimation. Proc ICCV. 2013; p. 3487-3494.
  5. Toshev A, Szegedy C. DeepPose: human pose estimation via deep neural networks. Proc CVPR. 2014; p. 1653-1660.
  6. Wei SE, Ramakrishna V, Kanade T, et al. Convolutional pose machines. Proc CVPR. 2016; p. 4724-4732.
  7. Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation. Proc ECCV. 2016; p. 483-499.
  8. Masci J, Meier U, Cireşan D, et al. Stacked convolutional auto-encoders for hierarchical feature extraction. Int Conf Artif Neural Netw. 2011; p. 52-59.
  9. Oktay O, Ferrante E, Kamnitsas K, et al. Anatomically constrained neural networks (ACNNs): application to cardiac image enhancement and segmentation. IEEE Trans Med Imaging. 2018;37(2):384-395.
  10. Tekin B, Katircioglu I, Salzmann M, et al. Structured prediction of 3D human pose with deep neural networks. arXiv:1605.05180. 2016.
  11. Cao Z, Simon T, Wei SE, et al. Realtime multi-person 2D pose estimation using part affinity fields. Proc CVPR. 2017; p. 7291-7299.

Copyright information

© Springer Fachmedien Wiesbaden GmbH, ein Teil von Springer Nature 2019

Authors and Affiliations

  • Lasse Hansen (1)
  • Jasper Diesel (2)
  • Mattias P. Heinrich (1)
  1. Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
  2. Drägerwerk AG & Co. KGaA, Lübeck, Germany