Multi-Domain Pose Network for Multi-Person Pose Estimation and Tracking

  • Hengkai Guo
  • Tang Tang
  • Guozhong Luo
  • Riwei Chen
  • Yongchen Lu
  • Linfu Wen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11130)

Abstract

Multi-person human pose estimation and tracking in the wild is important and challenging. Large-scale training data are crucial for training a powerful model. While several datasets for human pose estimation exist, the best practice for training on multiple datasets has not been investigated. In this paper, we present a simple network called Multi-Domain Pose Network (MDPN) to address this problem. By treating the task as multi-domain learning, our method learns a better representation for pose prediction. Together with fine-tuning of the prediction heads and multi-branch combination, it shows significant improvement over baselines and achieves the best performance on the PoseTrack ECCV 2018 Challenge without using additional datasets other than MPII and COCO.
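As a rough illustration of the multi-domain idea described in the abstract, the following minimal sketch shares one feature extractor across datasets and gives each dataset its own prediction head. Everything in it is an assumption made for illustration and is not taken from the paper: the names `shared_backbone`, `make_head`, `predict`, and `combine_branches` are hypothetical, the "backbone" and "heads" are toy arithmetic functions rather than neural networks, and averaging is only one possible reading of "multi-branch combination", which the abstract does not define.

```python
from typing import Callable, Dict, List

Features = List[float]


def shared_backbone(image: List[float]) -> Features:
    """Toy stand-in for a feature extractor shared by all domains (hypothetical)."""
    return [0.5 * pixel for pixel in image]


def make_head(offset: float) -> Callable[[Features], Features]:
    """Build a toy domain-specific head; a real head would be a learned layer."""
    def head(features: Features) -> Features:
        return [value + offset for value in features]
    return head


# One head per training dataset (domain). The offsets are arbitrary toy values.
HEADS: Dict[str, Callable[[Features], Features]] = {
    "mpii": make_head(0.0),
    "coco": make_head(0.25),
    "posetrack": make_head(0.5),
}


def predict(image: List[float], domain: str) -> Features:
    """Run the shared backbone, then only the head that belongs to `domain`."""
    return HEADS[domain](shared_backbone(image))


def combine_branches(image: List[float], domains: List[str]) -> Features:
    """Average the outputs of several heads (assumed combination rule)."""
    features = shared_backbone(image)
    outputs = [HEADS[name](features) for name in domains]
    return [sum(values) / len(outputs) for values in zip(*outputs)]


if __name__ == "__main__":
    image = [2.0, 4.0]
    print(predict(image, "coco"))
    print(combine_branches(image, ["mpii", "coco", "posetrack"]))
```

The point of the sketch is the routing: every sample passes through the same backbone, while the head is chosen by the dataset the sample came from, so the shared features see all of the data but each head only needs to fit its own annotation convention.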

Keywords

  • Human pose estimation
  • Multi-domain learning

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Hengkai Guo (1)
  • Tang Tang (1)
  • Guozhong Luo (1)
  • Riwei Chen (1)
  • Yongchen Lu (1)
  • Linfu Wen (1)

  1. ByteDance AI Lab, Beijing, China
