Smoothness-based consistency learning for macaque pose estimation

Xue, Ping; Deng, ShiXiong

doi:10.1007/s11760-023-02665-1

Original Paper
Published: 01 July 2023

Signal, Image and Video Processing, volume 17, pages 4327–4335 (2023)

Ping Xue¹ & ShiXiong Deng²

Abstract

Macaques are a rare substitute for humans and play an important role in the study of human psychology and spiritual science. Accurate estimation of macaque pose is key to these studies, but it remains hindered by the scarcity of labeled images. To address this problem, this work introduces a novel semi-supervised approach called smoothness-based spatio-temporal consistency learning (SSTCL) and a dual network structure (DNS) to leverage large amounts of unlabeled real images. Specifically, SSTCL uses the smoothness assumption to help the model generalize from the labeled training images to the unlabeled images, and its spatio-temporal consistency component exploits both spatial and temporal consistency to pick the most reliable pseudo-labels. Moreover, the DNS gives the model the ability to self-correct, which can prevent the degeneration caused by noisy pseudo-labels in semi-supervised learning. Ablation experiments demonstrate the effectiveness of the DNS for pseudo-label quality assurance. We evaluate the proposed method on the public OpenMonkeyPose dataset. The results show that it achieves competitive performance while using fewer labeled images, and its final accuracy surpasses the strong HRNet-w48 baseline by 2.1 AP.
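
The abstract does not spell out how spatial and temporal consistency are combined to filter pseudo-labels, so the following is only an illustrative sketch of the general idea, not the authors' SSTCL. It assumes that each keypoint is predicted twice on the same frame (once on the original image, once on an augmented copy mapped back to the original coordinates) and that predictions for the neighbouring frames are available. The function name select_pseudo_labels, the midpoint-based smoothness check, and both pixel tolerances are hypothetical choices made for this example.

```python
import numpy as np


def select_pseudo_labels(pred, pred_aug, pred_prev, pred_next,
                         spatial_tol=3.0, temporal_tol=5.0):
    """Return a boolean mask marking keypoints treated as reliable pseudo-labels.

    All inputs are (K, 2) arrays of (x, y) pixel coordinates for K keypoints.
    pred      : prediction on the original frame
    pred_aug  : prediction on an augmented copy, mapped back to the original frame
    pred_prev : prediction on the previous frame
    pred_next : prediction on the next frame
    The tolerances are hypothetical values, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    pred_aug = np.asarray(pred_aug, dtype=float)
    pred_prev = np.asarray(pred_prev, dtype=float)
    pred_next = np.asarray(pred_next, dtype=float)

    # Spatial consistency: the two views of the same frame should agree.
    spatial_err = np.linalg.norm(pred - pred_aug, axis=1)

    # Temporal consistency (smooth motion assumption): the current keypoint
    # should lie close to the midpoint of its neighbours in time.
    midpoint = 0.5 * (pred_prev + pred_next)
    temporal_err = np.linalg.norm(pred - midpoint, axis=1)

    # Keep a keypoint only if it passes both checks.
    return (spatial_err <= spatial_tol) & (temporal_err <= temporal_tol)


# Toy data: three keypoints on one frame.
pred = np.array([[10.0, 10.0], [50.0, 40.0], [80.0, 20.0]])
pred_aug = np.array([[11.0, 10.0], [50.0, 41.0], [95.0, 20.0]])
pred_prev = np.array([[9.0, 10.0], [30.0, 40.0], [79.0, 20.0]])
pred_next = np.array([[11.0, 10.0], [32.0, 40.0], [81.0, 20.0]])

mask = select_pseudo_labels(pred, pred_aug, pred_prev, pred_next)
print(mask)  # [ True False False]
```

In this toy case the first keypoint passes both checks, the second is rejected because it jumps far from the position implied by its temporal neighbours, and the third is rejected because the two views of the same frame disagree. Only keypoints where the mask is true would be kept as pseudo-labels for further training under this sketch.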

Data Availability

The public OpenMonkeyStudio dataset used in this article is available at https://github.com/OpenMonkeyStudio/OMS_Data

Funding

This research was funded by the Natural Science Foundation of Heilongjiang Province of China (F201310).

Author information

Authors and Affiliations

The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentation of Heilongjiang Province, Harbin University of Science and Technology, 52 Xue Fu Lu, Harbin, 150080, Heilongjiang Province, China
Ping Xue
Department of College of Measurement and Communication Engineering, Harbin University of Science and Technology, 52 Xue Fu Lu, Harbin, 150080, Heilongjiang Province, China
ShiXiong Deng

Contributions

Ping Xue and ShiXiong Deng prepared the manuscript text. Ping Xue prepared Table 2 and Figs. 3 and 4 from the ablation experiments. ShiXiong Deng collected the dataset, participated in the experiments, and prepared Figs. 1 and 2, Table 1, and Algorithm 1.

Corresponding author

Correspondence to Ping Xue.

Ethics declarations

Competing Interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Ethical Approval

This work did not require ethical approval under the research governance guidelines operating at the time of the research.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xue, P., Deng, S. Smoothness-based consistency learning for macaque pose estimation. SIViP 17, 4327–4335 (2023). https://doi.org/10.1007/s11760-023-02665-1

Received: 04 May 2023
Revised: 28 May 2023
Accepted: 10 June 2023
Published: 01 July 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11760-023-02665-1
