
OpenMonkeyChallenge: Dataset and Benchmark Challenges for Pose Estimation of Non-human Primates

Published in: International Journal of Computer Vision

Abstract

The ability to automatically estimate the pose of non-human primates as they move through the world is important for several subfields of biology and biomedicine. Inspired by the recent success of computer vision models enabled by benchmark challenges (e.g., object detection), we propose a new benchmark challenge called OpenMonkeyChallenge that facilitates collective community efforts, through an annual competition, to build generalizable non-human primate pose estimation models. To host the benchmark challenge, we provide a new public dataset consisting of 111,529 photographs of non-human primates in naturalistic contexts, each annotated with 17 body landmarks, obtained from various sources including the Internet, three National Primate Research Centers, and the Minnesota Zoo. The annotated data are split into training and testing sets so that generalizable models can be developed and compared with standardized evaluation metrics. We demonstrate the effectiveness of our dataset quantitatively by comparing it with existing datasets using seven state-of-the-art pose estimation models.
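
For readers unfamiliar with pose-estimation benchmarks, evaluation typically reduces to a per-landmark distance test against ground-truth annotations. The sketch below shows a generic PCK-style (Percentage of Correct Keypoints) computation over 17 landmarks; the function name, the normalization by bounding-box size, and the 0.2 threshold are illustrative assumptions, not the challenge's official metric definition.

```python
import numpy as np

def pck(pred, gt, bbox_sizes, threshold=0.2):
    """Fraction of predicted landmarks within threshold * bbox_size
    of the ground truth (a generic PCK-style score, assumed here).

    pred, gt   : (N, 17, 2) arrays of (x, y) landmark coordinates
    bbox_sizes : (N,) array, e.g. max(width, height) of each subject's box
    """
    dists = np.linalg.norm(pred - gt, axis=-1)          # (N, 17) distances
    correct = dists < threshold * bbox_sizes[:, None]   # (N, 17) booleans
    return correct.mean()

# Usage with synthetic data: 100 images, 17 landmarks each.
rng = np.random.default_rng(0)
gt = rng.uniform(0, 256, size=(100, 17, 2))
pred = gt + rng.normal(scale=5.0, size=gt.shape)  # small localization noise
sizes = np.full(100, 256.0)
print(f"PCK@0.2: {pck(pred, gt, sizes):.3f}")
```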



Acknowledgements

We thank Lin Huynh, Peeyush Samba, Justin Aronson, and Jen Holmberg for help on image acquisition. We thank the staff at the Minnesota Zoo for copious help, especially Tom Ness, Kathy Schlegel, Jamie Toste, Laurie Trechsel, and Kelli Gabrielson.

Funding

This work is partially supported by NSF IIS 2024581 (HSP, JZ, and BYH), NIH P51 OD011092 (ONPRC), NIH P51 OD011132 (YNPRC), NIH R01-NS120182 (JR), and K99-MH083883 (CJM).

Author information


Corresponding author

Correspondence to Praneet Bala.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Ethical approval

All procedures were performed in compliance with the guidelines of the IACUC of the University of Minnesota.

Informed consent

Informed consent is not relevant because there were no human subjects.

Additional information

Communicated by Matej Kristan.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jan Zimmermann, Benjamin Y. Hayden, Hyun Soo Park are co-last authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yao, Y., Bala, P., Mohan, A. et al. OpenMonkeyChallenge: Dataset and Benchmark Challenges for Pose Estimation of Non-human Primates. Int J Comput Vis 131, 243–258 (2023). https://doi.org/10.1007/s11263-022-01698-2
