Representing human motion with FADE and U-FADE: an efficient frequency-domain approach

Abstract

In this work, we present FADE, a frequency-domain descriptor for encoding human motion. FADE is simple, offers a high compression rate, and has low computational complexity. To reduce space and time complexity, we exploit the biomechanical property that human motion is bounded in frequency. We also present a variant of FADE, called Uncompressed FADE (U-FADE), which performs well in combination with unsupervised algorithms such as spectral clustering, at the price of a reduced compression rate; in general, U-FADE also outperforms FADE on small datasets. FADE and U-FADE can be used in combination with supervised and unsupervised learning approaches to classify and cluster human actions, respectively. We tested our descriptors on well-known public motion databases, such as HDM05, Berkeley MHAD, and MSR, and compared FADE and U-FADE with diverse state-of-the-art approaches.
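The abstract's core idea — a band-limited signal can be summarized by a handful of its lowest frequency coefficients — can be illustrated with a minimal sketch. The exact FADE construction is specified in the full text; everything below (the function name, the per-joint magnitude features, and the cutoff `k`) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def fade_like_descriptor(trajectories, k=10):
    """Toy frequency-domain motion descriptor (not the paper's FADE).

    trajectories: (T, J) array, T time samples of J joint signals.
    k: number of low-frequency bins kept per joint; human motion is
       band-limited, so the low end of the spectrum carries most
       of the information.
    Returns a flat feature vector of length J * k.
    """
    spectrum = np.fft.rfft(trajectories, axis=0)  # per-joint FFT
    low_band = np.abs(spectrum[:k, :])            # keep first k bins only
    return low_band.flatten(order="F")

# Toy example: a 200-sample trajectory of two "joints".
t = np.linspace(0, 2 * np.pi, 200)
motion = np.stack([np.sin(3 * t), np.cos(5 * t)], axis=1)
desc = fade_like_descriptor(motion, k=10)
print(desc.shape)  # (20,)
```

Here 400 raw samples are reduced to 20 coefficients, a 20x compression; the descriptor in the paper additionally exploits the biomechanical frequency bound to justify the truncation.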

Figures 1–12 (thumbnails; see the full text).

Notes

  1. Müller M, Röder T, Clausen M, Eberhardt B, Krüger B, Weber A (2007) Documentation mocap database HDM05.

References

  • Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.

  • Bissacco, A., Chiuso, A., Ma, Y., & Soatto, S. (2001). Recognition of human gaits. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (Vol. 2, pp. II-52–II-57).

  • Cavallo, A., & Falco, P. (2014). Online segmentation and classification of manipulation actions from the observation of kinetostatic data. IEEE Transactions on Human–Machine Systems, 44(2), 256–269.

  • Chen, X., & Koskela, M. (2015). Skeleton-based action recognition with extreme learning machines. Neurocomputing, 149, 387–396.

  • Chen, C., Liu, K., & Kehtarnavaz, N. (2016). Real-time human action recognition based on depth motion maps. Journal of Real-Time Image Processing, 12(1), 155–163. https://doi.org/10.1007/s11554-013-0370-1.

  • Cho, K., & Chen, X. (2014). Classifying and visualizing motion capture sequences using deep neural networks. In International conference on computer vision theory and applications (VISAPP) (Vol. 2, pp. 122–130).

  • De Schutter, J., Di Lello, E., De Schutter, J. F., Matthysen, R., Benoit, T., & De Laet, T. (2011). Recognition of 6-DOF rigid body motion trajectories using a coordinate-free representation. In IEEE international conference on robotics and automation (pp. 2071–2078).

  • Di Benedetto, A., Palmieri, F. A., Cavallo, A., & Falco, P. (2016). A hidden Markov model-based approach to grasping hand gestures classification. In B. Simone, E. Anna, M. F. Carlo, & P. Eros (Eds.), Advances in neural networks (pp. 415–423). Cham: Springer International Publishing.

  • Du, Y., Wang, W., & Wang, L. (2015). Hierarchical recurrent neural network for skeleton based action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1110–1118).

  • Dyn, N., Levin, D., & Rippa, S. (1990). Data dependent triangulations for piecewise linear interpolation. IMA Journal of Numerical Analysis, 10(1), 137–154.

  • Evangelidis, G., Singh, G., & Horaud, R. (2014). Skeletal quads: Human action recognition using joint quadruples. In International conference on pattern recognition.

  • Falco, P., Saveriano, M., Hasany, E. G., Kirk, N. H., & Lee, D. (2017). A human action descriptor based on motion coordination. IEEE Robotics and Automation Letters, 2(2), 811–818.

  • Forestier, N., & Nougier, V. (1998). The effects of muscular fatigue on the coordination of a multijoint movement in human. Neuroscience Letters, 252(3), 187–190.

  • Gowayyed, M. A., Torki, M., Hussein, M. E., & El-Saban, M. (2013). Histogram of oriented displacements (HOD): Describing trajectories of human joints for action recognition. In International joint conference on artificial intelligence.

  • Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.

  • Huang, G. B., Zhu, Q. Y., & Siew, C. K. (2006). Extreme learning machine: Theory and applications. Neurocomputing, 70(1), 489–501.

  • Kulić, D., Ott, C., Lee, D., Ishikawa, J., & Nakamura, Y. (2012). Incremental learning of full body motion primitives and their sequencing through human motion observation. The International Journal of Robotics Research, 31(3), 330–345.

  • Kulić, D., Takano, W., & Nakamura, Y. (2008). Incremental learning, clustering and hierarchy formation of whole body motion patterns using adaptive hidden Markov chains. The International Journal of Robotics Research, 27(7), 761–784.

  • Le Naour, T., Courty, N., & Gibet, S. (2012). Fast motion retrieval with the distance input space. In M. Kallmann, & K. Bekris (Eds.), Motion in Games: Proceedings of the 5th International Conference, MIG 2012, Rennes, France, November 15–17, 2012. Berlin, Heidelberg: Springer.

  • Lee, D., Ott, C., & Nakamura, Y. (2009). Mimetic communication with impedance control for physical human-robot interaction. In IEEE international conference on robotics and automation (Vol. 2009, pp. 1535–1542).

  • Lee, D., Soloperto, R., & Saveriano, M. (2017). Bidirectional invariant representation of rigid body motions and its application to gesture recognition and reproduction. Autonomous Robots, 42, 1–21.

  • Leightley, D., Li, B., McPhee, J. S., Yap, M. H., & Darby, J. (2014). Exemplar-based human action recognition with template matching from a stream of motion capture. In C. Aurélio & K. Mohamed (Eds.), Image analysis and recognition (pp. 12–20). Cham: Springer International Publishing.

  • Li, W., Zhang, Z., & Liu, Z. (2010). Action recognition based on a bag of 3d points. In IEEE computer society conference on computer vision and pattern recognition-workshops (pp. 9–14).

  • Liu, J., Shahroudy, A., Xu, D., & Wang, G. (2016). Spatio-temporal LSTM with trust gates for 3d human action recognition. In European conference on computer vision (pp. 816–833). Springer.

  • Lovász, L., & Plummer, M. D. (2009). Matching theory (Vol. 367). Providence: American Mathematical Society.

  • Mahasseni, B., & Todorovic, S. (2016). Regularizing long short term memory with 3d human-skeleton sequences for action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3054–3062).

  • Medina, J. R., Lawitzky, M., Mörtl, A., Lee, D., & Hirche, S. (2011). An experience-driven robotic assistant acquiring human knowledge to improve haptic cooperation. In 2011 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2416–2422). IEEE.

  • Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., & Bajcsy, R. (2013). Berkeley MHAD: A comprehensive multimodal human action database. In IEEE workshop on applications of computer vision (pp. 53–60).

  • Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., & Bajcsy, R. (2014). Sequence of the most informative joints (SMIJ): A new representation for human skeletal action recognition. Journal of Visual Communication and Image Representation, 25(1), 24–38.

  • Pervez, A., Ali, A., Ryu, J. H., & Lee, D. (2017). Novel learning from demonstration approach for repetitive teleoperation tasks. In World haptics conference (WHC) (pp. 60–65).

  • Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.

  • Sakoe, H., & Chiba, S. (1978). Dynamic programming algorithm optimization for spoken word recognition. Transactions on Acoustics, Speech, and Signal Processing, 26, 43–49.

  • Saveriano, M., & Lee, D. (2013). Invariant representation for user independent motion recognition. In IEEE international symposium on robot and human interactive communication (pp. 650–655).

  • Schmidts, A. M., Lee, D., & Peer, A. (2011). Imitation learning of human grasping skills from motion and force data. In 2011 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1002–1007). IEEE.

  • Shah, D., Falco, P., Saveriano, M., & Lee, D. (2016). Encoding human actions with a frequency domain approach. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 5304–5311).

  • Soloperto, R., Saveriano, M., & Lee, D. (2015). A bidirectional invariant representation of motion for gesture recognition and reproduction. In IEEE international conference on robotics and automation (ICRA) (pp. 6146–6152).

  • Vemulapalli, R., Arrate, F., & Chellappa, R. (2014). Human action recognition by representing 3d skeletons as points in a Lie group. In IEEE conference on computer vision and pattern recognition (pp. 588–595).

  • Walker, J. S. (1996). Fast Fourier transforms (Vol. 24). Boca Raton: CRC Press.

  • Wang, C., Wang, Y., & Yuille, A. L. (2016). Mining 3d key-pose-motifs for action recognition. In 2016 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2639–2647).

  • Wang, J., Liu, Z., Wu, Y., & Yuan, J. (2012). Mining actionlet ensemble for action recognition with depth cameras. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1290–1297).

  • Wang, Q., Kurillo, G., Ofli, F., & Bajcsy, R. (2015). Unsupervised temporal segmentation of repetitive human actions based on kinematic modeling and frequency analysis. In 2015 international conference on 3D vision (pp. 562–570). https://doi.org/10.1109/3DV.2015.69.

  • Xia, L., Chen, C. C., & Aggarwal, J. (2012). View invariant human action recognition using histograms of 3d joints. In IEEE computer society conference on computer vision and pattern recognition workshops (pp. 20–27).

  • Xu, W., Liu, X., & Gong, Y. (2003). Document clustering based on non-negative matrix factorization. In ACM SIGIR conference on research and development in information retrieval (pp. 267–273).

  • Zhu, W., Lan, C., Xing, J., Zeng, W., Li, Y., Shen, L., & Xie, X., et al. (2016). Co-occurrence feature learning for skeleton based action recognition using regularized deep LSTM networks. In AAAI (Vol. 2, p. 8).

Acknowledgements

This work has been supported by the Marie Curie Action LEACON, EU project 659265, and by the Technical University of Munich, International Graduate School of Science and Engineering (IGSSE).

Author information

Corresponding author

Correspondence to Pietro Falco.

About this article

Cite this article

Falco, P., Saveriano, M., Shah, D. et al. Representing human motion with FADE and U-FADE: an efficient frequency-domain approach. Auton Robot 43, 179–196 (2019). https://doi.org/10.1007/s10514-018-9722-9
