Classifying Excavator Operations with Fusion Network of Multi-modal Deep Learning Models

  • Jin-Young Kim
  • Sung-Bae Cho
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 950)

Abstract

Prognostics and health management (PHM) aims to offer comprehensive solutions for managing equipment health. Classifying excavator operations plays an important role in estimating the remaining lifetime, one of the tasks in PHM, because the effect on the lifetime depends on the operations the excavator performs. Several researchers have attempted to classify the operations with either sensor or video data, but most approaches suffer from the use of single-modal data only, sensitivity to the surrounding environment, and feature extraction that is exclusive to data from a single domain. In this paper, we propose a fusion network that classifies excavator operations with multi-modal deep learning models. Multiple classifiers are first trained, each on a specific type of data, and their feature extractors are reused at the front of the fusion network. The proposed fusion network thus combines a video-based model and a sensor-based model, both based on deep learning. To evaluate the performance of the proposed method, experiments are conducted with data collected from a real construction site. The proposed method yields an accuracy of 98.48%, higher than that of conventional methods, and the multi-modal deep learning models complement each other in terms of precision, recall, and F1-score.
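The abstract outlines the core architecture: modality-specific models are pretrained, their feature extractors are detached and reused at the front of a joint network, and a fusion head classifies the concatenated features. Below is a minimal sketch of that two-branch fusion idea in PyTorch; the layer shapes, frame size, sensor channel count, and number of operation classes are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

class FusionNetwork(nn.Module):
    """Hypothetical sketch of the two-branch fusion idea: pretrained
    per-modality feature extractors feed a joint classification head.
    All dimensions and layer choices are illustrative assumptions."""

    def __init__(self, video_feat_dim=256, sensor_feat_dim=64, n_classes=4):
        super().__init__()
        # Video branch: a small CNN frame encoder followed by an LSTM
        # over time (stands in for a pretrained video-based model).
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.video_rnn = nn.LSTM(32, video_feat_dim, batch_first=True)
        # Sensor branch: the encoder half of an autoencoder trained on
        # the sensor streams, reused here as a feature extractor.
        self.sensor_encoder = nn.Sequential(
            nn.Linear(16, 128), nn.ReLU(),
            nn.Linear(128, sensor_feat_dim), nn.ReLU(),
        )
        # Fusion head: concatenated features -> operation class scores.
        self.classifier = nn.Sequential(
            nn.Linear(video_feat_dim + sensor_feat_dim, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, frames, sensors):
        # frames: (batch, time, 3, H, W); sensors: (batch, 16)
        b, t = frames.shape[:2]
        f = self.frame_encoder(frames.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.video_rnn(f)  # last hidden state summarizes the clip
        fused = torch.cat([h[-1], self.sensor_encoder(sensors)], dim=1)
        return self.classifier(fused)

model = FusionNetwork()
logits = model(torch.randn(2, 8, 3, 64, 64), torch.randn(2, 16))
print(logits.shape)  # torch.Size([2, 4])
```

In the scheme the abstract describes, the two branch extractors would come from the separately pretrained video- and sensor-based models, with only the fusion head trained on the combined features.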

Keywords

Excavator · Classification · Deep learning · Multi-modal data · Autoencoder · Feature extraction

Acknowledgement

This work was supported by a grant from Doosan Infracore, Inc.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, Yonsei University, Seoul, South Korea