Deep Modular Multimodal Fusion on Multiple Sensors for Volcano Activity Recognition

  • Hiep V. Le
  • Tsuyoshi Murata
  • Masato Iguchi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11053)


With the development of sensor techniques and the growing number of volcanic monitoring systems, more and more volcanic sensor data are being gathered, creating a need to mine these data to study the mechanisms of volcanic eruptions. This paper focuses on Volcano Activity Recognition (VAR), where the input consists of multiple sensor time series obtained from a volcanic monitoring system and the output is the volcano status: explosive or non-explosive. Extracting handcrafted features from these time series is difficult even for experts. To solve this problem, we propose a deep neural network architecture called VolNet, which adapts a Convolutional Neural Network to each time series to extract non-handcrafted feature representations that are powerful for discriminating between classes. Using VolNet as a building block, we propose a simple but effective fusion model called Deep Modular Multimodal Fusion (DMMF), which uses data grouping to guide the design of the fusion architecture. Unlike conventional multimodal fusion, where all features are concatenated at once at the fusion step, DMMF fuses relevant modalities in separate modules in a hierarchical fashion. We conducted extensive experiments demonstrating the effectiveness of VolNet and DMMF on volcanic sensor datasets from Sakurajima volcano, the largest volcanic sensor datasets in Japan. The experiments show that DMMF outperforms the current state-of-the-art fusion model, increasing F-score by up to 1.9% on average.
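The two-stage design described above (a per-sensor feature extractor, then group-wise fusion before global fusion) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the toy convolution, kernel, pooling size, and sensor grouping are all assumptions made for the example.

```python
def conv1d_features(signal, kernel, pool=4):
    """Toy stand-in for one VolNet branch: 1-D convolution over a
    single sensor time series followed by non-overlapping max pooling."""
    k = len(kernel)
    conv = [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]
    return [max(conv[i:i + pool]) for i in range(0, len(conv) - pool + 1, pool)]

def fuse(feature_lists):
    """Concatenate feature vectors -- the basic fusion operation."""
    fused = []
    for f in feature_lists:
        fused.extend(f)
    return fused

def dmmf(sensor_groups, kernel):
    """Deep Modular Multimodal Fusion, schematically: fuse related
    sensors within each module first, then fuse the module outputs,
    instead of concatenating every sensor's features all at once."""
    module_outputs = [fuse([conv1d_features(s, kernel) for s in group])
                      for group in sensor_groups]
    return fuse(module_outputs)

# Hypothetical grouping: two seismic channels in one module,
# one other channel (e.g. infrasound) in another.
seismic = [[float(i % 7) for i in range(64)] for _ in range(2)]
other = [[float(i % 5) for i in range(64)]]
features = dmmf([seismic, other], kernel=[0.2] * 5)
```

In a real system each branch would be a trained CNN and the fused vector would feed a classifier predicting explosive vs. non-explosive status; the point here is only the hierarchical, group-wise order of concatenation.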


Keywords: Multimodal fusion · Volcano Activity Recognition · Time series · Convolutional Neural Network



Acknowledgments

We would like to thank the Osumi Office of River and National Highway, Kyushu Regional Development Bureau, MLIT for providing the volcanic sensor datasets.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
  2. Disaster Prevention Research Institute, Kyoto University, Kyoto, Japan
