Design of a hierarchy modular neural network and its application in multimodal emotion recognition

  • Methodologies and Application
Soft Computing

Abstract

Effectively fusing different modalities is a critical issue in multimodal emotion recognition. Feature-level fusion methods cannot deal with missing or corrupted data, while decision-level fusion methods may lose the correlation information between different modalities. To address these problems, a hierarchy modular neural network (HMNN) is proposed and applied to multimodal emotion recognition. First, an HMNN is constructed to mimic the hierarchical modular architecture observed in the human brain. Each module contains several submodules dealing with features from different modalities. Connections are built between submodules within the same module and between corresponding submodules from different modules. Then, a learning algorithm based on Hebbian learning, which simulates the learning mechanism of the human brain, is used to train the connection weights in HMNN. HMNN assigns a label according to the activity level of each module, using a winner-take-all strategy. Finally, the proposed HMNN is applied to a public dataset for multimodal emotion recognition. Experimental results show that the proposed HMNN improves the recognition results compared with other decision-fusion methods, including support vector machines and neural networks such as back-propagation and radial basis function neural networks. Furthermore, the inter-submodule connections within a module integrate information from different modalities and improve the performance of HMNN. In addition, the experiments suggest that HMNN is effective in dealing with missing or corrupted data.
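The abstract gives the ideas but not the equations, so the following is only a toy sketch of how modules, submodules, a Hebbian-style update and winner-take-all readout could fit together. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the class name ToyHMNN, one module per emotion label, one submodule per modality, non-negative features, the learning rate, and the specific update rule. The connections between corresponding submodules of different modules are left out for brevity.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A sample maps a modality name to its feature vector, or to None if that
# modality is missing or corrupted (hypothetical convention for this sketch).
Sample = Dict[str, Optional[Sequence[float]]]


class ToyHMNN:
    """Toy illustration only: one module per label, one submodule per modality."""

    def __init__(self, labels: Sequence[str], modality_dims: Dict[str, int],
                 lr: float = 0.1) -> None:
        self.labels: List[str] = list(labels)
        self.lr = lr
        # w[label][modality]: weights from input features to that submodule.
        self.w = {c: {m: [0.0] * d for m, d in modality_dims.items()}
                  for c in self.labels}
        # link[label][(m1, m2)]: coupling between two submodules of one module.
        mods = sorted(modality_dims)
        self.pairs: List[Tuple[str, str]] = [
            (a, b) for i, a in enumerate(mods) for b in mods[i + 1:]]
        self.link = {c: {p: 0.0 for p in self.pairs} for c in self.labels}

    def _sub_activity(self, label: str, sample: Sample) -> Dict[str, float]:
        # Activity of each available submodule: dot product of weights and input.
        return {m: sum(wi * xi for wi, xi in zip(self.w[label][m], x))
                for m, x in sample.items() if x is not None}

    def train_step(self, sample: Sample, label: str) -> None:
        # Hebbian-style rule: only the labelled module is treated as active,
        # so its weights grow in proportion to the co-active input features.
        for m, x in sample.items():
            if x is None:
                continue  # a missing modality contributes no update
            w = self.w[label][m]
            for i, xi in enumerate(x):
                w[i] += self.lr * xi
        # Strengthen the coupling between submodules that are active together.
        acts = self._sub_activity(label, sample)
        for a, b in self.pairs:
            if a in acts and b in acts:
                self.link[label][(a, b)] += self.lr * acts[a] * acts[b]

    def module_activity(self, label: str, sample: Sample) -> float:
        acts = self._sub_activity(label, sample)
        total = sum(acts.values())
        # Inter-submodule links add a term that depends on two modalities at once.
        for (a, b), strength in self.link[label].items():
            if a in acts and b in acts:
                total += strength * acts[a] * acts[b]
        return total

    def predict(self, sample: Sample) -> str:
        # Winner-take-all: the module with the highest activity gives the label.
        return max(self.labels, key=lambda c: self.module_activity(c, sample))


net = ToyHMNN(["happy", "sad"], {"eeg": 2, "face": 2})
net.train_step({"eeg": [1.0, 0.0], "face": [1.0, 0.0]}, "happy")
net.train_step({"eeg": [0.0, 1.0], "face": [0.0, 1.0]}, "sad")
print(net.predict({"eeg": [0.0, 1.0], "face": None}))  # face modality missing
```

In this sketch a missing modality is simply skipped, so the remaining submodules still produce a module activity. That is one plausible reading of how a modular design could tolerate missing data, not a description of the paper's mechanism. The modality names "eeg" and "face" are placeholders.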

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61603009); the Beijing Natural Science Foundation (No. 4182007); the Beijing Municipal Education Commission Foundation (No. KM201910005023); the Key Project of National Natural Science Foundation of China (No. 61533002); and the “Rixin Scientist” Foundation of Beijing University of Technology (No. 2017-RX(1)-04).

Author information

Corresponding author

Correspondence to Wenjing Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, W., Chu, M. & Qiao, J. Design of a hierarchy modular neural network and its application in multimodal emotion recognition. Soft Comput 23, 11817–11828 (2019). https://doi.org/10.1007/s00500-018-03735-0

  • DOI: https://doi.org/10.1007/s00500-018-03735-0
