MDAM: Multi-Dimensional Attention Module for Anomalous Sound Detection

Chen, Shengbing; Wang, Junjie; Wang, Jiajun; Xu, Zhiqi

doi:10.1007/978-981-99-8178-6_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1967)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Unsupervised anomalous sound detection (ASD) is a challenging task in which a model must learn to distinguish normal from abnormal sounds in an unsupervised manner. The task becomes harder when there are acoustic differences (domain shift) between the training and testing datasets. To address these issues, this paper proposes a state-of-the-art ASD model based on self-supervised learning. First, we design an effective attention module called the Multi-Dimensional Attention Module (MDAM). Given a shallow feature map of a sound, the module infers attention along three independent dimensions: time, frequency, and channel. It focuses on the frequency bands that contain discriminative information and on the time frames that are semantically relevant, thereby enhancing the representation learning capability of the network. MDAM is a lightweight, versatile module that can be integrated into any CNN-based ASD model. Second, we propose a simple domain generalization method that increases domain diversity by blending the feature representations of data from different domains, thereby mitigating domain shift. Finally, we validate the effectiveness of the proposed methods on DCASE 2022 Task 2 and DCASE 2023 Task 2.
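The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify MDAM's layers, so the parameter-free sigmoid gates, the multiplicative combination of the three gates, the convex-combination blending rule, and all names (`axis_gate`, `multi_dim_attention`, `blend_features`, `lam`) are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def axis_gate(x, axis):
    """Return a gate in (0, 1) that varies only along `axis`.

    Assumption: the descriptor is the mean over the other two axes.
    A real attention module would pass it through learned layers.
    """
    other_axes = tuple(a for a in range(x.ndim) if a != axis)
    descriptor = x.mean(axis=other_axes, keepdims=True)
    # Standardize the descriptor so the sigmoid is not saturated.
    descriptor = (descriptor - descriptor.mean()) / (descriptor.std() + 1e-8)
    return sigmoid(descriptor)


def multi_dim_attention(x):
    """Reweight a (channel, frequency, time) feature map.

    The three gates are computed independently from the same input
    and combined by broadcasting multiplication (an assumption).
    """
    channel_gate = axis_gate(x, axis=0)    # shape (C, 1, 1)
    frequency_gate = axis_gate(x, axis=1)  # shape (1, F, 1)
    time_gate = axis_gate(x, axis=2)       # shape (1, 1, T)
    return x * channel_gate * frequency_gate * time_gate


def blend_features(feat_a, feat_b, lam):
    """Convex combination of features from two domains (assumed rule)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * feat_a + (1.0 - lam) * feat_b


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 16, 32))  # (C, F, T), illustrative sizes
attended = multi_dim_attention(feature_map)

source_feat = rng.standard_normal(128)  # hypothetical source-domain embedding
target_feat = rng.standard_normal(128)  # hypothetical target-domain embedding
mixed = blend_features(source_feat, target_feat, lam=0.7)
```

Because every gate lies strictly between 0 and 1, the sketch can only attenuate activations; it shows how attention along separate axes broadcasts over a feature map, not how the learned module behaves.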

This work was supported by the following grants:

- Universities Natural Science Research Project of Anhui Province (KJ2021ZD0118)

- Program for Scientific Research Innovation Team in Colleges and Universities of Anhui Province (2022AH010095)

- The second key orientation of open bidding for selecting the best candidates in innovation and development project supported by Speech Valley of China: the collaborative research of the multi-spectrum acoustic technology of monitoring and product testing system for key equipment in metallurgical industry (2202-340161-04-04-664544)

Author information

Authors and Affiliations

School of Artificial Intelligence and Big Data, Hefei University, Hefei, 230000, China
Shengbing Chen, Junjie Wang, Jiajun Wang & Zhiqi Xu

Corresponding author

Correspondence to Junjie Wang.

Editor information

Editors and Affiliations

Changsha, China
Biao Luo
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Long Cheng
Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China
Zheng-Guang Wu
School of Automation, Guangdong University of Technology, Guangdong, China
Hongyi Li
UNSW Sydney, Sydney, NSW, Australia
Chaojie Li

Cite this paper

Chen, S., Wang, J., Wang, J., Xu, Z. (2024). MDAM: Multi-Dimensional Attention Module for Anomalous Sound Detection. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1967. Springer, Singapore. https://doi.org/10.1007/978-981-99-8178-6_4

DOI: https://doi.org/10.1007/978-981-99-8178-6_4
Published: 30 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8177-9
Online ISBN: 978-981-99-8178-6
eBook Packages: Computer Science (R0)
