Abstract
Unsupervised anomalous sound detection (ASD) is a challenging task in which a model must learn to distinguish normal from abnormal sounds without access to labeled anomalies. The task becomes harder when the training and test data differ acoustically (domain shift). To address both issues, this paper proposes a state-of-the-art ASD model based on self-supervised learning. First, we design an effective attention module, the Multi-Dimensional Attention Module (MDAM). Given a shallow feature map of a sound, the module infers attention along three independent dimensions: time, frequency, and channel. It focuses on the frequency bands that carry discriminative information and on the time frames that are semantically relevant, which strengthens the representation learning capability of the network. MDAM is a lightweight and versatile module that can be integrated into any CNN-based ASD model. Second, we propose a simple domain generalization method that increases domain diversity by blending the feature representations of data from different domains, thereby mitigating domain shift. Finally, we validate the effectiveness of the proposed methods on DCASE 2022 Task 2 and DCASE 2023 Task 2.
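The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how MDAM computes or combines its attention weights, nor how features are blended. The sketch assumes parameter-free average pooling followed by a sigmoid for each axis, multiplicative combination of the three attention vectors, and a mixup-style convex combination for domain blending. The names `multi_dim_attention`, `blend_domains`, and `lam` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    # Squash pooled descriptors into (0, 1) attention weights.
    return 1.0 / (1.0 + np.exp(-x))

def multi_dim_attention(feat):
    """Reweight a (channel, frequency, time) feature map along each axis.

    Assumption: weights come from parameter-free average pooling; a trained
    module would use learned layers here.
    """
    c_att = sigmoid(feat.mean(axis=(1, 2)))  # one weight per channel
    f_att = sigmoid(feat.mean(axis=(0, 2)))  # one weight per frequency bin
    t_att = sigmoid(feat.mean(axis=(0, 1)))  # one weight per time frame
    # Assumption: the three attention vectors are applied multiplicatively.
    return feat * c_att[:, None, None] * f_att[None, :, None] * t_att[None, None, :]

def blend_domains(src_feat, tgt_feat, lam):
    """Convex combination of features from two domains (mixup-style assumption)."""
    return lam * src_feat + (1.0 - lam) * tgt_feat

rng = np.random.default_rng(0)
src = rng.standard_normal((4, 8, 16))  # hypothetical source-domain feature map
tgt = rng.standard_normal((4, 8, 16))  # hypothetical target-domain feature map
attended = multi_dim_attention(src)
mixed = blend_domains(src, tgt, lam=0.7)
print(attended.shape, mixed.shape)
```

Because every attention weight lies strictly between 0 and 1, the reweighted map never exceeds the input in magnitude; a learned variant would let the network decide which bands and frames to suppress.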
This work was supported by the following grants:
- Universities Natural Science Research Project of Anhui Province (KJ2021ZD0118)
- Program for Scientific Research Innovation Team in Colleges and Universities of Anhui Province (2022AH010095)
- The second key orientation of open bidding for selecting the best candidates in the innovation and development project supported by Speech Valley of China: the collaborative research of the multi-spectrum acoustic technology of monitoring and product testing system for key equipment in the metallurgical industry (2202-340161-04-04-664544)
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, S., Wang, J., Wang, J., Xu, Z. (2024). MDAM: Multi-Dimensional Attention Module for Anomalous Sound Detection. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1967. Springer, Singapore. https://doi.org/10.1007/978-981-99-8178-6_4
DOI: https://doi.org/10.1007/978-981-99-8178-6_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8177-9
Online ISBN: 978-981-99-8178-6
eBook Packages: Computer Science, Computer Science (R0)