MDAM: Multi-Dimensional Attention Module for Anomalous Sound Detection

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1967)

Abstract

Unsupervised anomalous sound detection (ASD) is a challenging task in which a model is trained to distinguish normal from abnormal sounds without supervision. The task becomes harder when there are acoustic differences (domain shift) between the training and testing datasets. To address these issues, this paper proposes a state-of-the-art ASD model based on self-supervised learning. First, we design an effective attention module called the Multi-Dimensional Attention Module (MDAM). Given a shallow feature map of a sound, the module infers attention along three independent dimensions: time, frequency, and channel. It focuses on the frequency bands that contain discriminative information and on the time frames that are semantically relevant, thereby enhancing the representation learning capability of the network. MDAM is a lightweight, versatile module that can be integrated into any CNN-based ASD model. Second, we propose a simple domain generalization method that increases domain diversity by blending the feature representations of data from different domains, thereby mitigating domain shift. Finally, we validate the effectiveness of the proposed methods on DCASE 2022 Task 2 and DCASE 2023 Task 2.
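The two ideas named in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' architecture: the parameter-free sigmoid gating, the order in which the three axes are processed, the (channel, frequency, time) layout, and the convex-combination form of the blending are all assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def axis_attention(feat, axis):
    """Gate a (channel, freq, time) feature map along a single axis.

    Assumption: the descriptor for each index along `axis` is the mean
    over the other two axes, and the gate is a plain sigmoid of that
    descriptor. A real module would normally use learned layers here.
    """
    other_axes = tuple(a for a in range(feat.ndim) if a != axis)
    descriptor = feat.mean(axis=other_axes, keepdims=True)
    return feat * sigmoid(descriptor)  # broadcasts back to feat's shape


def multi_dim_attention(feat):
    """Apply the gate along channel (0), frequency (1), and time (2).

    Assumption: the three gates are applied one after another.
    """
    out = feat
    for axis in (0, 1, 2):
        out = axis_attention(out, axis)
    return out


def blend_features(feat_a, feat_b, lam):
    """Convex combination of two feature maps from different domains.

    Assumption: a mixup-style blend with weight `lam` in [0, 1].
    """
    return lam * feat_a + (1.0 - lam) * feat_b
```

For example, calling `multi_dim_attention` on an array of shape `(4, 8, 16)` returns an array of the same shape in which every element has been scaled by three gates, each in the open interval (0, 1). Passing two such arrays to `blend_features` with `lam=0.5` averages them.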

This work was supported by the following grants:

- Universities Natural Science Research Project of Anhui Province (KJ2021ZD0118)

- Program for Scientific Research Innovation Team in Colleges and Universities of Anhui Province (2022AH010095)

- The second key orientation of open bidding for selecting the best candidates in the innovation and development project supported by Speech Valley of China: collaborative research on multi-spectrum acoustic technology for monitoring and product testing systems for key equipment in the metallurgical industry (2202-340161-04-04-664544).

Author information

Corresponding author

Correspondence to Junjie Wang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, S., Wang, J., Wang, J., Xu, Z. (2024). MDAM: Multi-Dimensional Attention Module for Anomalous Sound Detection. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1967. Springer, Singapore. https://doi.org/10.1007/978-981-99-8178-6_4

  • DOI: https://doi.org/10.1007/978-981-99-8178-6_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8177-9

  • Online ISBN: 978-981-99-8178-6

  • eBook Packages: Computer Science, Computer Science (R0)
