Anomalous sound detection for machine condition monitoring using 3D tensor representation of sound and 3D deep convolutional neural network

Multimedia Tools and Applications

Abstract

This study introduces a three-dimensional tensor representation of machine-generated audio signals that is suitable as input to a three-dimensional convolutional neural network. The method first computes the reconstructed phase space (RPS) of the audio signal and then converts the resulting three-dimensional RPS into a three-dimensional tensor. This representation captures nonlinear dynamic features and exposes hidden system variables, with discriminative information encoded in the shape of the data cloud within each tensor, which can improve the detection of anomalous sound patterns. The tensors are then fed to a three-dimensional deep convolutional neural network for classification. We evaluate the method on three benchmark datasets, MFPT, MIMII, and ToyADMOS, using 5-fold cross-validation, and report Sensitivity, Specificity, Accuracy, and F1 Score so that performance is examined across different machine types and acoustic environments. The method achieved an average accuracy of 97.63% on the MFPT dataset. On the MIMII dataset, the slider machinery achieved the highest average accuracy (92.02%) and the pump machinery the lowest (90.54%). On the ToyADMOS dataset, the average accuracy was approximately 94%. These findings indicate the method's potential for accurately detecting anomalies across various machine types and acoustic environments.
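
The two preprocessing steps named in the abstract, phase-space reconstruction followed by conversion to a three-dimensional tensor, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed delay of 9 samples, the 32-voxel grid, and the count-based voxelization are all assumptions made here for illustration. In practice the delay and embedding dimension are typically chosen with criteria such as mutual information [8] and false nearest neighbors [16].

```python
import numpy as np


def reconstruct_phase_space(signal, delay, dim=3):
    """Time-delay embedding: row n is [x(n), x(n+delay), ..., x(n+(dim-1)*delay)]."""
    signal = np.asarray(signal, dtype=np.float64)
    n_points = signal.size - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this delay and dimension")
    columns = [signal[i * delay : i * delay + n_points] for i in range(dim)]
    return np.stack(columns, axis=1)


def rps_to_tensor(points, grid=32):
    """Bin 3D points into a grid x grid x grid count tensor scaled to [0, 1].

    Hypothetical voxelization: the paper's exact construction is not given
    in this preview, so a simple normalized 3D histogram is assumed.
    """
    lo, hi = points.min(), points.max()
    span = hi - lo if hi > lo else 1.0
    scaled = (points - lo) / span  # map all coordinates into [0, 1]
    counts, _ = np.histogramdd(
        scaled, bins=(grid, grid, grid), range=[(0.0, 1.0)] * 3
    )
    peak = counts.max()
    return (counts / peak if peak > 0 else counts).astype(np.float32)


# Usage on a synthetic frame: a 440 Hz tone sampled at 16 kHz (assumed values).
t = np.arange(2000) / 16000.0
frame = np.sin(2 * np.pi * 440 * t)
points = reconstruct_phase_space(frame, delay=9)  # delay chosen arbitrarily
tensor = rps_to_tensor(points)
print(points.shape, tensor.shape)
```

A tensor produced this way would still need a trailing channel axis before being passed to a 3D convolutional layer in most deep learning frameworks.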

Data availability

The experiments were performed on three publicly available datasets: the MFPT (Machinery Failure Prevention Technology) dataset [6], the malfunctioning industrial machine investigation and inspection (MIMII) dataset [30], and ToyADMOS [18], available at the following links: https://www.mfpt.org/fault-data-sets/, http://www.zenodo.org/record/3384388#.Y4SKr3bP1D8, https://paperswithcode.com/dataset/toyadmos

References

  1. Bogdanov D, Wack N, Gómez E, Gulati S, Herrera P, Mayor O, Roma G, Salamon J, Zapata J, Serra X (2013) ESSENTIA: an Audio Analysis Library for Music Information Retrieval. 14th International Society for Music Information Retrieval Conference, Curitiba

  2. Chollet F (2017) Deep learning with python. Manning Publications

  3. Coupé P, Mansencal B, Clément M, Giraud R, Denis de Senneville B, Ta V-T, Lepetit V, Manjon JV (2020) AssemblyNet: A large ensemble of CNNs for 3D whole brain MRI segmentation. NeuroImage 219:117026. https://doi.org/10.1016/j.neuroimage.2020.117026

  4. Eyben F, Wöllmer M, Schuller B (2010) Opensmile. Proceedings of the 18th ACM International Conference on Multimedia. https://doi.org/10.1145/1873951.1874246

  5. Farahani M, Behnam A, Ahmadian A (2021) Comparison of feature selection methods in diagnosing Alzheimer’s disease. J Med Signals Sensors 11(2):82–90. https://doi.org/10.4103/jmss.JMSS_57_20

  6. Fault data sets (2017) https://www.mfpt.org/fault-data-sets/

  7. Fengqi W, Meng G (2006) Compound rub malfunctions feature extraction based on full-spectrum cascade analysis and SVM. Mech Syst Signal Process 20(8):2007–2021. https://doi.org/10.1016/j.ymssp.2005.10.004

  8. Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A Gen Phys 33(2):1134–1140. https://doi.org/10.1103/physreva.33.1134

  9. Gribbestad M, Hassan MU, Hameed IA, Sundli K (2021) Health Monitoring of Air Compressors Using Reconstruction-Based Deep Learning for Anomaly Detection with Increased Transparency. Entropy 23(1):83. https://www.mdpi.com/1099-4300/23/1/83

  10. Halder S, Bhat S, Dora BK (2022) Inverse thresholding to spectrogram for the detection of broken rotor bar in induction motor. Measurement 198:111400. https://doi.org/10.1016/j.measurement.2022.111400

  11. Hamel P, Eck D (2010) Learning Features from Music Audio with Deep Belief Networks. ISMIR

  12. Harimi A, Fakhr HS, Bakhshi A (2016) Recognition of emotion using reconstructed phase space of speech. Malaysian J Comput Sci 29(4):262–271. https://doi.org/10.22452/mjcs.vol29no4.2

  13. Hong G, Suh D (2021) Supervised-Learning-Based Intelligent Fault Diagnosis for Mechanical Equipment. IEEE Access 9:116147–116162. https://doi.org/10.1109/ACCESS.2021.3104189

  14. Jombo G, Zhang Y (2023) Acoustic-based machine condition monitoring—methods and challenges. Eng 4(1):47–79. https://www.mdpi.com/2673-4117/4/1/4

  15. Justus V, Kanagachidambaresan (2022) Intelligent single-board computer for industry 4.0: Efficient real-time monitoring system for anomaly detection in CNC machines. Microprocess Microsyst 93(104629):104629. https://doi.org/10.1016/j.micpro.2022.104629

  16. Kennel MB, Brown R, Abarbanel HD (1992) Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A 45(6):3403–3411. https://doi.org/10.1103/physreva.45.3403

  17. Khurana U, Samulowitz H, Turaga D (2018) Feature engineering for predictive modeling using reinforcement learning. Proc Conf AAAI Artif Intell 32(1). https://doi.org/10.1609/aaai.v32i1.11678

  18. Koizumi Y, Saito S, Uematsu H, Harada N, Imoto K (2019) ToyADMOS: A dataset of miniature-machine operating sounds for anomalous sound detection. 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). https://doi.org/10.1109/waspaa.2019.8937164

  19. Kovács PP, Schimmel J (2016) Higher-dimensional signal processing for vibrational analysis. In Proceedings of the 43rd International Congress on Noise Control Engineering (pp. 3086–3093)

  20. Krajewski J, Schnieder S, Sommer D, Batliner A, Schuller B (2012) Applying multiple classifiers and non-linear dynamics features for detecting sleepiness from speech. Neurocomputing 84:65–75. https://doi.org/10.1016/j.neucom.2011.12.021

  21. Langone R, Alzate C, De Ketelaere B, Vlasselaer J, Meert W, Suykens JAK (2015) LS-SVM based spectral clustering and regression for predicting maintenance of industrial machines. Eng Appl Artif Intell 37:268–278. https://doi.org/10.1016/j.engappai.2014.09.008

  22. Lartillot O, Toiviainen P (2007) MIR in Matlab (II): A Toolbox for Musical Feature Extraction from Audio. ISMIR

  23. Lathrop D (2015) Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering, by Steven H. Strogatz, Westview Press, 2015, 2nd ed. (528 pp.). ISBN 978-0-813-34910-7. Phys Today 68(4):54–55. https://doi.org/10.1063/pt.3.2751

  24. Lei X, Ji H, Xu Q, Ye T, Zhang S, Huang C (2022) Research on data diagnosis method of acoustic array sensor device based on spectrogram. Glob Energy Interconnect 5(4):418–433. https://doi.org/10.1016/j.gloei.2022.08.008

  25. Liu C, Feng L, Liu G, Wang H, Liu S (2021) Bottom-up broadcast neural network for music genre classification. Multimed Tools Appl 80(5):7313–7331. https://doi.org/10.1007/s11042-020-09643-6

  26. Ma H-G, Han C-Z (2006) Selection of embedding dimension and delay time in phase space reconstruction. Front Electr Electron Eng China 1(1):111–114. https://doi.org/10.1007/s11460-005-0023-7

  27. Meyer A, Chlebus G, Rak M, Schindele D, Schostak M, van Ginneken B, Schenk A, Meine H, Hahn HK, Schreiber A, Hansen C (2021) Anisotropic 3D Multi-Stream CNN for Accurate Prostate Segmentation from Multi-Planar MRI. Comput Methods Programs Biomed 200:105821. https://doi.org/10.1016/j.cmpb.2020.105821

  28. Nair V, Hinton G (2010) Rectified Linear Units Improve Restricted Boltzmann Machines. 27th International Conference on Machine Learning (ICML-10), Haifa

  29. Park Y-J, Fan S-KS, Hsu C-Y (2020) A review on fault detection and process diagnostics in industrial processes. Processes (Basel) 8(9):1123. https://doi.org/10.3390/pr8091123

  30. Purohit H, Tanabe R, Ichige T, Endo T, Nikaido Y, Suefusa K, Kawaguchi Y (2019) MIMII dataset: Sound dataset for malfunctioning industrial machine investigation and inspection. Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019). https://doi.org/10.33682/m76f-d618

  31. Shah A, Mizuno A, Linghai W, Weinstein A, Aizenstein H (2021) Prediction of cognitive function based on structural MRI images using a 3D convolutional neural net (CNN) among cognitively normal older adults. Biol Psychiatry 89(9, Supplement):S372. https://doi.org/10.1016/j.biopsych.2021.02.925

  32. Shahzadi A, Ahmadyfard A, Harimi A, Yaghmaie K (2015) Speech emotion recognition using nonlinear dynamics features. TURK J Electr Eng Comput Sci 23:2056–2073. https://doi.org/10.3906/elk-1302-90

  33. Shahzadi A, Ahmadyfard A, Yaghmaie K, Harimi A (2013) Recognition of emotion in speech using spectral patterns. Malaysian J Comput Sci 26(2):140–158. https://ejournal.um.edu.my/index.php/MJCS/article/view/6767

  34. Shin J, Lee S (2023) Robust and lightweight deep learning model for industrial fault diagnosis in low-quality and noisy data. Electronics 12(2):409. https://www.mdpi.com/2079-9292/12/2/409

  35. Sousa R, Antunes J, Coutinho F, Silva E, Santos J, Ferreira H (2019) Robust cepstral-based features for anomaly detection in ball bearings. Int J Adv Manuf Technol 103(5–8):2377–2390. https://doi.org/10.1007/s00170-019-03597-2

  36. Srinivasu PN, JayaLakshmi G, Jhaveri RH, Praveen SP (2022) Ambient assistive living for monitoring the physical activity of diabetic adults through body area networks. Mob Inf Syst 2022:3169927. https://doi.org/10.1155/2022/3169927

  37. Tagawa Y, Maskeliūnas R, Damaševičius R (2021) Acoustic anomaly detection of mechanical failures in noisy real-life factory environments. Electronics 10(19):2329. https://www.mdpi.com/2079-9292/10/19/2329

  38. Takens F (1981) Detecting strange attractors in turbulence. Dynamical Systems and Turbulence, Warwick 1980, Berlin, Heidelberg

  39. Tama BA, Vania M, Kim I, Lim S (2022) An EfficientNet-Based Weighted Ensemble Model for Industrial Machine Malfunction Detection Using Acoustic Signals. IEEE Access 10:34625–34636. https://doi.org/10.1109/ACCESS.2022.3160179

  40. Wang L, Sun G, Wang Y, Ma J, Zhao X, Liang R (2022) AFExplorer: Visual analysis and interactive selection of audio features. Vis Inform 6(1):47–55. https://doi.org/10.1016/j.visinf.2022.02.003

  41. Wang Y, Chen X, Jiang C (2019) Multidimensional representation learning for audio signal processing. In Proceedings of the 2019 International Joint Conference on Neural Networks (pp 1–7)

  42. Yu H, Wang K, Li Y, He M (2021) Deep subclass reconstruction network for fault diagnosis of rotating machinery under various operating conditions. Appl Soft Comput 112(107755):107755. https://doi.org/10.1016/j.asoc.2021.107755

  43. Yu L, Yao X, Yang J, Li C (2020) Gear fault diagnosis through vibration and acoustic signal combination based on convolutional neural network. Information 11(5)

  44. Zabin M, Choi H-J, Uddin J (2022) Hybrid deep transfer learning architecture for industrial fault diagnosis using Hilbert transform and DCNN–LSTM. J Supercomput. https://doi.org/10.1007/s11227-022-04830-8

  45. Zheng F, Zhang G, Song Z (2001) Comparison of different implementations of MFCC. J Comput Sci Technol 16(6):582–589. https://doi.org/10.1007/bf02943243


Author information

Corresponding author

Correspondence to Azita Azarfar.

Ethics declarations

Ethics approval

The authors did not receive support from any organization for the submitted work.

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khanjari, M., Azarfar, A., Abardeh, M.H. et al. Anomalous sound detection for machine condition monitoring using 3D tensor representation of sound and 3D deep convolutional neural network. Multimed Tools Appl 83, 44101–44119 (2024). https://doi.org/10.1007/s11042-023-17043-9

  • DOI: https://doi.org/10.1007/s11042-023-17043-9
