Deep convolutional neural network based secure wireless voice communication for underground mines

Dey, Prasanjit; Kumar, Chandan; Mitra, Mitrabarun; Mishra, Richa; Chaulya, S. K.; Prasad, G. M.; Mandal, S. K.; Banerjee, G.

doi:10.1007/s12652-020-02700-w

Original Research
Published: 02 January 2021

Volume 12, pages 9591–9610, (2021)

Journal of Ambient Intelligence and Humanized Computing

Prasanjit Dey¹,
Chandan Kumar¹,
Mitrabarun Mitra¹,
Richa Mishra¹,
S. K. Chaulya ORCID: orcid.org/0000-0002-5396-0086¹,
G. M. Prasad¹,
S. K. Mandal¹ &
G. Banerjee¹


Abstract

A secure wireless voice communication system for underground miners is essential for efficient and safe mining. Voice over internet protocol (VoIP) is a proven solution for wireless communication in underground mines, where cellular and satellite networks cannot be deployed. However, the security of the wireless network is a major concern for reliable operation of the system. A secure voice communication system has been developed by integrating a VoIP system with a trained deep convolutional neural network (DCNN) model. Experimental results indicated that the voice recognition accuracy of the developed DCNN-based model was 93.7% in a noiseless environment, compared with 82.1% and 79% for the existing K-nearest-neighbour (KNN) and support vector machine (SVM) algorithms, respectively. The voice recognition response times of the DCNN, KNN, and SVM algorithms were 178, 220, and 228 ms, respectively. Thus, deployment of the developed secure and robust voice communication system would improve safety and productivity in underground mines.
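
To put the reported figures side by side, the short Python sketch below computes the accuracy gain (in percentage points) and the relative reduction in response time of the DCNN model over each baseline. It uses only the numbers quoted in the abstract; the dictionary and function names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract (noiseless environment).
accuracy_pct = {"DCNN": 93.7, "KNN": 82.1, "SVM": 79.0}
response_ms = {"DCNN": 178, "KNN": 220, "SVM": 228}


def compare_to_dcnn(baseline):
    """Return (accuracy gain in percentage points, response-time reduction in %)."""
    gain_pp = accuracy_pct["DCNN"] - accuracy_pct[baseline]
    faster_pct = (
        100.0 * (response_ms[baseline] - response_ms["DCNN"]) / response_ms[baseline]
    )
    return round(gain_pp, 1), round(faster_pct, 1)


for name in ("KNN", "SVM"):
    gain_pp, faster_pct = compare_to_dcnn(name)
    print(f"DCNN vs {name}: +{gain_pp} points accuracy, {faster_pct}% lower response time")
```

On these figures, the DCNN model is 11.6 points more accurate than KNN and 14.7 points more accurate than SVM, with response times roughly 19% and 22% lower, respectively.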



Acknowledgements

The authors are grateful to Dr. Pradeep K. Singh, Director of CSIR-Central Institute of Mining and Fuel Research, Dhanbad, India, for permission to publish this paper. The authors are also thankful to the Ministry of Electronics and Information Technology, Government of India, for supporting this study.

Author information

Authors and Affiliations

CSIR-Central Institute of Mining and Fuel Research, Barwa Road, Dhanbad, 826001, India
Prasanjit Dey, Chandan Kumar, Mitrabarun Mitra, Richa Mishra, S. K. Chaulya, G. M. Prasad, S. K. Mandal & G. Banerjee


Corresponding author

Correspondence to S. K. Chaulya.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Dey, P., Kumar, C., Mitra, M. et al. Deep convolutional neural network based secure wireless voice communication for underground mines. J Ambient Intell Human Comput 12, 9591–9610 (2021). https://doi.org/10.1007/s12652-020-02700-w


Received: 02 May 2020
Accepted: 16 November 2020
Published: 02 January 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s12652-020-02700-w
