DSP-based voice activity detection and background noise reduction
Speech processing devices such as voice-controlled devices, radios, and cell phones have become increasingly popular in the military, audio forensics, speech recognition, education, and health sectors. In the real world, a speech signal captured during communication always contains background noise. Voice activity detection (VAD) is a main task in speech-related applications, which include speech communication, speech recognition, and speech coding. Noise-reduction schemes for speech communication may increase speech quality and improve working efficiency in military aviation. Most existing algorithms can improve speech quality but are unable to remove the background noise from the speech. This study gives researchers a summary of the challenges of speech communication in the presence of background noise and offers research directions relevant to military personnel and other workforces operating in noisy environments. The results show that the DSP-based voice activity detection and background noise reduction algorithm reduced the spurious values of the speech signal.
Keywords: dsPIC, VAD, DSC, Voice activity detection, Speech technology, Speech processing, Military communications