Improvement of speech signal extraction method using detection filter of energy spectrum entropy

Cluster Computing

Abstract

Research and development have significantly improved the performance of speech recognition systems, but environmental noise remains an active research topic because noise conditions vary so widely. Speech extraction techniques are widely applied to enhance speech signals that are mixed with noise. A least mean square (LMS) adaptive filter is commonly used to help noise estimation and detection algorithms adapt to changing environments, but an LMS filter needs time to adapt and to estimate the signal. That weakness can be overcome by combining energy spectrum entropy with an average estimate LMS (AELMS) filter to detect voice activity robustly in a noisy environment. In this paper, we propose a speech signal extraction method that uses a detection filter based on energy spectrum entropy. The proposed method extracts speech from a noise-polluted signal, using an AELMS filter to reduce the noise and to detect voice activity robustly. The AELMS filter preserves the source features of the speech, limits the degradation of speech information, and reduces the noise in the polluted speech signal. To improve adaptation speed, we calculated an average estimator and controlled the LMS filter step size with a frame measure. An energy spectrum entropy method was used to detect speech in signals synthesized with low-speed and high-speed driving noise. Compared with an existing method based on frame energy, the proposed method improved the start-point error rate of the detected speech by 1.7 % and the end-point error rate by 3.7 %.
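
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of a textbook LMS adaptive filter and of a per-frame spectral entropy measure. It is not the authors' AELMS filter: the average estimator and the frame-based step-size control are not specified in the abstract and are not reproduced here. All function names, the tap count, the step size, and the test signals are illustrative assumptions.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0 .. N//2) of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (bits) of the frame's normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: no spectral distribution to measure
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def lms_filter(reference, desired, num_taps=2, step_size=0.05):
    """Standard LMS: adapt weights so the filtered reference tracks desired.

    Returns (errors, weights). The fixed step_size is an assumption; the
    paper instead controls the step size with a frame measure.
    """
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(desired)):
        # Newest-first window of reference samples, zero-padded at the start.
        x = [reference[n - i] if n - i >= 0 else 0.0 for i in range(num_taps)]
        y = sum(w * xi for w, xi in zip(weights, x))
        e = desired[n] - y
        weights = [w + step_size * e * xi for w, xi in zip(weights, x)]
        errors.append(e)
    return errors, weights


# A pure tone concentrates energy in one bin, so its entropy is near zero.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
tone_entropy = spectral_entropy(tone)

# A unit impulse has a flat spectrum, so its entropy is the maximum, log2(33).
impulse = [1.0] + [0.0] * 63
flat_entropy = spectral_entropy(impulse)

# LMS identifies a hypothetical 2-tap system from noise-free observations.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
desired = [0.5 * reference[n] - 0.2 * (reference[n - 1] if n > 0 else 0.0)
           for n in range(len(reference))]
errors, weights = lms_filter(reference, desired)
```

The entropy contrast between a peaked and a flat spectrum is what makes spectral entropy usable as a voice activity cue, while the number of samples the LMS weights need to settle illustrates the adaptation delay that the abstract identifies as the weakness of a plain LMS filter.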

Acknowledgments

This research was supported by Sangji University Research Fund, 2014.

Author information

Corresponding author

Correspondence to SangYeob Oh.

Cite this article

Chung, K., Oh, S. Improvement of speech signal extraction method using detection filter of energy spectrum entropy. Cluster Comput 18, 629–635 (2015). https://doi.org/10.1007/s10586-015-0429-9
