
Vocabulary optimization process using similar phoneme recognition and feature extraction

Cluster Computing

Abstract

When speech is processed in a noisy environment, the noise must be removed to improve the vocabulary recognition rate. This is done with noise elimination and with feature extraction for model estimation. The most important step in both is to estimate the noise mixed into the source signal and remove it. In a vocabulary recognition system, unexpected noise in the signal, or the quantization noise inherent in digitized signals, alters or damages the source signal and lowers the recognition rate. When a source signal has been altered by a mixture of different kinds of noise, a hidden Markov model (HMM) is used for effective noise elimination. An HMM builds a model from extracted features so that it can respond flexibly to the variation in vocabulary found in speech, text, and similar data. The method applies to data that change over time, and it can yield a more effective model as the number of model parameters grows. Using a parameter set for structured models gives a robust model estimate. HMM-based vocabulary recognition yields a discriminative distribution of recognition probabilities across the vocabulary models and has low computational complexity, but its recognition rate is relatively low. To address this, a vocabulary recognition model optimization method is proposed that is based on a similar-phoneme recognition process and efficient feature extraction. The similar-phoneme recognition process is applied to the HMM so that models adjacent to the model group are recognized, and efficient feature extraction is used to optimize the recognition model and raise the recognition rate. For vocabulary composition, a Gaussian-mixture feature-extraction model is optimized and used as the vocabulary recognition model, which is then processed with similar-phoneme recognition.
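
As a rough illustration of the baseline the abstract refers to, the sketch below scores an observation sequence against one HMM per vocabulary word using the standard forward algorithm, then picks the best-scoring word. It is not the authors' optimized method: it uses discrete observation symbols instead of Gaussian-mixture features, and the function names (`forward_log_likelihood`, `recognize`) and the two toy word models are assumptions made for this example only.

```python
import math

def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, via the forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return logsumexp(alpha)

def recognize(obs, models):
    """Return the word whose HMM assigns obs the highest likelihood."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))

# Hypothetical two-state, left-to-right word models over symbols {0, 1}.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.2, 0.8]]),
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([1, 1, 0, 0], MODELS))
```

Working in the log domain avoids underflow on long sequences. In a continuous-feature recognizer of the kind the abstract describes, the `emit` lookup would be replaced by a Gaussian-mixture density evaluated on each feature vector.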



Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2013R1A1A2059964).

Author information

Corresponding author

Correspondence to Kyungyong Chung.


Cite this article

Oh, S.Y., Chung, K. Vocabulary optimization process using similar phoneme recognition and feature extraction. Cluster Comput 19, 1683–1690 (2016). https://doi.org/10.1007/s10586-016-0619-0


  • DOI: https://doi.org/10.1007/s10586-016-0619-0
