Abstract
When speech is processed in the presence of environmental noise, the noise must be eliminated to improve the vocabulary recognition rate. This process relies on noise elimination and on feature extraction for model estimation. In both the noise-elimination and the model-estimation techniques, the most important step is to estimate the noise mixed into the source signal and remove it. In a vocabulary recognition system, if unexpected noise appears in the signal, or if quantization noise is inherently added to the digital signal, the source signal is altered or damaged, which lowers the recognition rate. When a source signal has been transformed or altered by mixing with diverse kinds of noise, the hidden Markov model (HMM) is used for effective noise elimination. The HMM builds a model by extracting features, so that it can respond flexibly to the diverse vocabulary changes found in voice, text, and similar data. The method is applicable to data that change over time, and it can establish a more effective model as the number of parameters constituting the model grows. It can provide a robust model estimate by using a parameter set for structured models. HMM-based vocabulary recognition shows a discriminating distribution of recognition probability across the recognition vocabulary models and has low computational complexity for recognition, but it produces a relatively low recognition rate. To solve that problem, a vocabulary recognition-model optimization method is proposed, based on a similar-phoneme recognition process and efficient feature extraction. In vocabulary recognition, the similar-phoneme recognition process is applied to the HMM to recognize models adjacent to the model group. Efficient feature extraction is used to optimize the recognition model and thereby enhance the recognition rate. For vocabulary composition, a Gaussian-mixture feature-extraction model is optimized and used as the vocabulary recognition model. The vocabulary recognition model is then processed with similar-phoneme recognition.
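As an illustration of the HMM-based recognition step mentioned above, the following is a minimal sketch, not the authors' implementation. It scores an observation sequence against each word model with the standard forward algorithm and ranks the candidates, so that the runner-up models can be treated as the "adjacent" models of the best match. The discrete two-state models, the names word_a and word_b, and the helper functions make_model and log_forward are all hypothetical and chosen only for this example; the paper's Gaussian-mixture features and optimization procedure are not reproduced here.

```python
import numpy as np


def make_model(pi, A, B):
    """Pack initial, transition and emission probabilities as log arrays."""
    return tuple(np.log(np.asarray(x, dtype=float)) for x in (pi, A, B))


def log_forward(log_pi, log_A, log_B, obs):
    """Return log P(obs | model) for a discrete-observation HMM."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Sum (in the log domain) over previous states, then emit symbol o.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)


# Hypothetical toy models: same transitions and emissions, different start state.
A = [[0.8, 0.2], [0.2, 0.8]]
B = [[0.9, 0.1], [0.2, 0.8]]  # rows: states, columns: observation symbols
models = {
    "word_a": make_model([0.9, 0.1], A, B),
    "word_b": make_model([0.1, 0.9], A, B),
}

obs = [0, 0, 1, 1]  # quantized feature indices (assumed input format)
scores = {name: log_forward(*m, obs) for name, m in models.items()}

# Rank candidates from most to least likely; the top entry is the recognized word.
ranking = sorted(scores, key=scores.get, reverse=True)
best = ranking[0]
print(best, ranking)
```

Working in the log domain avoids numerical underflow on long observation sequences, which is the usual reason for preferring it over multiplying raw probabilities.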
Acknowledgments
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2013R1A1A2059964).
About this article
Cite this article
Oh, S.Y., Chung, K. Vocabulary optimization process using similar phoneme recognition and feature extraction. Cluster Comput 19, 1683–1690 (2016). https://doi.org/10.1007/s10586-016-0619-0
DOI: https://doi.org/10.1007/s10586-016-0619-0