Vocabulary optimization process using similar phoneme recognition and feature extraction

Oh, Sang Yeob; Chung, Kyungyong

doi:10.1007/s10586-016-0619-0


Published: 26 August 2016

Cluster Computing, Volume 19, pages 1683–1690 (2016)


Abstract

When speech is processed in a noisy environment, the noise must be removed to improve the vocabulary recognition rate. This is done with noise elimination and with feature extraction for model estimation, and the most important step in both is to estimate the noise mixed into the source signal and remove it. In a vocabulary recognition system, unexpected noise in the signal, or quantization noise added during digitization, alters or damages the source signal and lowers the recognition rate. When the source signal has been altered by a mixture of different kinds of noise, a hidden Markov model (HMM) is used for effective noise elimination. The HMM builds its model from extracted features so that it can respond flexibly to the vocabulary variation found in voice, text, and similar data. It is applicable to data that change over time, can form a more effective model as the number of model parameters grows, and can provide a robust model estimate by using a parameter set for structured models. HMM-based vocabulary recognition yields a discriminating distribution of recognition probabilities across the vocabulary models and has low computational complexity, but its recognition rate is relatively low. To address this problem, a method for optimizing the vocabulary recognition model is proposed, based on a similar-phoneme recognition process and efficient feature extraction. During recognition, the similar-phoneme recognition process is applied to the HMM to recognize models adjacent to the model group, and efficient feature extraction is used to optimize the recognition model and raise the recognition rate. For vocabulary composition, a Gaussian-mixture feature-extraction model is optimized and used as the vocabulary recognition model, which is then processed with similar-phoneme recognition.
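As a rough illustration of the HMM-based recognition the abstract refers to, the sketch below scores a feature sequence against two toy word models with Gaussian-mixture emissions and picks the most likely one. It is not the authors' method: the similar-phoneme step, the noise elimination, and the model optimization are not modelled, features are reduced to one dimension, and every name and parameter (`MODELS`, `recognize`, the word labels, the means and variances) is a hypothetical choice made for the example.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """log(p) that returns -inf for p == 0 instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def log_gmm(x, mixture):
    """Log density of a 1-D Gaussian mixture of (weight, mean, var) tuples."""
    return log_sum_exp(
        [safe_log(w) + log_gaussian(x, mu, var) for w, mu, var in mixture]
    )

def forward_log_likelihood(obs, init, trans, state_mixtures):
    """Forward algorithm in the log domain: returns log P(obs | model)."""
    n = len(init)
    alpha = [safe_log(init[s]) + log_gmm(obs[0], state_mixtures[s])
             for s in range(n)]
    for x in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + log_gmm(x, state_mixtures[s])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def recognize(obs, models):
    """Return the best-scoring word label and the per-word log-likelihoods."""
    scores = {word: forward_log_likelihood(obs, *params)
              for word, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical two-state left-to-right word models (illustrative values only).
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "low_high": ([1.0, 0.0], LEFT_TO_RIGHT,
                 [[(1.0, 0.0, 1.0)], [(0.6, 5.0, 1.0), (0.4, 6.0, 1.0)]]),
    "high_low": ([1.0, 0.0], LEFT_TO_RIGHT,
                 [[(0.6, 5.0, 1.0), (0.4, 6.0, 1.0)], [(1.0, 0.0, 1.0)]]),
}

if __name__ == "__main__":
    best, scores = recognize([0.1, -0.2, 4.8, 5.1], MODELS)
    print(best, scores)
```

Working in the log domain avoids the underflow that raw probabilities would cause on longer sequences. A real recognizer would use multi-dimensional features and trained parameters in place of the hand-set values here.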



Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2013R1A1A2059964).

Author information

Authors and Affiliations

Department of Computer Engineering, Gachon University, Bokjeong-dong, Sujeong-gu, Seongnam-si, Gyeonggi-do, 461-701, Korea
Sang Yeob Oh
School of Computer Information Engineering, Sangji University, 83, Sangjidae-gil, Wonju-si, Gangwon-do, 220-702, Korea
Kyungyong Chung


Corresponding author

Correspondence to Kyungyong Chung.


About this article

Cite this article

Oh, S.Y., Chung, K. Vocabulary optimization process using similar phoneme recognition and feature extraction. Cluster Comput 19, 1683–1690 (2016). https://doi.org/10.1007/s10586-016-0619-0


Received: 27 March 2016
Revised: 19 July 2016
Accepted: 10 August 2016
Published: 26 August 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10586-016-0619-0
