Intelligent Genetic Fuzzy Inference System for Speech Recognition: An Approach from Low Order Feature Based on Discrete Cosine Transform

Journal of Control, Automation and Electrical Systems

Abstract

This paper proposes an intelligent methodology for speech recognition. In addition to processing with mel-frequency cepstral coefficients, the discrete cosine transform is used to generate a two-dimensional time matrix for each pattern to be recognized. A Mamdani fuzzy inference recognition system is optimized by a genetic algorithm to maximize the number of correctly recognized patterns with a minimum number of encoding parameters. Experimental results for digit recognition in Brazilian Portuguese show the efficiency of the proposed methodology compared to other techniques widely cited in the literature.
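As a rough illustration of the two-dimensional time matrix mentioned in the abstract, the sketch below applies a discrete cosine transform along the time axis of a sequence of cepstral vectors and keeps only the lowest-order terms, so that utterances of different lengths map to matrices of the same size. The function name `dct_time_matrix`, the `order` parameter, the use of a DCT-II, and the 1/N scaling are assumptions made for illustration; the preview does not state the authors' exact formulation, and the input here is random data standing in for real mel-frequency cepstral coefficients.

```python
import numpy as np


def dct_time_matrix(mfcc, order=2):
    """Compress a (frames x coeffs) sequence into an (order x coeffs) matrix.

    Row k holds the k-th DCT-II coefficient, computed along the time axis,
    of each cepstral trajectory. The 1/N scaling is an assumption chosen so
    that row 0 equals the time average of each coefficient.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    n_frames = mfcc.shape[0]
    t = np.arange(n_frames)            # frame index
    k = np.arange(order)[:, None]      # DCT order index, as a column
    # DCT-II basis sampled at the frame midpoints, shape (order, n_frames)
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n_frames))
    return basis @ mfcc / n_frames


# Hypothetical stand-ins for two utterances of different durations,
# each with 4 cepstral coefficients per frame.
rng = np.random.default_rng(0)
short_utterance = rng.normal(size=(37, 4))
long_utterance = rng.normal(size=(82, 4))

# Both map to a fixed-size (order x coeffs) pattern matrix.
print(dct_time_matrix(short_utterance).shape)
print(dct_time_matrix(long_utterance).shape)
```

Keeping only a few low-order rows is one way to obtain a compact, length-independent pattern that a classifier such as a fuzzy inference system could take as input; how many rows and coefficients the authors retain is not given in this preview.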

Acknowledgments

The authors would like to thank FAPEMA for financial support, research group in Computational Intelligence Applied to Technology at the Federal Institute of Education, Science and Technology of the Maranhão (IFMA) by its infrastructure for this research and experimental results, and Ph.D. program in Electrical Engineering at the Federal University of Maranhão (UFMA).

Author information

Correspondence to Washington Silva.

About this article

Cite this article

Silva, W., Serra, G. Intelligent Genetic Fuzzy Inference System for Speech Recognition: An Approach from Low Order Feature Based on Discrete Cosine Transform. J Control Autom Electr Syst 25, 689–698 (2014). https://doi.org/10.1007/s40313-014-0148-0

  • DOI: https://doi.org/10.1007/s40313-014-0148-0
