A strategic approach to recognize the speech of the children with hearing impairment: different sets of features and models

Multimedia Tools and Applications

Abstract

Automatic speech recognition systems have been developed and tested for recognizing the speech of normal-hearing speakers in various languages. This paper emphasizes the need for a more challenging task: a speaker-independent speech recognition system that recognizes speech uttered by any hearing-impaired (HI) speaker. In this work, gammatone energy features with filters spaced on the equivalent rectangular bandwidth (ERB), mel, and Bark scales, together with MFPLPC features, are used at the front end, and vector quantization (VQ) and multivariate hidden Markov models (MHMM) are used at the back end. The performance of the system is compared across three modeling techniques, VQ, fuzzy C-means (FCM) clustering, and MHMM, for the recognition of isolated digits and simple continuous sentences in Tamil. For speaker-independent isolated digit recognition, the recognition accuracy (RA) is 81.5% when the speech of eight speakers is used for training and the speech of the remaining two speakers is used for testing. When 90% of the data is used for training and 10% for testing, the accuracy is 91% for speaker-independent isolated digit recognition and 87.5% for continuous speech recognition. Accuracy could be improved further with a more extensive database for creating models and templates. Receiver operating characteristic (ROC) curves, which plot the true positive rate against the false positive rate, are used to assess the performance of the system for HI speakers. The system can be used to understand speech uttered by any hearing-impaired speaker and thereby helps provide them with necessary assistance, which can ultimately improve the social status and confidence of hearing-impaired people.
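To make the front-end description concrete, the sketch below shows one way filter center frequencies could be spaced uniformly on the ERB, mel, and Bark scales. The abstract does not say which conversion formulas, how many filters, or which frequency range the paper uses, so everything here is an assumption chosen for illustration: the Glasberg and Moore ERB-rate formula, the common 2595 log10(1 + f/700) mel formula, the Traunmüller Bark approximation, and the hypothetical helper name center_frequencies.

```python
import math

# Common textbook conversions; the paper's exact formulas are not given
# in the abstract, so these are illustrative assumptions.

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_bark(f):
    # Traunmueller (1990) approximation of the Bark scale.
    return 26.81 * f / (1960.0 + f) - 0.53

def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark (valid for z < 26.28).
    return 1960.0 * (z + 0.53) / (26.28 - z)

def hz_to_erb_rate(f):
    # Glasberg and Moore (1990) ERB-rate scale.
    return 21.4 * math.log10(1.0 + 0.00437 * f)

def erb_rate_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

SCALES = {
    "mel": (hz_to_mel, mel_to_hz),
    "bark": (hz_to_bark, bark_to_hz),
    "erb": (hz_to_erb_rate, erb_rate_to_hz),
}

def center_frequencies(scale, n_filters, f_low, f_high):
    """Return n_filters center frequencies (Hz), equally spaced on the
    chosen perceptual scale strictly between f_low and f_high."""
    to_scale, to_hz = SCALES[scale]
    lo, hi = to_scale(f_low), to_scale(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [to_hz(lo + step * (i + 1)) for i in range(n_filters)]

if __name__ == "__main__":
    # Hypothetical settings: 20 filters between 50 Hz and 4 kHz.
    for name in SCALES:
        cf = center_frequencies(name, 20, 50.0, 4000.0)
        print(name, [round(f) for f in cf[:5]], "...")
```

Calling center_frequencies("erb", 20, 50.0, 4000.0) returns the centers at which gammatone filters would be placed under these assumptions; swapping the scale name changes only the spacing, which is the variable the abstract compares across feature sets.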

Data availability

All relevant data are within the paper and its supporting information files.

Acknowledgements

This work is the authors' own; there are no grant or contribution numbers to report.

Author information

Corresponding author

Correspondence to Revathi Arunachalam.

Ethics declarations

Conflict of interest

As the authors of the manuscript, we have no direct financial relationship with any commercial entity mentioned in the paper that might lead to a conflict of interest for any of the authors.

Competing interests

The authors have declared that no competing interests exist.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Arunachalam, R. A strategic approach to recognize the speech of the children with hearing impairment: different sets of features and models. Multimed Tools Appl 78, 20787–20808 (2019). https://doi.org/10.1007/s11042-019-7329-6

  • DOI: https://doi.org/10.1007/s11042-019-7329-6
