Cognitively Inspired Feature Extraction and Speech Recognition for Automated Hearing Loss Testing

Nisar, Shibli; Tariq, Muhammad; Adeel, Ahsan; Gogate, Mandar; Hussain, Amir

doi:10.1007/s12559-018-9607-4

Published: 13 February 2019

Cognitive Computation, Volume 11, pages 489–502 (2019)

Shibli Nisar¹, Muhammad Tariq¹,², Ahsan Adeel³,⁴,⁵ (ORCID: orcid.org/0000-0001-9153-6756), Mandar Gogate³ & Amir Hussain⁶,⁷

Abstract

Hearing loss, a partial or total inability to hear, is one of the most commonly reported disabilities. A hearing test can be carried out by an audiologist to assess a patient’s auditory system. However, the procedure requires an appointment, which can result in delays and practitioner fees. In addition, equipment and qualified practitioners are often unavailable, particularly in remote areas. This paper presents a novel approach that automatically identifies hearing impairment using cognitively inspired feature extraction and speech recognition. The proposed system uses an adaptive filter bank with weighted Mel-frequency cepstral coefficients for feature extraction. The adaptive filter bank implementation is inspired by the principle of spectrum sensing in cognitive radio, which is aware of its environment and adapts to statistical variations in the input stimuli by learning from that environment. Comparative performance evaluation demonstrates the potential of our automated hearing test method to achieve results comparable to the clinical ground truth established by the expert audiologist’s tests. The overall absolute error of the proposed model when compared with the expert audiologist test is less than 4.9 dB and 4.4 dB for the pure tone and speech audiometry tests, respectively. The overall accuracy achieved is 96.67% with a hidden Markov model (HMM). The proposed method potentially offers a second opinion to audiologists, and serves as a cost-effective pre-screening test to predict hearing loss at an early stage. In future work, the authors intend to explore the application of advanced deep learning and optimization approaches to further enhance the performance of the automated testing prototype, considering imperfect datasets with real-world background noise.
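The abstract names Mel-frequency cepstral coefficients (MFCCs) as the basis of the feature extractor. As a rough illustration only, a conventional MFCC computation for a single frame might look like the sketch below. It uses a fixed triangular mel filter bank and is not the adaptive, weighted variant the authors propose. The function names, the 26-filter and 13-coefficient settings, and the 16 kHz test tone are all assumptions made for the example and do not come from the paper.

```python
import numpy as np


def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the paper may use another).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Fixed triangular filters, evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):        # rising edge of the triangle
            fbank[i, k] = (k - left) / (centre - left)
        for k in range(centre, right):       # peak and falling edge
            fbank[i, k] = (right - k) / (right - centre)
    return fbank


def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """Conventional (unweighted) MFCCs for one frame of audio samples."""
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    # DCT-II basis (unnormalised) written out explicitly to stay NumPy-only.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return basis @ log_energies


# Hypothetical usage: one 512-sample frame of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(512) / sample_rate
coeffs = mfcc_frame(np.sin(2 * np.pi * 1000.0 * t), sample_rate)
```

In a full recogniser, coefficients like these would be computed for every frame of an utterance and passed to a classifier such as the HMM mentioned in the abstract. An adaptive filter bank would replace the fixed `mel_filterbank` step, but its details are not given in this abstract.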

Acknowledgments

The authors would like to acknowledge Prof. Hidayat Ullah from Khyber Teaching Hospital (ENT department), Pakistan, for his kind support with pure tone and speech audiometry. The authors would also like to acknowledge the audiologists Kamran Mulk from Rehman Medical Institute and Muhammad Saeed from Khyber Teaching Hospital for highlighting pure tone and speech audiometry-related issues, and Dr. Muhammad Arsalan Khan from Khyber Teaching Hospital for inviting participants (patients). Lastly, the authors gratefully acknowledge the support of deepCI and Taibah Valley (Taibah University, Madinah, Saudi Arabia).

Funding

This research was supported by deepCI and Taibah Valley (Taibah University, Saudi Arabia).

Author information

Authors and Affiliations

National University of Computer and Emerging Sciences, H-11/4, Islamabad, Pakistan
Shibli Nisar & Muhammad Tariq
Princeton University, Princeton, NJ, 085447, USA
Muhammad Tariq
University of Stirling, Stirling, FK9 4LA, UK
Ahsan Adeel & Mandar Gogate
deepCI, Edinburgh, EH16 5XW, UK
Ahsan Adeel
School of Mathematics and Computer Science, University of Wolverhampton, Wolverhampton, UK
Ahsan Adeel
Edinburgh Napier University, School of Computing, Edinburgh, EH10 5DT, UK
Amir Hussain
Taibah Valley, Taibah University, Madinah, Saudi Arabia
Amir Hussain

Corresponding author

Correspondence to Ahsan Adeel.

Ethics declarations

This manuscript has not been published in whole or in part elsewhere and is not currently being considered for publication in another journal. All authors have been personally and actively involved in substantive work leading to the manuscript, and will hold themselves jointly and individually responsible for its content.

Conflict of interests

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nisar, S., Tariq, M., Adeel, A. et al. Cognitively Inspired Feature Extraction and Speech Recognition for Automated Hearing Loss Testing. Cogn Comput 11, 489–502 (2019). https://doi.org/10.1007/s12559-018-9607-4

Received: 05 April 2018
Accepted: 23 October 2018
Published: 13 February 2019
Issue Date: 15 August 2019
DOI: https://doi.org/10.1007/s12559-018-9607-4
