Isolated Speech Recognition and Its Transformation in Visual Signs

Mian Qaisar, Saeed

doi:10.1007/s42835-018-00071-z

Original Article
Published: 23 January 2019

Volume 14, pages 955–964, (2019)
Journal of Electrical Engineering & Technology

Saeed Mian Qaisar¹

Abstract

This paper proposes a precise approach for transforming isolated speech commands into visual signs. The idea is to combine effective speech processing and analysis methods with a systematic image display. To this end, an approach for the automatic recognition of isolated speech messages is proposed. The incoming speech segment is enhanced by applying pre-emphasis filtering, noise thresholding and zero alignment. The Mel-frequency cepstral coefficients (MFCCs), Delta coefficients and Delta–Delta coefficients are then extracted from the enhanced speech segment. The dynamic time warping (DTW) technique is employed to compare these extracted features with the reference templates, and the comparison outcomes are used to make the classification decision. The classification decision is transformed into a systematic sign. The system functionality is tested with an experimental setup and the results are presented. An average isolated word recognition accuracy of 99% is achieved. The proposed approach has the potential to be employed in applications such as visual arts, industrial and noisy environments, integration of people with impaired hearing, and education.
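
The template-matching stage described above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the pre-emphasis coefficient of 0.97, the Euclidean frame distance and the nearest-template decision rule are all assumptions chosen for illustration. The feature sequences (for example MFCC, Delta and Delta–Delta vectors per frame) are assumed to be already extracted and supplied as arrays of shape (frames, dims).

```python
import numpy as np


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here; the paper's
    abstract does not state the coefficient.
    """
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


def dtw_distance(a, b):
    """Accumulated DTW cost between two feature sequences.

    a and b are arrays of shape (frames, dims). The local cost is the
    Euclidean distance between frames (an assumption of this sketch).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Allow insertion, deletion, or a diagonal match step.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]


def classify(features, templates):
    """Return the label of the reference template with the lowest DTW cost.

    templates is a hypothetical dict mapping a word label to its
    reference feature sequence.
    """
    return min(templates,
               key=lambda label: dtw_distance(features, templates[label]))


# Toy one-dimensional "features" standing in for real MFCC sequences.
templates = {
    "up": np.array([[0.0], [1.0], [2.0]]),
    "down": np.array([[2.0], [1.0], [0.0]]),
}
query = np.array([[0.0], [0.0], [1.0], [2.0], [2.0]])
label = classify(query, templates)
```

In this toy case the query is a time-stretched copy of the "up" template, so DTW aligns it to that template despite the different lengths. In a system like the one described, the resulting label would then select the visual sign to display.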

Acknowledgements

The author is thankful to the anonymous reviewers for their valuable comments. This project is funded by Effat University under the approval number UC#7/28Feb 2018/10.2-44f.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Effat University, Jeddah, Saudi Arabia
Saeed Mian Qaisar

Corresponding author

Correspondence to Saeed Mian Qaisar.

About this article

Cite this article

Mian Qaisar, S. Isolated Speech Recognition and Its Transformation in Visual Signs. J. Electr. Eng. Technol. 14, 955–964 (2019). https://doi.org/10.1007/s42835-018-00071-z

Received: 02 March 2018
Revised: 02 November 2018
Accepted: 06 December 2018
Published: 23 January 2019
Issue Date: 01 March 2019
DOI: https://doi.org/10.1007/s42835-018-00071-z
