Abstract
In everyday environments, speech is subject to various kinds of acoustic interference, and many applications require an effective way to separate the dominant speech from that interference. An ideal hearing system should be able to segregate and recognize auditory events accurately in complex auditory scenes and under adverse conditions. Difficulty in distinguishing a particular speaker from a mixture of unwanted conversations is one of the problems faced by people who wear hearing aids. Separating the dominant speech from other speech signals, and then enhancing it, would therefore help people with hearing impairment. Recent computational auditory scene analysis (CASA) systems in the literature are based on the gammatone filter bank and the short-time Fourier transform (STFT), but the high computational complexity of those models hinders their implementation in digital hearing aids. This paper introduces a cochlear model based on the wavelet packet transform (WPT) and a novel approach for dominant voiced speech segregation. Experiments show that the model improves on competing models in terms of computational complexity and recognition rate.
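To make the idea of a WPT-based filter bank concrete, the following is a minimal sketch of a full wavelet packet decomposition that splits a signal into subbands and measures the energy in each. It is not the authors' cochlear model: the Haar wavelet, the uniform tree, the depth of 3, and the function names `haar_split`, `wavelet_packet`, and `band_energies` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def haar_split(x):
    """One Haar analysis step: return the (low-pass, high-pass) halves of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns 2**levels subbands in natural (tree) order, which is not
    the same as increasing-frequency order.
    """
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

def band_energies(bands):
    """Sum of squared coefficients in each subband."""
    return [sum(c * c for c in band) for band in bands]

# Illustrative input (assumed): a 64-sample sinusoid, decomposed to depth 3.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
subbands = wavelet_packet(signal, levels=3)
energies = band_energies(subbands)
```

Each level costs a number of operations proportional to the signal length, which is one reason a wavelet packet tree can be attractive on constrained hardware. A filter bank meant to mimic the cochlea would more plausibly use a non-uniform tree, splitting low-frequency nodes more deeply than high-frequency ones, rather than the uniform tree assumed here.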
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hamsa, S., Iraqi, Y., Shahin, I., Werghi, N. (2021). Dominant Voiced Speech Segregation and Noise Reduction Pre-processing Module for Hearing Aids and Speech Processing Applications. In: Abraham, A., et al. Proceedings of the 12th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2020). SoCPaR 2020. Advances in Intelligent Systems and Computing, vol 1383. Springer, Cham. https://doi.org/10.1007/978-3-030-73689-7_38
DOI: https://doi.org/10.1007/978-3-030-73689-7_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73688-0
Online ISBN: 978-3-030-73689-7
eBook Packages: Intelligent Technologies and Robotics (R0)