
An Artificial Voice Box that Makes Use of Unconventional Methods of Machine Learning

  • Conference paper
  • First Online:
Decision Intelligence (InCITe 2023)

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 1079))


Abstract

Patients who undergo intensive medical procedures to diagnose voice disorders can suffer considerably during the diagnostic process. As a result, automated speech recognition and disorder-diagnosis methods have attracted a great deal of attention in recent years, and both approaches have proven effective. Voice recordings for this study were collected from the Saarbrucken Voice Database. The signals are first preprocessed with a hybrid Wiener filter discrete wavelet transform (HWFDWT) to remove noise. Mel-frequency cepstral coefficients (MFCC) combined with Cat Swarm Optimization are then used to extract features. Finally, the features are passed through a modified optimized back-propagation network (MOBPN) for the classification of disordered voices. In terms of precision, recall, F1-score, and processing time, the proposed classification strategy outperforms the existing Support Vector Machine (SVM) and Back Propagation Neural Network (BPNN) methods. A cognitive speech system also enables people who lack the ability to speak to convey their thoughts and feelings to others in a form that is easy to grasp. It is a device that records the electrical pulses the brain emits and transforms those recordings into synthetic speech. After the brain's electrical activity is recorded, it is sent to a synthesizer for processing; once the signal is decoded, the synthesizer converts it into a voice, which is then passed through an artificial voice box. In this way, brain signals are used to produce an artificial voice.




Corresponding author

Correspondence to Raman Chadha.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chadha, R., Singla, S., Singh, N.T. (2023). An Artificial Voice Box that Makes Use of Unconventional Methods of Machine Learning. In: Murthy, B.K., Reddy, B.V.R., Hasteer, N., Van Belle, JP. (eds) Decision Intelligence. InCITe 2023. Lecture Notes in Electrical Engineering, vol 1079. Springer, Singapore. https://doi.org/10.1007/978-981-99-5997-6_3
