
An Artificial Voice Box that Makes Use of Unconventional Methods of Machine Learning

  • Conference paper
  • First Online:
Decision Intelligence (InCITe 2023)

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 1079))


Abstract

Patients who undergo intensive medical procedures to diagnose voice disorders can suffer considerably during the diagnostic process. As a result, automated speech recognition and disorder-diagnosis methods have attracted a great deal of attention in recent years, and both approaches have proven effective. Voice recordings for this study were collected from the Saarbrucken Voice Database. The signals are first preprocessed with a hybrid Wiener filter discrete wavelet transform (HWFDWT) to remove noise. Mel-frequency cepstral coefficients (MFCC) combined with Cat Swarm Optimization are then used to extract features. Finally, the features are passed through a modified optimized back-propagation network (MOBPN) for the classification of disordered voices. In terms of precision, recall, F1-score, and processing time, the proposed classification strategy outperforms the existing Support Vector Machine (SVM) and Back Propagation Neural Network (BPNN) methods. A cognitive speech system also enables people who lack the ability to speak to convey their thoughts and feelings to others in a form that is easy to grasp. It is a device that records the electrical pulses the brain emits and transforms those recordings into synthetic speech. After the brain's electrical activity is recorded, it is sent to a synthesizer for processing; once the signal is decoded, the synthesizer converts it into a voice, which is then passed through an artificial voice box. In this way, brain signals are used to produce an artificial voice.




Corresponding author

Correspondence to Raman Chadha.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chadha, R., Singla, S., Singh, N.T. (2023). An Artificial Voice Box that Makes Use of Unconventional Methods of Machine Learning. In: Murthy, B.K., Reddy, B.V.R., Hasteer, N., Van Belle, JP. (eds) Decision Intelligence. InCITe 2023. Lecture Notes in Electrical Engineering, vol 1079. Springer, Singapore. https://doi.org/10.1007/978-981-99-5997-6_3
