Improving Naturalness in Speech Synthesis Using Fuzzy Logic

Shah, B. Gargi; Sajja, S. Priti

doi:10.1007/978-981-99-0769-4_22


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 645)

Included in the following conference series:

International Conference on Smart Trends in Computing and Communications


Abstract

Text-to-speech (TTS) synthesis systems have been developed for Indian languages for a few decades, but very little work has targeted the Gujarati language specifically. Synthesized speech still does not sound like natural human speech, and naturalness is a key quality parameter in speech synthesis. This paper proposes a method for improving the naturalness of synthesized Gujarati speech using fuzzy logic. The pause (silence) between words is an important feature of speech. The pause is not necessarily the same after each word in a sentence; it depends on the characteristics of the language and on other parameters of the sentence. Within the classic TTS architecture, fuzzy logic is proposed as a new approach to calculating the pause to be applied after each word. The system takes a sentence or paragraph as input and works with the derived variables word Importance, Sentence Size, and Position in Sentence. The fuzzy logic produces the pause, in seconds, to be applied after each word. The membership values of the derived variables are calculated using a straight-line formula. The developed TTS system is tested on a Gujarati-language SARS-CoV-2 (COVID-19) news dataset, built by collecting news lines from the websites of popular Gujarati news channels. The paper describes the implementation of fuzzy logic aimed at achieving a more natural-sounding effect in the generated speech.
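The pause calculation described in the abstract can be illustrated with a minimal sketch. The abstract names only the three input variables, the output (a pause in seconds), and the use of a straight-line membership formula. Everything else below is an assumption made for illustration: the breakpoints, the four rules, the three pause lengths, and the weighted-average (zero-order Sugeno style) defuzzification are hypothetical and are not taken from the paper.

```python
def rising(x, a, b):
    """Straight-line membership: 0 at or below a, 1 at or above b, linear between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def falling(x, a, b):
    """Complement of rising: 1 at or below a, 0 at or above b."""
    return 1.0 - rising(x, a, b)


# Hypothetical output pause lengths in seconds (not from the paper).
SHORT, MEDIUM, LONG = 0.05, 0.15, 0.30


def pause_after_word(importance, sentence_size, position):
    """Return a pause in seconds to apply after one word.

    importance    -- assumed score in [0, 1]
    sentence_size -- number of words in the sentence
    position      -- assumed relative position in [0, 1] (0 = first word, 1 = last)
    """
    # Fuzzification with assumed breakpoints.
    imp_high = rising(importance, 0.3, 0.8)
    imp_low = falling(importance, 0.3, 0.8)
    long_sentence = rising(sentence_size, 5, 15)
    near_end = rising(position, 0.6, 1.0)
    not_end = falling(position, 0.6, 1.0)

    # Hypothetical rule base: (firing strength, pause); AND is taken as min.
    rules = [
        (min(imp_low, not_end), SHORT),
        (min(imp_high, not_end), MEDIUM),
        (min(long_sentence, not_end), MEDIUM),
        (near_end, LONG),
    ]

    # Weighted-average defuzzification over the rule outputs.
    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return SHORT  # defensive fallback if no rule fires
    return sum(weight * pause for weight, pause in rules) / total


if __name__ == "__main__":
    # Placeholder importance scores for a five-word sentence.
    scores = [0.2, 0.9, 0.4, 0.1, 0.7]
    n = len(scores)
    for i, score in enumerate(scores):
        pause = pause_after_word(score, n, i / (n - 1))
        print(f"word {i + 1}: pause {pause:.3f} s")
```

In this sketch the last word of a sentence always receives the longest pause, and important words in the middle of a sentence receive a medium pause. A real system would tune the breakpoints and rules against recorded Gujarati speech.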



Author information

Authors and Affiliations

Department of Computer Science, Veer Narmad South Gujarat University, Udhna-Magdalla Road, Surat, 395007, India
B. Gargi Shah
Department of Computer Science and Technology, Sardar Patel University, Nana Bazaar, Vallabh Vidhyanagar, Dist., Anand, India
S. Priti Sajja


Corresponding author

Correspondence to B. Gargi Shah.

Editor information

Editors and Affiliations

University of the Ryukyus, Nishihara, Japan
Tomonobu Senjyu
Khon Kaen University, Khon Kaen, Thailand
Chakchai So–In
Global Knowledge Research Foundation, Ahmedabad, India
Amit Joshi


About this paper

Cite this paper

Shah, B., Sajja, S. (2023). Improving Naturalness in Speech Synthesis Using Fuzzy Logic. In: Senjyu, T., So–In, C., Joshi, A. (eds) Smart Trends in Computing and Communications. SMART 2023. Lecture Notes in Networks and Systems, vol 645. Springer, Singapore. https://doi.org/10.1007/978-981-99-0769-4_22


DOI: https://doi.org/10.1007/978-981-99-0769-4_22
Published: 15 June 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0768-7
Online ISBN: 978-981-99-0769-4
eBook Packages: Intelligent Technologies and Robotics (R0)
