Approximation Neural Network for Phoneme Synthesis

Crisan, Marius

doi:10.1007/978-3-319-11933-5_1

Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 327)

Abstract

The paper presents a dynamic method for phoneme synthesis using an elemental-based concatenation approach. The vocal sound waveform can be decomposed into elemental patterns whose shape changes slightly as they chain one after another in time, while they keep the same dynamics, which are specific to each phoneme. An approximation network, that is, a radial basis function (RBF) network, is used to generate the elementals in time, with the possibility of controlling the characteristics of the sound signal. With this technique, a quite realistic imitation of a natural sound was obtained.
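The abstract does not give the network architecture or training procedure, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes Gaussian basis functions with evenly spaced centers, output weights fitted by regularized least squares, and a synthetic one-period waveform (`toy_elemental`, a hypothetical stand-in for a recorded elemental). The per-period `gains` are likewise an assumed, simple way to let each chained elemental differ slightly from the previous one.

```python
import math

def design_row(t, centers, sigma):
    # Gaussian basis activations for a single time point t.
    return [math.exp(-((t - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]

def solve(a, b):
    # Solve a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rbf(ts, ys, centers, sigma, ridge=1e-6):
    # Output weights from the normal equations (Phi^T Phi + ridge*I) w = Phi^T y.
    rows = [design_row(t, centers, sigma) for t in ts]
    n = len(centers)
    a = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(a, b)

def rbf_eval(t, centers, sigma, weights):
    # Network output: weighted sum of the Gaussian activations.
    return sum(w * g for w, g in zip(weights, design_row(t, centers, sigma)))

def toy_elemental(t):
    # Hypothetical one-period pattern on [0, 1); NOT data from the paper.
    return math.exp(-3.0 * t) * (math.sin(2 * math.pi * 2 * t)
                                 + 0.5 * math.sin(2 * math.pi * 5 * t))

def synthesize(centers, sigma, weights, gains, samples_per_period):
    # Chain one elemental per gain value; each period is rescaled slightly.
    out = []
    for g in gains:
        for i in range(samples_per_period):
            out.append(g * rbf_eval(i / samples_per_period, centers, sigma, weights))
    return out

N_CENTERS = 25  # assumed network size
centers = [i / (N_CENTERS - 1) for i in range(N_CENTERS)]
sigma = 1.0 / (N_CENTERS - 1)  # width equal to the center spacing

ts = [i / 199 for i in range(200)]
ys = [toy_elemental(t) for t in ts]
weights = fit_rbf(ts, ys, centers, sigma)

rms = math.sqrt(sum((rbf_eval(t, centers, sigma, weights) - y) ** 2
                    for t, y in zip(ts, ys)) / len(ts))
signal = synthesize(centers, sigma, weights,
                    gains=[1.0, 0.95, 0.9], samples_per_period=100)
print(f"RMS fit error: {rms:.4f}, samples generated: {len(signal)}")
```

In this sketch, changing `samples_per_period` stretches or compresses each elemental in time, and changing `gains` alters its amplitude from one period to the next. Both are illustrative control knobs rather than the controls described in the paper.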

Author information

Authors and Affiliations

Department of Computer and Software Engineering, Polytechnic University of Timisoara, Blvd. V. Parvan 2, 300223, Timisoara, Romania
Marius Crisan

Corresponding author

Correspondence to Marius Crisan.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Vishakapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
Bhubaneswar Engineering College, Bhubaneswar, Odisha, India
Bhabendra Narayan Biswal
University of Hyderabad, Hyderabad, Andhra Pradesh, India
Siba K. Udgata
Department of Computer Science and Engineering, University of Kalyani, Faculty of Engg., Tech. & Management, Kalyani, West Bengal, India
J.K. Mandal

About this paper

Cite this paper

Crisan, M. (2015). Approximation Neural Network for Phoneme Synthesis. In: Satapathy, S., Biswal, B., Udgata, S., Mandal, J. (eds) Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. Advances in Intelligent Systems and Computing, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-319-11933-5_1

DOI: https://doi.org/10.1007/978-3-319-11933-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11932-8
Online ISBN: 978-3-319-11933-5
eBook Packages: Engineering, Engineering (R0)
