Approximation Neural Network for Phoneme Synthesis

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 327)

Abstract

The paper presents a dynamic method for phoneme synthesis based on concatenating elemental patterns. A vocal sound waveform can be decomposed into elemental patterns whose shape changes slightly as they follow one another in time, while the underlying dynamics, which are specific to each phoneme, stay the same. An approximation (RBF) network generates the elementals over time and makes it possible to control the characteristics of the sound signal. With this technique, a fairly realistic imitation of a natural sound was obtained.
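
The idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not give the network architecture, the training procedure, or the source of the elementals, so everything below is an assumption chosen for illustration. The sketch fits a Gaussian RBF network to one toy elemental (a damped sinusoid standing in for a single recorded pattern), then chains slightly perturbed copies of it. The names `rbf_design`, `fit_rbf`, `synthesize`, and the `jitter` parameter are hypothetical.

```python
import numpy as np

def rbf_design(t, centers, width):
    """Gaussian RBF activations: one row per time sample, one column per center."""
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def fit_rbf(t, target, centers, width, ridge=1e-8):
    """Regularized least-squares output weights for fixed centers (an assumption)."""
    phi = rbf_design(t, centers, width)
    gram = phi.T @ phi + ridge * np.eye(len(centers))
    return np.linalg.solve(gram, phi.T @ target)

def synthesize(weights, centers, width, n_samples, n_periods, jitter=0.02, seed=0):
    """Chain n_periods elementals, perturbing the weights slightly each time."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    phi = rbf_design(t, centers, width)
    chunks = []
    for _ in range(n_periods):
        # Small random scaling of the weights changes the shape a little
        # while keeping the overall pattern of the elemental.
        w = weights * (1.0 + jitter * rng.standard_normal(weights.shape))
        chunks.append(phi @ w)
    return np.concatenate(chunks)

# Toy elemental: a hypothetical stand-in for one pattern cut from a recording.
n = 200
t = np.linspace(0.0, 1.0, n, endpoint=False)
elemental = (np.sin(2 * np.pi * t) + 0.3 * np.sin(6 * np.pi * t)) * np.exp(-3 * t)

centers = np.linspace(0.0, 1.0, 25)   # assumed: evenly spaced centers
width = 0.05                          # assumed: fixed Gaussian width
weights = fit_rbf(t, elemental, centers, width)

approx = rbf_design(t, centers, width) @ weights
rms_error = float(np.sqrt(np.mean((approx - elemental) ** 2)))
signal = synthesize(weights, centers, width, n, n_periods=10)
```

Here `jitter` plays the role of the control the abstract mentions: setting it to zero repeats the same elemental exactly, while larger values let the shape drift from one period to the next. A real system would presumably derive such variations from the measured signal rather than from random noise.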

Author information

Corresponding author

Correspondence to Marius Crisan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Crisan, M. (2015). Approximation Neural Network for Phoneme Synthesis. In: Satapathy, S., Biswal, B., Udgata, S., Mandal, J. (eds) Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. Advances in Intelligent Systems and Computing, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-319-11933-5_1

  • DOI: https://doi.org/10.1007/978-3-319-11933-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11932-8

  • Online ISBN: 978-3-319-11933-5

  • eBook Packages: Engineering, Engineering (R0)
