AM-FM: Modulation and Demodulation Techniques

Advances in Non-Linear Modeling for Speech Processing

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSPEECHTECH)

Abstract

Speech signals are usually analyzed using the short-time Fourier transform (STFT). The most successful features currently used in both speech recognition and speaker recognition systems are cepstral features, which are based, in one way or another, on the source-filter model of speech production. The source-filter model assumes that the sound source for voiced speech is localized in the larynx and that the vocal tract acts as a convolution filter for the emitted sound. However, it is well known that a significant part of the acoustic information cannot be captured by this linear model. Examples of phenomena not well captured by the source-filter model include unstable airflow, turbulence, and nonlinearities arising from oscillators with time-varying masses.
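
The linear source-filter assumption described in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the function name, the single damped resonance standing in for the vocal tract, and every parameter value (pitch, formant frequency, bandwidth, sampling rate) are assumptions chosen only for illustration.

```python
import numpy as np

def synthesize_source_filter(f0_hz=100.0, formant_hz=500.0, bandwidth_hz=80.0,
                             fs=8000, duration_s=0.05):
    """Hypothetical helper: linear source-filter sketch.

    An idealized glottal impulse train (the source) is convolved with the
    impulse response of one damped resonance (the vocal-tract filter).
    All default values are illustrative assumptions.
    """
    n = int(round(fs * duration_s))

    # Source: one unit impulse per pitch period, localized "at the larynx".
    source = np.zeros(n)
    period = int(round(fs / f0_hz))
    source[::period] = 1.0

    # Filter: impulse response of a single decaying resonance (one formant).
    t = np.arange(n) / fs
    h = np.exp(-np.pi * bandwidth_hz * t) * np.sin(2 * np.pi * formant_hz * t)

    # Output: the vocal tract acts as a convolution filter on the source.
    speech = np.convolve(source, h)[:n]
    return source, h, speech

source, h, speech = synthesize_source_filter()
```

Because the model is linear and time-invariant, the output is just a sum of shifted copies of the same impulse response. Effects such as turbulence or oscillators with time-varying masses have no place in this structure, which is the limitation the abstract points to.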

Copyright information

© 2012 The Author(s)

About this chapter

Cite this chapter

Holambe, R.S., Deshpande, M.S. (2012). AM-FM: Modulation and Demodulation Techniques. In: Advances in Non-Linear Modeling for Speech Processing. SpringerBriefs in Electrical and Computer Engineering. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1505-3_5

  • DOI: https://doi.org/10.1007/978-1-4614-1505-3_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-1504-6

  • Online ISBN: 978-1-4614-1505-3

  • eBook Packages: Engineering (R0)
