Prosodic Features’ Criterion for Hebrew

  • Ben Fishman
  • Itshak Lapidot
  • Irit Opher
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11107)

Abstract

Prosody provides important information about intention and meaning, and carries clues regarding dialogue turns, phrase emphasis and even the physiological or emotional condition of the speaker. Prosody has been researched extensively by linguists and speech scientists; however, little attention has been given to formulating and ranking the acoustic features that represent prosodic information. This paper aims to define a simple methodology for testing whether a feature conveys prosodic information. This way, we can compare different features and rate them as prosodic or content related (in this paper, the word “content” refers to the verbal information of the utterance). We explore many features using a Hebrew dataset specifically designed for validating prosodic features, and as the first step of our research we chose two prosody classes: neutral and question. We apply our methodology successfully and find that prosodic features are indeed invariant to the content of the utterance, while correlating with prosodic manifestations. We validate our methodology by showing that our ranking of prosodic features yields results similar to those of classification-based feature selection.
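The abstract does not spell out the criterion itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each utterance carries a prosody label (neutral or question) and a content label (which sentence was spoken), and it scores a feature by the fraction of its variance explained by each labeling (eta squared). The function names, the toy values and the choice of eta squared are all assumptions made for this example.

```python
import numpy as np


def variance_explained(values, labels):
    """Fraction of the total variance of `values` explained by grouping on `labels` (eta squared)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grand_mean = values.mean()
    total = ((values - grand_mean) ** 2).sum()
    if total == 0.0:
        return 0.0  # a constant feature carries no information about any labeling
    between = 0.0
    for lab in np.unique(labels):
        group = values[labels == lab]
        between += group.size * (group.mean() - grand_mean) ** 2
    return between / total


def prosody_content_scores(feature, prosody, content):
    """Return (prosody score, content score) for one feature; hypothetical helper."""
    return variance_explained(feature, prosody), variance_explained(feature, content)


# Toy data (invented): 3 sentences, each spoken once per prosody class.
prosody = np.array(["neutral", "question"] * 3)
content = np.repeat(["s1", "s2", "s3"], 2)

pitch_like = np.array([0.0, 1.0, 0.1, 1.1, -0.1, 0.9])    # tracks the prosody class
content_like = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])   # tracks the sentence

pitch_scores = prosody_content_scores(pitch_like, prosody, content)
content_scores = prosody_content_scores(content_like, prosody, content)
print("pitch_like   (prosody, content):", pitch_scores)
print("content_like (prosody, content):", content_scores)
```

Under this sketch, a feature would be rated as prosodic when its prosody score is high and its content score is low, and as content related in the opposite case. Ranking features by the prosody score could then be compared against a classifier-based feature selection, in the spirit of the validation the abstract mentions.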

Keywords

Prosody, Prosodic features, Hebrew database

Notes

Acknowledgments

The authors thank Ella Erlich, Ruth Aloni-Lavi and Noga Hellman for their help with the Hebrew dataset.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Tel Aviv University, Tel Aviv, Israel
  2. Afeka Academic College of Engineering, Tel Aviv, Israel
