Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan

Abstract

A review of the literature on prosody reveals a lack of corpora for prosodic studies in Catalan and Spanish. In this paper, we present a corpus intended to fill this gap. The corpus comprises two distinct data sets, a news subcorpus and a dialogue subcorpus, the latter containing either conversational or task-oriented speech. More than 25 h of speech were recorded from twenty-eight speakers per language. Eight of these speakers were professionals (four radio news broadcasters and four advertising actors). All the material presented here has been transcribed, aligned with the acoustic signal, and prosodically annotated. Two major objectives have guided the design of this project: (i) to offer wide coverage of representative real-life communicative situations that allows prosody in these two languages to be characterized; and (ii) to support research studies that contrast the speakers' different speaking styles and discursive practices. All material contained in the corpus is provided under a Creative Commons Attribution 3.0 Unported License.

Notes

  1. http://www.cadenaser.com.

  2. http://www.phon.ucl.ac.uk/home/sampa/spanish.htm.

  3. http://liceu.uab.es/~joaquim/language_resources/SAMPA_Catalan.html.

Acknowledgments

This work has been partly supported by the National R&D&I Plan of the Spanish Government (FFI2008-04982-C03-01/FILO, FFI2008-04982-C03-02/FILO and FFI2008-04982-C03-03/FILO projects).

Author information

Corresponding author

Correspondence to David Escudero.

About this article

Cite this article

Garrido, J.M., Escudero, D., Aguilar, L. et al. Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan. Lang Resources & Evaluation 47, 945–971 (2013). https://doi.org/10.1007/s10579-012-9213-0

Keywords

  • Prosodic corpus
  • Radio news corpus
  • Dialogue corpus
  • Spanish corpus
  • Catalan corpus