Data-Driven Methods for Spoken Language Understanding



Spoken dialogue systems need to be able to interpret the spoken input from the user. This is done by mapping the user's spoken utterance to a representation of the meaning of that utterance, and then passing this representation to the dialogue manager. This process begins with the application of automatic speech recognition (ASR) technology, which maps the speech to hypotheses about the sequence of words in the utterance. It is then the job of spoken language understanding (SLU) to map the word recognition hypotheses to hypothesised meanings. The representation of this meaning is called the semantics of the utterance.
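As a toy illustration of this mapping, the sketch below decodes a single ASR word hypothesis into a slot-value dialogue-act string. The slot lexicon, the `inform`/`request` act types, and the output format are assumptions chosen for illustration only; they are not the decoder described in this chapter.

```python
import re

# Illustrative slot lexicon for a restaurant-information domain
# (values are made up for this example, not taken from a real system).
SLOT_LEXICON = {
    "food": ["italian", "chinese", "indian"],
    "area": ["north", "south", "centre"],
    "pricerange": ["cheap", "expensive", "moderate"],
}

def decode(utterance: str) -> str:
    """Map a word hypothesis to a dialogue-act style semantic string,
    e.g. 'inform(food=italian,pricerange=cheap)'."""
    words = re.findall(r"[a-z]+", utterance.lower())
    # Collect slot=value pairs by simple keyword spotting.
    pairs = [
        f"{slot}={w}"
        for slot, values in SLOT_LEXICON.items()
        for w in words
        if w in values
    ]
    # Crude act-type heuristic: question words signal a request.
    act = "request" if {"where", "what"} & set(words) else "inform"
    return f"{act}({','.join(pairs)})"

print(decode("I want a cheap italian restaurant"))
# prints: inform(food=italian,pricerange=cheap)
```

Keyword spotting of this kind is brittle under recognition errors, which is one motivation for the data-driven SLU methods discussed in this chapter: they learn the mapping from data rather than from hand-written lexicons.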


Keywords: Support Vector Machine, Support Vector Machine Classifier, Semantic Representation, Semantic Tree, Markov Logic Network



This work was partially funded by the EU FP7 Programme under grant agreements 216594 (CLASSiC), 287615 (PARLANCE) and 249119 (META-NET), by the Ministry of Education, Youth and Sports of the Czech Republic under grant agreement LK11221, and by core research funding of Charles University in Prague. The authors would like to thank François Mairesse for discussions about the STC parser, and Lonneke van der Plas and Paola Merlo for discussions about the work on SRL for SLU.



Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  1. Département d’Informatique, Université de Genève, Battelle, Carouge, Switzerland
  2. Faculty of Mathematics and Physics, Charles University in Prague, Prague, Czech Republic
