Language Resources and Evaluation, Volume 48, Issue 3, pp 419–441

Automatic dialogue act recognition with syntactic features

  • Pavel Král
  • Christophe Cerisara
Original Paper


This work studies the usefulness of syntactic information for automatic dialogue act recognition in Czech. We present several pieces of evidence supporting our claim that syntax might bring valuable information to dialogue act recognition. In particular, we draw a parallel with the related domain of automatic punctuation generation, and we propose a set of syntactic features derived from a deep parse tree, which we use successfully in a Czech dialogue act recognition system based on conditional random fields. Finally, we discuss possible reasons why so few works have exploited this type of information before and propose directions for future research in this area.


Keywords: Dialogue act, Language model, Sentence structure, Speech act, Speech recognition, Syntax



This work has been partly supported by the European Regional Development Fund (ERDF), project “NTIS—New Technologies for Information Society”, European Centre of Excellence, CZ.1.05/1.1.00/02.0090. We would also like to thank Ms. Michala Beranová for some implementation work.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic
  2. Faculty of Applied Sciences, New Technologies for the Information Society (NTIS), University of West Bohemia, Plzeň, Czech Republic
  3. LORIA UMR 7503, Vandoeuvre, France
