Detection of Dialogue Acts Using Perplexity-Based Word Clustering

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4629)


In the present work we used a word clustering algorithm based on the perplexity criterion within a Dialogue Act detection framework, in order to model the structure of a user's speech in a dialogue system. Specifically, we constructed an n-gram model for each target Dialogue Act, computed over the word classes. We then evaluated the performance of our dialogue system on ten different types of dialogue acts, using an annotated database which contains 1,403,985 unique words. The results were promising: we achieved an accuracy of about 70% using trigram-based models.
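
The idea of one class-based n-gram model per Dialogue Act can be illustrated with a minimal sketch. It is not the authors' implementation: the word-to-class map below is written by hand (in the paper it comes from perplexity-based clustering), the act labels and toy utterances are hypothetical, and add-one smoothing is an assumption chosen for brevity. An utterance is assigned to the act whose trigram model gives its class sequence the highest log-probability.

```python
import math
from collections import defaultdict


class ClassNgramModel:
    """Add-one smoothed n-gram model over word-class labels (illustrative only)."""

    def __init__(self, n, num_classes):
        self.n = n
        # Predictable symbols: the word classes, an <unk> class, and </s>.
        self.vocab_size = num_classes + 2
        self.ngram_counts = defaultdict(int)
        self.context_counts = defaultdict(int)

    def _ngrams(self, classes):
        # Pad with start symbols so the first class has a full context.
        padded = ["<s>"] * (self.n - 1) + list(classes) + ["</s>"]
        for i in range(self.n - 1, len(padded)):
            yield tuple(padded[i - self.n + 1:i]), padded[i]

    def train(self, class_sequences):
        for seq in class_sequences:
            for context, target in self._ngrams(seq):
                self.ngram_counts[(context, target)] += 1
                self.context_counts[context] += 1

    def log_prob(self, classes):
        total = 0.0
        for context, target in self._ngrams(classes):
            num = self.ngram_counts.get((context, target), 0) + 1
            den = self.context_counts.get(context, 0) + self.vocab_size
            total += math.log(num / den)
        return total


def to_classes(words, word_to_class):
    # Words outside the class map fall into a single <unk> class.
    return [word_to_class.get(w, "<unk>") for w in words]


def classify(words, models, word_to_class):
    classes = to_classes(words, word_to_class)
    return max(models, key=lambda act: models[act].log_prob(classes))


# Hypothetical hand-made word classes standing in for the clustering output.
word_to_class = {
    "yes": "C_AFFIRM", "yeah": "C_AFFIRM",
    "no": "C_NEG", "nope": "C_NEG",
    "boston": "C_CITY", "denver": "C_CITY",
    "to": "C_PREP", "from": "C_PREP",
    "fly": "C_VERB", "go": "C_VERB",
}
num_classes = len(set(word_to_class.values()))

# Hypothetical toy training utterances, grouped by dialogue act.
training = {
    "affirm": [["yes"], ["yeah", "yes"]],
    "negate": [["no"], ["nope", "no"]],
    "request": [["fly", "to", "boston"],
                ["go", "from", "denver", "to", "boston"]],
}

models = {}
for act, utterances in training.items():
    model = ClassNgramModel(n=3, num_classes=num_classes)
    model.train([to_classes(u, word_to_class) for u in utterances])
    models[act] = model

print(classify(["go", "to", "denver"], models, word_to_class))
```

Because the models are estimated over classes rather than words, the utterance "go to denver" is scored with statistics gathered from "fly to boston", even though that exact word sequence never appears in the toy training data.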


Training Corpus; Dialogue System; Word Class; Word Cluster; Linguistic Data Consortium
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  1. Artificial Intelligence Group, Wire Communications Laboratory, Electrical and Computer Engineering Department, University of Patras, 26500 Rion, Patras, Greece
