Exploring Features and Classifiers for Dialogue Act Segmentation

  • Harm op den Akker
  • Christian Schulz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5237)


This paper takes a classical machine learning approach to the task of Dialogue Act segmentation. It presents a thorough empirical evaluation of features, both those used in other studies and new ones. An exploratory study of the effectiveness of different classification methods compares 29 classifiers implemented in WEKA. The output of the developed classifier is examined closely, and possible points of improvement are identified.




Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Harm op den Akker
    • 1
  • Christian Schulz
    • 2
  1. Twente University, Enschede, The Netherlands
  2. Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Saarbrücken, Germany
