A Machine Learning Approach to Speech Act Classification Using Function Words

O’Shea, James; Bandar, Zuhair; Crockett, Keeley

doi:10.1007/978-3-642-13541-5_9

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6071)

Abstract

This paper presents a novel technique for classifying sentences as Dialogue Acts, based on structural information contained in function words. It focuses on distinguishing questions from non-questions, a generally useful task in agent-based systems. The proposed technique extracts salient features by replacing each function word with its own numeric token and replacing every content word with a single standard numeric wildcard token. A Decision Tree, a well-established classification technique, is used as the classifier. Experiments provide evidence of potential for highly effective classification, with a significant result on a challenging dataset before any optimisation of feature extraction has taken place.
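
The feature extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function-word list, the token numbering, the wildcard value, the padding value and the fixed vector length are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical, very small function-word list (assumption; the list used
# in the paper is not given in the abstract).
FUNCTION_WORDS = ["a", "are", "do", "how", "is", "the", "to", "what", "where", "you"]

# Each function word receives its own numeric token, starting at 1.
TOKEN_OF = {word: i for i, word in enumerate(FUNCTION_WORDS, start=1)}

WILDCARD = 0    # assumed token shared by every content word
PAD = -1        # assumed filler for sentences shorter than the vector
VECTOR_LEN = 8  # assumed fixed feature-vector length


def encode(sentence):
    """Map a sentence to a fixed-length list of numeric tokens."""
    words = re.findall(r"[a-z']+", sentence.lower())
    # Function words keep their identity; content words collapse to WILDCARD.
    tokens = [TOKEN_OF.get(w, WILDCARD) for w in words[:VECTOR_LEN]]
    # Pad on the right so every sentence yields a vector of the same length.
    return tokens + [PAD] * (VECTOR_LEN - len(tokens))


print(encode("Where is the station?"))   # question
print(encode("The station is closed."))  # non-question
```

Under these assumptions the two sentences share the same content word but produce different token patterns, because only the function words and their positions are preserved. Vectors of this kind could then be passed, with question or non-question labels, to any decision tree learner.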

Author information

Authors and Affiliations

Department of Computing and Mathematics, Manchester Metropolitan University, United Kingdom
James O’Shea, Zuhair Bandar & Keeley Crockett

Editor information

Editors and Affiliations

Gdynia Maritime University, 81-221, Gdynia, Poland
Piotr Jędrzejowicz
Institute of Informatics, Wroclaw University of Technology, Str. Wyb. Wyspianskiego 27, 50-370, Wroclaw, Poland
Ngoc Thanh Nguyen
University of Brighton, BN2 4GJ, Brighton, United Kingdom
Robert J. Howlet
University of South Australia, 5095, Mawson Lakes, SA, Australia
Lakhmi C. Jain

About this paper

Cite this paper

O’Shea, J., Bandar, Z., Crockett, K. (2010). A Machine Learning Approach to Speech Act Classification Using Function Words. In: Jędrzejowicz, P., Nguyen, N.T., Howlet, R.J., Jain, L.C. (eds) Agent and Multi-Agent Systems: Technologies and Applications. KES-AMSTA 2010. Lecture Notes in Computer Science, vol 6071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13541-5_9

DOI: https://doi.org/10.1007/978-3-642-13541-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13540-8
Online ISBN: 978-3-642-13541-5
eBook Packages: Computer Science, Computer Science (R0)
