Genetic-based approach for cue phrase selection in dialogue act recognition

  • Research Paper
Evolutionary Intelligence

Abstract

Automatic cue phrase selection is a crucial step in designing a dialogue act recognition model with machine learning techniques. The approaches currently in use rely on a specific type of feature selection known as ranking. Despite their computational efficiency in high-dimensional domains, ranking approaches are not optimal with respect to relevance and redundancy. In this paper we propose a genetic-based approach for cue phrase selection which is, essentially, a variable-length genetic algorithm developed to cope with the high dimensionality of the domain. We evaluate the performance of the proposed approach against several ranking approaches. Additionally, we assess its performance in selecting cue phrases enriched with phrase type and phrase position. The results provide experimental evidence of the ability of the genetic-based approach to overcome the drawbacks of the ranking approaches and to exploit cue type and cue position information to improve the selection. Furthermore, we validate the use of the genetic-based approach for machine learning applications by using the selected sets of cue phrases to build a dynamic Bayesian network model for dialogue act recognition. The results show its usefulness for such applications.
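To make the contrast between ranking and a variable-length genetic search concrete, the following is a minimal sketch and not the authors' algorithm. The fitness function (summed relevance minus pairwise redundancy), the operators, the parameter values, and the toy scores are all assumptions chosen for illustration; the abstract does not specify any of them. Each individual is a list of feature indices whose length is free to change under crossover and mutation.

```python
import random

def fitness(subset, relevance, redundancy, penalty=1.0):
    """Hypothetical score: total relevance minus penalised pairwise redundancy."""
    feats = sorted(set(subset))
    rel = sum(relevance[f] for f in feats)
    red = sum(redundancy[a][b] for i, a in enumerate(feats) for b in feats[i + 1:])
    return rel - penalty * red

def crossover(p1, p2, rng):
    """Cut each parent at its own point, so the child length can differ from both."""
    c1 = rng.randint(0, len(p1))
    c2 = rng.randint(0, len(p2))
    child = list(dict.fromkeys(p1[:c1] + p2[c2:]))  # drop repeated genes
    return child or [rng.choice(p1 + p2)]           # never return an empty subset

def mutate(ind, n_features, rng, rate=0.2):
    """Occasionally add one random feature and/or delete one existing feature."""
    ind = list(ind)
    if rng.random() < rate:
        ind.append(rng.randrange(n_features))
    if len(ind) > 1 and rng.random() < rate:
        ind.pop(rng.randrange(len(ind)))
    return list(dict.fromkeys(ind))

def select_features(relevance, redundancy, pop_size=30, generations=40, seed=0):
    """Elitist variable-length GA over subsets of feature indices."""
    rng = random.Random(seed)
    n = len(relevance)
    pop = [rng.sample(range(n), rng.randint(1, n)) for _ in range(pop_size)]

    def score(ind):
        return fitness(ind, relevance, redundancy)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), n, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return sorted(max(pop, key=score))

# Toy data (made up): features 0 and 1 are both relevant but nearly redundant.
relevance = [0.9, 0.85, 0.5, 0.0]
redundancy = [
    [0.0, 0.95, 0.1, 0.05],
    [0.95, 0.0, 0.1, 0.05],
    [0.1, 0.1, 0.0, 0.05],
    [0.05, 0.05, 0.05, 0.0],
]

# Ranking baseline: keep the two individually most relevant features.
top2 = sorted(range(len(relevance)), key=lambda f: relevance[f], reverse=True)[:2]
selected = select_features(relevance, redundancy)
print("ranking:", top2, "genetic search:", selected)
```

On this toy data, ranking keeps features 0 and 1 because each scores highly on its own, even though the pair is almost redundant. Under the assumed fitness the pair 0 and 2 scores higher, which is the kind of subset a search over whole candidate sets can reach and a per-feature ranking cannot.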

Corresponding author

Correspondence to Anwar Ali Yahya.

About this article

Cite this article

Yahya, A.A., Ramli, A.R. Genetic-based approach for cue phrase selection in dialogue act recognition. Evol. Intel. 1, 253–269 (2009). https://doi.org/10.1007/s12065-008-0016-6

  • DOI: https://doi.org/10.1007/s12065-008-0016-6
