Combined Optimization of Feature Selection and Algorithm Parameters in Machine Learning of Language

  • Walter Daelemans
  • Véronique Hoste
  • Fien De Meulder
  • Bart Naudts
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2837)


Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing, used (i) to investigate which machine learning algorithms have the ‘right bias’ to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of, and interaction between, feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach and more reliable comparisons.
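The idea of searching feature subsets and algorithm parameters together can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the k-nearest-neighbour learner with an overlap metric, the candidate values in `K_VALUES`, and all function names are assumptions chosen for illustration. Each individual in the genetic algorithm encodes a feature mask and a value of k, and its fitness is the leave-one-out accuracy of the learner under that joint setting.

```python
import random
from collections import Counter

K_VALUES = (1, 3, 5, 7)  # hypothetical candidate values for the k parameter

def make_toy_data(rng, n=60, n_features=6):
    """Synthetic symbolic data: the label depends only on features 0 and 1."""
    data = []
    for _ in range(n):
        x = tuple(rng.choice("abc") for _ in range(n_features))
        data.append((x, "s1" if x[0] == x[1] else "s2"))
    return data

def knn_loo_accuracy(data, mask, k):
    """Leave-one-out accuracy of k-NN using an overlap metric on selected features."""
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for i, (xi, yi) in enumerate(data):
        dists = []
        for j, (xj, yj) in enumerate(data):
            if i != j:
                d = sum(1 for m, a, b in zip(mask, xi, xj) if m and a != b)
                dists.append((d, j, yj))
        dists.sort()
        votes = Counter(y for _, _, y in dists[:k])
        correct += votes.most_common(1)[0][0] == yi
    return correct / len(data)

def random_individual(rng, n_features):
    mask = tuple(rng.random() < 0.5 for _ in range(n_features))
    return (mask, rng.choice(K_VALUES))

def crossover(rng, a, b):
    """Uniform crossover over the mask bits and the k parameter."""
    mask = tuple(rng.choice(pair) for pair in zip(a[0], b[0]))
    return (mask, rng.choice((a[1], b[1])))

def mutate(rng, ind, rate=0.1):
    mask = tuple((not bit) if rng.random() < rate else bit for bit in ind[0])
    k = rng.choice(K_VALUES) if rng.random() < rate else ind[1]
    return (mask, k)

def optimize(data, n_features, rng, pop_size=12, generations=8):
    cache = {}

    def fitness(ind):
        if ind not in cache:
            cache[ind] = knn_loo_accuracy(data, *ind)
        return cache[ind]

    # Baseline: all features, k=1. Seeding it plus elitism means the
    # returned individual is never worse than the baseline on this fitness.
    baseline = (tuple([True] * n_features), 1)
    pop = [baseline] + [random_individual(rng, n_features) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:2]
        children = []
        while len(children) < pop_size - len(elite):
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            children.append(mutate(rng, crossover(rng, p1, p2)))
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best), fitness(baseline)

rng = random.Random(0)
data = make_toy_data(rng)
best, best_acc, base_acc = optimize(data, 6, rng)
print("best (mask, k):", best)
print("accuracy:", best_acc, "baseline:", base_acc)
```

Because the fitness here is measured on the same data that drives the search, the resulting accuracy is optimistic. A comparison between learners would still need a separate held-out evaluation of the selected setting.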


Keywords: Feature Selection, Algorithm Parameter, Inductive Logic Programming, Word Sense Disambiguation, Combine Optimization
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Walter Daelemans (1)
  • Véronique Hoste (1)
  • Fien De Meulder (1)
  • Bart Naudts (2)
  1. CNTS Language Technology Group, University of Antwerp, Antwerpen
  2. Postdoctoral researcher of the Fund for Scientific Research, ISLAB, University of Antwerp, Flanders, Belgium
