Detection of interdependences in attribute selection

  • Javier Lorenzo
  • Mario Hernández
  • Juan Méndez
Communications Session 8. Attribute Selection
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1510)


A new measure for attribute selection, called GD, is proposed. The GD measure is based on Information Theory and makes it possible to detect interdependence between attributes. It is built on a quadratic form of the Mántaras distance and a matrix called the Transinformation Matrix. To test the quality of the proposed measure, it is compared with two other feature selection methods, namely the Mántaras distance and Relief algorithms. The comparison is carried out over 19 datasets with three different induction algorithms.
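The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It assumes discrete attributes, the normalized form of the Mántaras distance, 1 - I(X;Y)/H(X,Y), and a Transinformation Matrix whose entries are pairwise mutual informations between attributes. The function names are hypothetical, and the abstract does not state how GD combines the two quantities in its quadratic form, so that step is deliberately left out.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more equal-length discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def mantaras_distance(x, y):
    """Normalized Mantaras distance: 1 - I(X;Y) / H(X,Y), a value in [0, 1]."""
    joint = entropy(x, y)
    if joint == 0.0:
        # Both columns are constant; treat them as identical partitions.
        return 0.0
    return 1.0 - mutual_information(x, y) / joint


def transinformation_matrix(attributes):
    """Assumed layout: entry (i, j) is the mutual information of attributes i and j."""
    return [[mutual_information(a, b) for b in attributes] for a in attributes]


# Toy data: x determines the class exactly, z is independent of it.
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
label = [0, 0, 1, 1]

distances = [mantaras_distance(a, label) for a in (x, z)]
matrix = transinformation_matrix([x, z])
```

In this toy example the attribute that reproduces the class has distance 0 and the independent one has distance 1, while the off-diagonal entries of the matrix are 0 because the two attributes share no information.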


Keywords: Feature Selection · Mutual Information · Attribute Selection · Gain Ratio · Induction Algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Javier Lorenzo (1)
  • Mario Hernández (1)
  • Juan Méndez (1)

  1. Dpto. de Informática y Sistemas, Univ. de Las Palmas de Gran Canaria, Las Palmas, Spain
