Sifting the Margin – An Iterative Empirical Classification Scheme

  • Dan Vance
  • Anca Ralescu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3157)

Abstract

Attribute or feature selection is an important step in designing a classifier. It often reduces to a choice between computationally simple schemes, based on a small subset of attributes, that do not search the attribute space, and more complex schemes, based on a large subset or the entire set of available attributes, that are computationally intractable. Usually a compromise is reached: a computationally tractable scheme that relies on a subset of attributes optimizing a certain criterion. The result is typically a 'good' but sub-optimal solution that may still require a fair amount of computation. This paper presents an approach that does not commit to any particular subset of the available attributes. Instead, the classifier uses each attribute successively, as needed, to classify a given data point. If the data set is separable in the given attribute space, the algorithm classifies a given point with no errors. The resulting classifier is transparent, and the approach compares favorably with previous approaches in both accuracy and efficiency.
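The abstract's idea of consulting attributes one at a time, rather than committing to a fixed subset, can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the interval-based sifting rule, the function name `classify_sequential`, and the nearest-neighbour fallback are all illustrative assumptions.

```python
def classify_sequential(train, labels, x):
    """Sift candidate classes one attribute at a time: keep only the
    classes whose per-attribute range over the training data contains
    the query value, and stop as soon as a single class remains."""
    classes = set(labels)
    for a in range(len(x)):
        surviving = set()
        for c in classes:
            vals = [row[a] for row, y in zip(train, labels) if y == c]
            # Class c stays a candidate only if x's value on attribute a
            # falls inside c's observed range on that attribute.
            if min(vals) <= x[a] <= max(vals):
                surviving.add(c)
        if len(surviving) == 1:
            return surviving.pop()  # unambiguous: later attributes unused
        if surviving:
            classes = surviving
    # Still ambiguous after all attributes: fall back to the nearest
    # training point among the remaining classes (an assumption here,
    # not something taken from the paper).
    best = min((row for row, y in zip(train, labels) if y in classes),
               key=lambda r: sum((r[i] - x[i]) ** 2 for i in range(len(x))))
    return labels[train.index(best)]
```

On data that is separable attribute by attribute, such a procedure can often decide after examining only the first attribute, which is the efficiency argument the abstract hints at.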



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Dan Vance¹
  • Anca Ralescu¹
  1. ECECS Department, University of Cincinnati, Cincinnati, USA
