Feature Subset Selection within a Simulated Annealing Data Mining Algorithm

Published in: Journal of Intelligent Information Systems

Abstract

An overview of the principal feature subset selection methods is given. We investigate a number of measures of feature subset quality, using large commercial databases. We develop an entropic measure, based upon the information gain approach used within ID3 and C4.5 to build trees, which is shown to give the best performance over our databases. This measure is used within a simple feature subset selection algorithm and the technique is used to generate subsets of high quality features from the databases. A simulated annealing based data mining technique is presented and applied to the databases. The performance using all features is compared to that achieved using the subset selected by our algorithm. We show that a substantial reduction in the number of features may be achieved together with an improvement in the performance of our data mining system. We also present a modification of the data mining algorithm, which allows it to simultaneously search for promising feature subsets and high quality rules. The effect of varying the generality level of the desired pattern is also investigated.
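The abstract outlines two reusable ingredients: an entropy-based (information gain) score for rating feature subsets, in the style of ID3/C4.5 split selection, and a simulated annealing search over the space of subsets. The Python sketch below illustrates how those pieces can fit together; it is an assumption-laden reconstruction, not the authors' algorithm. In particular, subset_quality (mean per-feature information gain), the single-feature flip move, and the cooling parameters are hypothetical choices made for the example.

```python
import math
import random


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction from partitioning the data on one categorical
    feature, as in ID3/C4.5 split selection."""
    n = len(labels)
    partitions = {}
    for row, y in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(y)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def subset_quality(rows, labels, subset):
    """Hypothetical subset score: mean information gain of the chosen
    features (a stand-in for the paper's entropic measure)."""
    if not subset:
        return 0.0
    return sum(information_gain(rows, labels, f) for f in subset) / len(subset)


def anneal_feature_subsets(rows, labels, n_features,
                           t0=1.0, cooling=0.95, steps=1000, seed=0):
    """Simulated annealing over feature subsets: flip one feature per move
    and accept worse subsets with probability exp(delta / T)."""
    rng = random.Random(seed)
    current = set(rng.sample(range(n_features), n_features // 2))
    score = subset_quality(rows, labels, current)
    best, best_score = set(current), score
    t = t0
    for _ in range(steps):
        candidate = set(current)
        # Flip one randomly chosen feature in or out of the subset.
        candidate.symmetric_difference_update({rng.randrange(n_features)})
        cand_score = subset_quality(rows, labels, candidate)
        delta = cand_score - score
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, score = candidate, cand_score
            if score > best_score:
                best, best_score = set(current), score
        t *= cooling  # geometric cooling schedule
    return best, best_score


if __name__ == "__main__":
    # Toy categorical data: the class depends on features 0 and 1 only,
    # so a good subset should exclude the pure-noise feature 3.
    rows = [(x % 2, x % 3, x % 5, random.randrange(2)) for x in range(200)]
    labels = [int(r[0] == 1 or r[1] == 0) for r in rows]
    print(anneal_feature_subsets(rows, labels, n_features=4))
```

In the paper's modified algorithm the annealer searches for rules and feature subsets simultaneously; the score above is deliberately simplified so the sketch stays self-contained.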

References

  • ANGOSS: 1994, KnowledgeSEEKER 3.11.05 on-line help.

  • ANGOSS: 1995, KnowledgeSEEKER for Windows Version 3.0 User's Guide, Toronto, Canada.

  • Biggs, D., de Ville, B. and Suen, E.: 1991, A method of choosing multiway partitions for classification and decision trees, Journal of Applied Statistics 18(1), 49–62.

  • Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J.: 1984, Classification and Regression Trees, Wadsworth and Brooks, Monterey, CA.

  • Chang, C. Y.: 1973, Dynamic programming as applied to feature selection in pattern recognition systems, IEEE Trans. Syst. Man and Cybernet. 3, 166–171.

  • Chardaire, P., Lutton, J. L. and Sutter, A.: 1995, Thermostatistical persistency: A powerful improving concept for simulated annealing algorithms, European Journal of Operational Research 86.

  • Cover, T. M. and Van Campenhout, J. M.: 1977, On the possible orderings in the measurement selection problem, IEEE Trans. Sys. Man Cybern. 7, 651–661.

  • de la Iglesia, B., Debuse, J. C. W. and Rayward-Smith, V. J.: 1996, Discovering knowledge in commercial databases using modern heuristic techniques, in E. Simoudis, J. W. Han and U. Fayyad (eds), Proc. of the Second Int. Conf. on Knowledge Discovery and Data Mining (KDD-96), pp. 44–49.

  • de Ville, B.: 1990, Applying statistical knowledge to database analysis and knowledge base construction, Proceedings of the Sixth IEEE Conference on Artificial Intelligence Applications, IEEE Computer Society, Washington.

  • Devijver, P. A. and Kittler, J.: 1982, Pattern Recognition: a Statistical Approach, Prentice-Hall International, London.

  • Foroutan, I. and Sklansky, J.: 1987, Feature selection for automatic classification of non-Gaussian data, IEEE Trans. Sys. Man Cybern. 17, 187–198.

  • John, G. H., Kohavi, R. and Pfleger, K.: 1994, Irrelevant features and the subset selection problem, in W. W. Cohen and H. Hirsh (eds), Machine Learning: Proceedings of the Eleventh International Conference, Morgan Kaufmann, San Francisco, pp. 121–129.

  • Kass, G. V.: 1980, An exploratory technique for investigating large quantities of categorical data, Appl. Statist. 29, 119–127.

  • Kohavi, R. and Sommerfield, D.: 1995, Feature subset selection using the wrapper method: overfitting and dynamic search space topology, in U. M. Fayyad and R. Uthurusamy (eds), Proc. of the First Int. Conf. on Knowledge Discovery and Data Mining, AAAI Press, pp. 192–197.

  • Koller, D. and Sahami, M.: 1996, Toward optimal feature selection, in (Saitta, 1996).

  • Liu, H. and Setiono, R.: 1996a, Feature selection and classification: a probabilistic wrapper approach, Proc. of the 9th Int. Conf. on Industrial and Engineering Applications of AI and Expert Systems, pp. 419–424.

  • Liu, H. and Setiono, R.: 1996b, A probabilistic approach to feature selection: a filter solution, in (Saitta, 1996).

  • Lundy, M. and Mees, A.: 1986, Convergence of an annealing algorithm, Mathematical Programming 34, 111–124.

  • Mann, J. W.: 1995, X-SAmson v1.0 user manual, School of Information Systems, University of East Anglia.

  • Marill, T. and Green, D. M.: 1963, On the effectiveness of receptors in recognition systems, IEEE Trans. Inform. Theory 9, 11–17.

  • Michael, M. and Lin, W. C.: 1973, Experimental study of information measures and inter-intra class distance ratios on feature selection and ordering, IEEE Trans. Systems Man Cybernet. 3, 172–181.

  • Narendra, P. M. and Fukunaga, K.: 1977, A branch and bound algorithm for feature subset selection, IEEE Transactions on Computers, pp. 917–922.

  • Pei, M., Goodman, E. D., Punch, W. F. and Ding, Y.: 1995, Genetic algorithms for classification and feature extraction, Proc. of the Classification Society Conf.

  • Quinlan, J. R.: 1983, Learning efficient classification procedures and their application to chess end games, in R. S. Michalski, J. G. Carbonell and T. M. Mitchell (eds), Machine Learning: An Artificial Intelligence Approach, Morgan Kaufmann, San Mateo, CA.

  • Quinlan, J. R.: 1986, Induction of decision trees, Machine Learning 1.

  • Quinlan, J. R.: 1993, C4.5: Programs for Machine Learning, Morgan Kaufmann.

  • Rauber, T. W.: 1996, The tooldiag package. Available electronically from: http://www.uninova.pt/~tr/home/tooldiag.html.

  • Rayward-Smith, V. J., Debuse, J. C. W. and de la Iglesia, B.: 1995, Using a genetic algorithm to data mine in the financial services sector, inA. Macintosh and C. Cooper (eds), Applications and Innovations in Expert Systems III, SGES Publications, pp. 237–252.

  • Saitta, L. (ed.): 1996, Proc. of the 13th Int. Conf. on Machine Learning.

  • Shannon, C. E. and Weaver, W.: 1963, The Mathematical Theory of Communication, University of Illinois Press, Urbana.

  • Thrun, S., Bala, J., Bloedorn, E., Bratko, I., Cestnik, B., Cheng, J., De Jong, K., Dzeroski, S., Fahlman, S. E., Fisher, D., Hamman, R., Kaufman, K., Keller, I., Kononenko, I., Kreuziger, J., Michalski, R. S., Mitchell, T., Pachowicz, P., Reich, Y., Vafaie, H., Van de Welde, W., Wenzel, W., Wnek, J. and Zhang, J.: 1991, The MONK's problems: a performance comparison of different learning algorithms, Technical Report CMU-CS-91-197, Carnegie Mellon University.

  • Weiss, S. M. and Kulikowski, C. A.: 1991, Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning and Expert Systems, Morgan Kaufmann, San Francisco.

  • Whitney, A.: 1971, A direct method of nonparametric measurement selection, IEEE Trans. Comput. 20, 1100–1103.

About this article

Cite this article

Debuse, J.C., Rayward-Smith, V.J. Feature Subset Selection within a Simulated Annealing Data Mining Algorithm. Journal of Intelligent Information Systems 9, 57–81 (1997). https://doi.org/10.1023/A:1008641220268
