On Combining Fractal Dimension with GA for Feature Subset Selecting

  • GuangHui Yan
  • ZhanHuai Li
  • Liu Yuan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4293)

Abstract

Selecting a set of features that is optimal for a given task plays an important role in a wide variety of contexts, including pattern recognition, adaptive control, and machine learning. Exploiting fractal dimension to reduce the features of a dataset is a recent approach. FDR (Fractal Dimensionality Reduction), proposed by Traina in 2000, is the best-known fractal-dimension-based feature selection algorithm. However, it is intractable in high-dimensional data spaces because it scans the dataset multiple times, and it cannot eliminate two or more features simultaneously. In this paper we combine a genetic algorithm (GA) with the Z-ordering-based FDR to address this problem and present a new algorithm, GAZBFDR (Genetic Algorithm and Z-ordering Based FDR). The proposed algorithm directly selects a fixed number of features from the feature space and uses the variation in fractal dimension to evaluate the selected features within the comparatively lower-dimensional space. Experimental results show that GAZBFDR achieves better performance on high-dimensional datasets.
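The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GAZBFDR implementation: the abstract gives no algorithmic detail, so everything here is an assumption. The fractal dimension is estimated as the correlation dimension by plain box counting over a hash map of grid cells (standing in for the Z-ordering index), the cost of a subset is assumed to be the absolute change in that estimate relative to the full dataset, and the function names, GA operators, and parameters (`correlation_dimension`, `ga_select`, `pop_size`, the 0.3 mutation rate) are hypothetical. Data are assumed to be scaled to [0, 1].

```python
import math
import random
from collections import Counter

def project(points, subset):
    """Keep only the feature columns listed in `subset`."""
    return [tuple(p[i] for i in subset) for p in points]

def correlation_dimension(points, levels=(1, 2, 3, 4)):
    """Box-counting estimate: least-squares slope of log(sum of squared
    cell counts) against log(cell side). Assumes values in [0, 1]."""
    xs, ys = [], []
    for level in levels:
        n = 2 ** level  # cells per axis, so the cell side is 1 / n
        counts = Counter(
            tuple(min(int(v * n), n - 1) for v in p) for p in points
        )
        xs.append(math.log(1.0 / n))
        ys.append(math.log(sum(c * c for c in counts.values())))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def ga_select(data, k, pop_size=8, generations=10, seed=0):
    """Toy GA over fixed-size feature subsets (sorted index tuples)."""
    rng = random.Random(seed)
    n_features = len(data[0])
    target = correlation_dimension(data)
    cache = {}

    def cost(subset):
        # Assumed fitness: how far the subset's dimension drifts from the full set's.
        if subset not in cache:
            d = correlation_dimension(project(data, subset))
            cache[subset] = abs(target - d)
        return cache[subset]

    def crossover(a, b):
        # Child draws k features from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, k)))

    def mutate(s):
        # With probability 0.3, swap one selected feature for an unused one.
        unused = [f for f in range(n_features) if f not in s]
        if unused and rng.random() < 0.3:
            kept = list(s)
            kept[rng.randrange(k)] = rng.choice(unused)
            return tuple(sorted(kept))
        return s

    population = [tuple(sorted(rng.sample(range(n_features), k)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(population, key=cost)]  # elitism: carry over the best
        while len(children) < pop_size:
            p1 = min(rng.sample(population, 2), key=cost)  # tournament of two
            p2 = min(rng.sample(population, 2), key=cost)
            children.append(mutate(crossover(p1, p2)))
        population = children
    return min(population, key=cost)

# Toy data: four features, but only two independent ones (t and u).
_rng = random.Random(1)
data = []
for _ in range(2000):
    t, u = _rng.random(), _rng.random()
    data.append((t, 1.0 - t, u, 0.5 * u + 0.25))

best = ga_select(data, k=2)
print(best, correlation_dimension(project(data, best)))
```

On this toy dataset, features 0 and 1 are redundant with each other, as are features 2 and 3, so a subset that keeps one feature from each pair should preserve the estimated dimension of the full dataset, while a subset drawn from a single pair collapses it to roughly 1. Selecting a fixed number of features in one step is what distinguishes this style of search from removing one feature at a time.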

Keywords

Genetic Algorithm; Fractal Dimension; Feature Selection; Feature Space; Feature Subset

References

  1. Baeza-Yates, R., Navarro, G.: Block-addressing indices for approximate text retrieval. In: Golshani, F., Makki, K. (eds.) Proc. of the 6th Int’l Conf. on Information and Knowledge Management, pp. 1–8. ACM Press, New York (1997)
  2. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Lomet, D.B. (ed.) Proc. of the 4th Int’l Conf. on Foundations of Data Organization and Algorithms, pp. 69–84. Springer, Berlin (1993)
  3. Jiang, D., Tang, C., Zhang, A.: Cluster Analysis for Gene Expression Data: A Survey. IEEE Transactions on Knowledge and Data Engineering 16(11), 1370–1386 (2004)
  4. Schena, M.D., Shalon, R., Davis, R., Brown, P.: Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray. Science 270, 467–470 (1995)
  5. Aha, D.W., Bankert, R.L.: A Comparative Evaluation of Sequential Feature Selection Algorithms. In: Artificial Intelligence and Statistics V, pp. 199–206. Springer, New York (1996)
  6. Scherf, M., Brauer, W.: Feature Selection by Means of a Feature Weighting Approach. Technische Universität München, Munich (1997)
  7. Blum, A., Langley, P.: Selection of Relevant Features and Examples in Machine Learning. AI 97, 245–271 (1997)
  8. Francesco, C.: Data dimensionality estimation methods: a survey. Pattern Recognition 36, 2945–2954 (2003)
  9. Vafaie, H., Jong, K.A.D.: Robust Feature Selection Algorithms. In: Intl. Conf. on Tools with AI, Boston, MA (1993)
  10. Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. In: Koza, J., et al. (eds.) Proceedings of the Second Annual Conference, Stanford University, CA, USA (1997)
  11. Traina Jr., C., Traina, A., et al.: Fast feature selection using fractal dimension. In: XV Brazilian DB Symposium, João Pessoa-PA-Brazil, pp. 158–171 (2000)
  12. Bao, Y., Yu, G., Sun, H., Wang, D.: Performance Optimization of Fractal Dimension Based Feature Selection Algorithm. In: Li, Q., Wang, G., Feng, L. (eds.) WAIM 2004. LNCS, vol. 3129, pp. 739–744. Springer, Heidelberg (2004)
  13. Liebovitch, L., Toth, T.: A Fast Algorithm to Determine Fractal Dimensions by Box Counting. Physics Letters 141A(8), 386–390 (1989)
  14. Orenstein, J., Merrett, T.H.: A class of data structures for associative searching. In: Proceedings of the Third ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, pp. 181–190 (1984)
  15. Sarraille, J., DiFalco, P.: FD3, http://tori.postech.ac.kr/softwares/
  16. De Jong, K.: Learning with Genetic Algorithms: An overview. Machine Learning 3, 121–138 (1988)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • GuangHui Yan (1, 2)
  • ZhanHuai Li (1)
  • Liu Yuan (1)

  1. Dept. of Computer Science & Software, Northwestern Polytechnical University, Xi'an, P.R. China
  2. Key Laboratory of Opto-Electronic Technology and Intelligent Control (Lanzhou Jiaotong University), Ministry of Education, Lanzhou