Abstract
Clearly, the quality of discovered knowledge depends strongly on the quality of the data being mined. This has motivated the development of several algorithms for data preparation tasks, as discussed in Chapter 4.
“In reality, the boundary between pre-processor and classifier is arbitrary. If the pre-processor generated the predicted class label as a feature [attribute], then the classifier would be trivial. Similarly, the pre-processor could be trivial and the classifier do all the work.” [Sherrah et al. 1997, p. 305]
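The point of the quotation can be illustrated with a toy sketch. The code below is not taken from Sherrah et al.; every function name and the decision rule `x1 + x2 > 1` are assumptions chosen only for illustration. It builds the same two-stage pipeline twice: once with the pre-processor emitting the class label as its only feature, and once with an identity pre-processor and a classifier that does all the work.

```python
# Toy illustration only: all names and the decision rule are hypothetical.

def assumed_rule(x):
    # Assumed target concept for the toy data: class 1 when x1 + x2 > 1.
    return 1 if x[0] + x[1] > 1.0 else 0

# Split A: the pre-processor computes the class label and emits it as a feature.
def preprocess_heavy(x):
    return [assumed_rule(x)]

def classify_trivial(features):
    # The classifier only reads off the single constructed feature.
    return features[0]

# Split B: the pre-processor passes the raw attributes through unchanged.
def preprocess_trivial(x):
    return list(x)

def classify_heavy(features):
    # The classifier applies the decision rule to the raw attributes itself.
    return 1 if features[0] + features[1] > 1.0 else 0

def pipeline(pre, clf, x):
    # A pipeline is the composition classifier(pre-processor(x)).
    return clf(pre(x))

samples = [(0.2, 0.3), (0.9, 0.4), (0.5, 0.5), (1.0, 0.1)]
labels_a = [pipeline(preprocess_heavy, classify_trivial, x) for x in samples]
labels_b = [pipeline(preprocess_trivial, classify_heavy, x) for x in samples]
print(labels_a == labels_b)
```

Both splits produce the same predictions on these samples, so where the work is placed is a design choice rather than a property of the task.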
References
J. Bala, K. De Jong, J. Huang, H. Vafaie and H. Wechsler. Hybrid learning using genetic algorithms and decision trees for pattern classification. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI ’95), 719–724. 1995.
J. Bala, K. De Jong, J. Huang, H. Vafaie and H. Wechsler. Using learning to facilitate the evolution of features for recognizing visual concepts. Evolutionary Computation 4(3), 297–312, 1996.
K. Chen and H. Liu. Towards an evolutionary algorithm: a comparison of two feature selection algorithms. Proceedings of the Congress on Evolutionary Computation (CEC ’99), 1309–1313. Washington D.C., 1999.
S. Chen, C. Guerra-Salcedo, and S.F. Smith. Non-standard crossover for a standard representation — commonality-based feature subset selection. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’99), 129–134. Morgan Kaufmann, 1999.
K.J. Cherkauer and J.W. Shavlik. Growing simpler decision trees to facilitate knowledge discovery. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD ’96), 315–318. AAAI Press, 1996.
C. Emmanouilidis, A. Hunter and J. MacIntyre. A multiobjective evolutionary setting for feature selection and a commonality-based crossover operator. Proceedings of the 2000 Congress on Evolutionary Computation (CEC ’2000), 309–316. IEEE, 2000.
A.A. Freitas. The principle of transformation between efficiency and effectiveness: towards a fair evaluation of the cost-effectiveness of KDD techniques. Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD ’97). Lecture Notes in Artificial Intelligence 1263, 299–306. Springer, 1997.
A.A. Freitas. A survey of evolutionary algorithms for data mining and knowledge discovery. To appear in: A. Ghosh and S. Tsutsui (Eds.) Advances in Evolutionary Computation. Springer, 2002.
D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.
C. Guerra-Salcedo and D. Whitley. Genetic search for feature subset selection: a comparison between CHC and GENESIS. Genetic Programming 1998: Proceedings of the 3rd Annual Conference, 504–509. Morgan Kaufmann, 1998.
C. Guerra-Salcedo and D. Whitley. Genetic approach to feature selection for ensemble creation. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’99), 236–243. Morgan Kaufmann, 1999.
C. Guerra-Salcedo and D. Whitley. Feature selection mechanisms for ensemble creation: a genetic search perspective. In: A.A. Freitas (Ed.) Data Mining with Evolutionary Algorithms: Research Directions — Papers from the AAAI ’99/GECCO ’99 Workshop. Technical Report WS-99–06, 13–17. AAAI Press, 1999.
C. Guerra-Salcedo, S. Chen, D. Whitley, and S. Smith. Fast and accurate feature selection using hybrid genetic strategies. Proceedings of the Congress on Evolutionary Computation (CEC ’99), 177–184. Washington D.C., USA. 1999.
Y-J. Hu. A genetic programming approach to constructive induction. Genetic Programming 1998: Proceedings of the 3rd Annual Conference, 146–151. Morgan Kaufmann, 1998.
Y-J. Hu and D. Kibler. Generation of attributes for learning algorithms. Proceedings of the 1996 National Conference on Artificial Intelligence (AAAI ’96), 806–811. AAAI Press, 1996.
H. Ishibuchi and T. Nakashima. Multi-objective pattern and feature selection by a genetic algorithm. Proceedings of the 2000 Genetic and Evolutionary Computation Conference (GECCO ‘2000), 1069–1076. Morgan Kaufmann, 2000.
Y. Kim, W.N. Street and F. Menczer. Feature selection in unsupervised learning via evolutionary search. Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘2000), 365–369. ACM, 2000.
M. Kudo and J. Sklansky. Comparison of algorithms that select features for pattern classifiers. Pattern Recognition 33, 25–41, 2000.
I. Kuscu. Evolution of learning rules for hard learning problems. Proceedings of the 5th Annual Evolutionary Programming Conference. MIT Press, 1996.
I. Kuscu. A genetic constructive induction model. Proceedings of the Congress on Evolutionary Computation (CEC ’99), 212–217. Washington D.C., 1999.
I. Kuscu. Generalisation and domain specific functions in genetic programming. Proceedings of the Congress on Evolutionary Computation (CEC ‘2000), 1393–1400. IEEE, 2000.
X. Llora and J.M. Garrell. Inducing partially-defined instances with evolutionary algorithms. Proceedings of the 18th International Conference on Machine Learning (ICML ‘2001), 337–344. Morgan Kaufmann, 2001.
M.J. Martin-Bautista and M.-A. Vila. A survey of genetic feature selection in mining issues. Proceedings of the Congress on Evolutionary Computation (CEC ’99), 1314–1321. IEEE, 1999.
F. Menczer, M. Degeratu and W.N. Street. Efficient and scalable Pareto optimization by evolutionary local selection algorithms. Evolutionary Computation 8(2), 223–247, 2000.
A. Moser and M.N. Murty. On the scalability of genetic algorithms to very large-scale feature selection. Proceedings of the Real-World Applications of Evolutionary Computing (EvoWorkshops 2000). Lecture Notes in Computer Science 1803, 77–86. Springer, 2000.
W.F. Punch, E.D. Goodman, M. Pei, L. Chia-Shun, P. Hovland and R. Enbody. Further research on feature selection and classification using genetic algorithms. Proceedings of the 5th International Conference on Genetic Algorithms (ICGA ’93), 557–564. 1993.
J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, 1993.
H. Ragavan, L. Rendell, M. Shaw and A. Tessmer. Complex concept acquisition through direct search and feature caching. Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI ’93), 946–951. 1993.
M.L. Raymer, W.F. Punch, E.D. Goodman and L.A. Kuhn. Genetic programming for improved data mining - application to the biochemistry of protein interactions. Genetic Programming 1996: Proceedings of the 1st Annual Conference, 375–380. Morgan Kaufmann, 1996.
M.L. Raymer, W.F. Punch, E.D. Goodman, P.C. Sanschagrin and L.A. Kuhn. Simultaneous feature scaling and selection using a genetic algorithm. Proceedings of the 7th International Conference on Genetic Algorithms (ICGA ’97), 561–567. Morgan Kaufmann, 1997.
M.L. Raymer, W.F. Punch, E.D. Goodman, L.A. Kuhn and A.K. Jain. Dimensionality reduction using genetic algorithms. IEEE Transactions on Evolutionary Computation 4(2), 164–171, 2000.
P.K. Sharpe and R.P. Glover. Efficient GA based techniques for classification. Applied Intelligence 11, 277–284, 1999.
J.R. Sherrah, R.E. Bogner and A. Bouzerdoum. The evolutionary pre-processor: automatic feature extraction for supervised classification using genetic programming. Genetic Programming 1997: Proceedings of the 2nd Annual Conference (GP ’97), 304–312. Morgan Kaufmann, 1997.
T. Terano and Y. Ishino. Interactive genetic algorithm based feature selection and its application to marketing data analysis. In: H. Liu and H. Motoda (Eds.) Feature Extraction, Construction and Selection, 393–406. Kluwer, 1998.
S. Thompson. Pruning boosted classifiers with a real valued genetic algorithm. Research and Develop. in Expert Systems XV — Proceedings of ES’98, 133–146. Springer, 1998.
S. Thompson. Genetic algorithms as postprocessors for data mining. In: A.A. Freitas (Ed.) Data Mining with Evolutionary Algorithms: Research Directions — Papers from the AAAI Workshop, 18–22. Technical Report WS-99–06. AAAI Press, 1999.
H. Vafaie and K. DeJong. Evolutionary Feature Space Transformation. In: H. Liu and H. Motoda (Eds.) Feature Extraction, Construction and Selection, 307–323. Kluwer, 1998.
K. Wang and S. Sundaresh. Selecting features by vertical compactness of data. In: H. Liu and H. Motoda (Eds.) Feature Extraction, Construction and Selection, 71–84. Kluwer, 1998.
J. Yang and V. Honavar. Feature subset selection using a genetic algorithm. Genetic Programming 1997: Proceedings of the 2nd Annual Conference (GP ’97), 380–385. Morgan Kaufmann, 1997.
J. Yang and V. Honavar. Feature subset selection using a genetic algorithm. In: H. Liu and H. Motoda (Eds.) Feature Extraction, Construction and Selection, 117–136. Kluwer, 1998.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Freitas, A.A. (2002). Evolutionary Algorithms for Data Preparation. In: Data Mining and Knowledge Discovery with Evolutionary Algorithms. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04923-5_9
DOI: https://doi.org/10.1007/978-3-662-04923-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07763-0
Online ISBN: 978-3-662-04923-5
eBook Packages: Springer Book Archive