Dimensionality Reduction via Genetic Value Clustering

  • Conference paper
Genetic and Evolutionary Computation — GECCO 2003 (GECCO 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2724)

Abstract

Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and machine learning applications. We present a family of genetic algorithms for feature synthesis through clustering of discrete attribute values. The approach uses a new, compact graph-based encoding for cluster representation, in which the size of the GA search space is reduced exponentially with respect to the number of items being partitioned, compared with the original idea of Park and Song. We apply the developed algorithms and study their effectiveness for DNA fingerprinting in population genetics and for text categorization.
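
To make the idea of clustering attribute values through a graph-based genotype concrete, here is a minimal sketch. It does not reproduce the compact encoding of the paper, which the abstract does not specify; it assumes a generic link-style representation in which gene i names another item j, each such pair is treated as an undirected edge, and the connected components of the resulting graph are the clusters. The names `decode_links` and `recode`, the allele labels, and the sample genotype are hypothetical and chosen only for illustration.

```python
def decode_links(links):
    """Decode a link genotype into one cluster label per item.

    Assumption: links[i] = j means item i is joined to item j by an
    undirected edge; connected components are the clusters.
    """
    parent = list(range(len(links)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels, ids = [], {}
    for i in range(len(links)):
        labels.append(ids.setdefault(find(i), len(ids)))
    return labels


def recode(column, values, labels):
    """Replace each attribute value by the label of its cluster."""
    value_to_cluster = dict(zip(values, labels))
    return [value_to_cluster[v] for v in column]


values = ["a1", "a2", "a3", "a4", "a5"]  # hypothetical allele names
genotype = [1, 0, 3, 2, 4]               # a1-a2 linked, a3-a4 linked, a5 alone
labels = decode_links(genotype)
print(labels)                                          # [0, 0, 1, 1, 2]
print(recode(["a1", "a4", "a5", "a2"], values, labels))  # [0, 1, 2, 0]
```

In this sketch the five original values collapse to three synthesized values, so a classifier trained on the recoded column sees a lower-dimensional attribute. A genetic algorithm would search over genotypes such as `genotype`, scoring each decoded partition by classification accuracy.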

References

  1. Devijver P.A. and Kittler J.: Pattern Recognition: A Statistical Approach. Prentice-Hall International, (1982)

  2. Jain A.K., Duin R.P. and Mao J.: Statistical pattern recognition: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, (2000) 4–37

  3. Freitas A.A.: Data Mining and Knowledge Discovery with Evolutionary Algorithms. Springer-Verlag, (2002)

  4. Kohavi R. and John G.: Wrappers for Feature Subset Selection. Artificial Intelligence Journal 97(1–2), (1997) 273–324

  5. Siedlecki W. and Sklansky J.: On automatic feature selection. International Journal of Pattern Recognition and Artificial Intelligence 2, (1988) 197–220

  6. Vafaie H. and De Jong K.: Robust feature selection algorithms. In Proc. of the 5th IEEE International Conference on Tools for Artificial Intelligence, Boston, MA, (1993) 356–363

  7. Whitley D., Beveridge R., Guerra C. and Graves C.: Messy Genetic Algorithms for Subset Feature Selection. International Conference on Genetic Algorithms. T. Baeck, ed. Morgan Kaufmann, (1997)

  8. Yang J. and Honavar V.: Feature Subset Selection Using a Genetic Algorithm. In: Feature Extraction, Construction, and Subset Selection: A Data Mining Perspective. Motoda, H. and Liu, H. (Eds.) New York, Kluwer, (1998)

  9. Punch W.F., Goodman E.D., Pei M., Chia-Shun L., Hovland P. and Enbody R.: Further Research on Feature Selection and Classification Using Genetic Algorithms. In Proc. 5th International Conference on Genetic Algorithms, Urbana-Champaign IL, (1993) 557–562

  10. Raymer M., Punch W., Goodman E., Sanschagrin P. and Kuhn L.: Simultaneous Feature Extraction and Selection using a Masking Genetic Algorithm. In Proc. of 7th International Conference on Genetic Algorithms (ICGA), San Francisco CA, (1997) 561–567

  11. Vafaie H. and De Jong K.: Feature Space Transformation Using Genetic Algorithms. IEEE Intelligent Systems 13(2), (1998) 57–65

  12. Lin C. and Wu J.: Automatic facial feature extraction by genetic algorithms. IEEE Trans. on Image Processing 8(6), (1999) 834–845

  13. Raymer M.L., Punch W.F., Goodman E.D., Kuhn L.A. and Jain A.K.: Dimensionality Reduction Using Genetic Algorithms. IEEE Trans. on Evolutionary Computation 4(2), (2000) 164–171

  14. Brumby S.P., Theiler J., Perkins S.J., Harvey N.R., Szymanski J.J., Bloch J.J. and Mitchell M.: Investigation of Feature Extraction by a Genetic Algorithm. Proc. SPIE 3812, (1999) 24–31

  15. Larsen O., Freitas A.A. and Nievola J.C.: Constructing X-of-N attributes with a genetic algorithm. In Proc. 4th Int. Conf. on Recent Advances in Soft Computing, (2002) 326–331

  16. Pudil P. and Novovicová J.: Feature Subset Selection Using a Genetic Algorithm in Feature Extraction. In: Huan Liu, Hiroshi Motoda (eds.): Construction and Selection: A Data Mining Perspective, Kluwer (1998)

  17. Martin-Bautista M. and Vila M.-A.: A survey of genetic feature selection in mining issues. In Proceedings of the Congress on Evolutionary Computation (CEC 99), (1999) 13–23

  18. Falkenauer E.: Genetic Algorithms and Grouping Problems. John Wiley & Sons Ltd., (1998)

  19. Park Y-J. and Song M-S.: A genetic algorithm for clustering problems. In Proc. 3rd Annual Conf. on Genetic Programming, (1998) 568–575

  20. Trunk G.V.: A problem of dimensionality: a simple example. IEEE Trans. Patt. Anal. Mach. Intell. 1, (1979) 306–307

  21. Minker J., Wilson G.A. and Zimmerman B.H.: An evaluation of query expansion by the addition of clustered terms for a document retrieval system. Information Storage and Retrieval 8(6), (1972) 329–348

  22. Sparck Jones K. and Jackson D.M.: The use of automatically-obtained keyword classifications for information retrieval. Information Processing and Management 5, (1970) 175–201

  23. Merzbacher M. and Chu W.W.: Pattern-based clustering for database attribute values. In Proc. of AAAI Workshop on Knowledge Discovery in Databases, Wash., D.C., (1993)

  24. Tishby N., Pereira F.C. and Bialek W.: The information bottleneck method. In Proc. of the 37th Annual Allerton Conference on Communication, Control and Computing, (1999) 368–377

  25. Slonim N. and Tishby N.: Agglomerative Information Bottleneck. In Advances in Neural Information Processing Systems (NIPS-12), MIT Press, (1999) 617–623

  26. Friedman N., Mosenzon O., Slonim N. and Tishby N.: Multivariate Information Bottleneck. In Proc. of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI), (2001)

  27. Slonim N., Friedman N. and Tishby N.: Agglomerative Multivariate Information Bottleneck. In Advances in Neural Information Processing Systems (NIPS-14), (2001)

  28. O’Connell J.R. and Weeks D.E.: The VITESSE algorithm for rapid exact multilocus linkage analysis via genotype set-recoding and fuzzy inheritance. Nature Genetics 11, (1995) 402–408

  29. Friedman N., Geiger D. and Lotner N.: Likelihood Computation with Value Abstraction. In Proc. Sixteenth Conf. on Uncertainty in Artificial Intelligence (UAI), (2000)

  30. Chartrand G. and Oellermann O.R.: Applied and Algorithmic Graph Theory. McGraw-Hill, Inc., New York (1993)

  31. Bollobás B.: Random Graphs. Academic Press, London, (1985)

  32. Waser P.M. and Strobeck C.: Genetic signatures of interpopulation dispersal. Trends Ecol Evol 13, (1998) 43–44

  33. Guinand B., Topchy A., Page K.S., Burnham-Curtis M.K., Punch W.F. and Scribner K.T.: Comparisons of likelihood and machine learning methods of individual classification. Journal of Heredity 93(4), (2002) 260–269

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Topchy, A., Punch, W. (2003). Dimensionality Reduction via Genetic Value Clustering. In: Cantú-Paz, E., et al. Genetic and Evolutionary Computation — GECCO 2003. GECCO 2003. Lecture Notes in Computer Science, vol 2724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45110-2_16

  • DOI: https://doi.org/10.1007/3-540-45110-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40603-7

  • Online ISBN: 978-3-540-45110-5

  • eBook Packages: Springer Book Archive
