An Efficient Feature Selection Algorithm Based on Hybrid Clonal Selection Genetic Strategy for Text Categorization

  • Jiansheng Jiang
  • Wanneng Shu
  • Huixia Jin
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 56)

Abstract

Feature selection is commonly used to reduce the dimensionality of datasets with thousands of features, which would otherwise be impractical to process. Many methods for text feature selection already exist. To improve the performance of text categorization, we present a new feature selection algorithm, the hybrid clonal selection genetic algorithm (HCSGA). Experiments comparing HCSGA with an extensive and representative list of feature selection algorithms show that HCSGA yields a considerable increase in classification accuracy and runs faster than the existing feature selection algorithms.
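
The abstract does not describe the internals of HCSGA, so the following is only a minimal sketch of how a clonal selection step and a genetic crossover step might be combined to search for a feature subset. It is not the authors' algorithm. All names (`select_features`, `toy_fitness`, `RELEVANT`), the parameter values, and the toy fitness function are hypothetical; in a real text categorization setting the fitness would more plausibly be a classifier's validation accuracy on the selected terms.

```python
import random

# Hypothetical toy problem: features 0, 3 and 7 are the informative ones.
RELEVANT = {0, 3, 7}

def toy_fitness(mask):
    """Reward relevant features that are kept, penalize subset size (assumed scoring)."""
    kept = [i for i, bit in enumerate(mask) if bit]
    return sum(i in RELEVANT for i in kept) - 0.1 * len(kept)

def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]

def crossover(a, b, rng):
    """Single-point crossover of two equal-length bit masks."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def select_features(n_features, fitness, pop_size=20, generations=30,
                    n_clones=3, seed=0):
    rng = random.Random(seed)
    # Each individual is a bit mask: mask[i] == 1 means feature i is kept.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        # Clonal selection step: clone each elite mask and mutate the clones,
        # using a lower mutation rate for better-ranked parents.
        matured = []
        for rank, parent in enumerate(elite):
            rate = (rank + 1) / (4 * len(elite))
            clones = [mutate(parent, rate, rng) for _ in range(n_clones)]
            # Keep the parent unless one of its clones scores higher.
            matured.append(max(clones + [parent], key=fitness))
        # Genetic step: refill the population by recombining matured masks.
        children = [crossover(*rng.sample(matured, 2), rng)
                    for _ in range(pop_size - len(matured))]
        pop = matured + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = select_features(12, toy_fitness)
    print("selected features:", [i for i, bit in enumerate(best) if bit])
```

Because each parent is retained unless a clone improves on it, the best fitness in the population never decreases from one generation to the next in this sketch. The fixed seed makes repeated runs reproducible.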

Keywords

Text categorization; Feature selection; Feature extraction; Hybrid clonal selection genetic algorithm


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jiansheng Jiang (1)
  • Wanneng Shu (2)
  • Huixia Jin (3)

  1. Faculty of Mechanical and Electronic Engineering, China University of Petroleum Beijing, Beijing, China
  2. College of Computer Science, South-Central University for Nationalities, Wuhan, China
  3. Department of Physics and Telecom Engineering, Hunan City University, Yiyang, China
