Webpage Classification with ACO-Enhanced Fuzzy-Rough Feature Selection

  • Richard Jensen
  • Qiang Shen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4259)


Due to the explosive growth of electronically stored information, automatic methods must be developed to aid users in maintaining and using this abundance of information effectively. In particular, the sheer volume of redundancy present must be dealt with, leaving only the information-rich data to be processed. This paper presents an approach, based on an integrated use of fuzzy-rough sets and Ant Colony Optimization (ACO), to greatly reduce this data redundancy. The work is applied to the problem of webpage categorization, considerably reducing dimensionality with minimal loss of information.
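As a rough illustration of the kind of search the abstract describes, the sketch below pairs an ant-colony style subset search with a dependency measure. It is not the authors' implementation. For brevity it swaps the fuzzy-rough dependency for the simpler crisp rough-set dependency degree, and the names `dependency` and `aco_select`, the parameters `n_ants`, `n_iters`, and `rho`, and the pheromone update rule are all assumptions made for this example.

```python
import random
from collections import defaultdict


def dependency(rows, labels, subset):
    """Crisp rough-set dependency degree (a stand-in for the fuzzy-rough one):
    the fraction of rows whose values on `subset` determine the label."""
    if not subset:
        return 0.0
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in sorted(subset))
        seen_labels[key].add(label)
        counts[key] += 1
    # Rows in groups with a single label form the positive region.
    consistent = sum(counts[k] for k, s in seen_labels.items() if len(s) == 1)
    return consistent / len(rows)


def aco_select(rows, labels, n_ants=10, n_iters=20, rho=0.2, seed=0):
    """Search for a small subset whose dependency equals that of all features.
    Hypothetical parameters: rho is the pheromone evaporation rate."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    all_features = set(range(n_features))
    target = dependency(rows, labels, all_features)
    pheromone = [1.0] * n_features
    best = set(all_features)
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            subset = set()
            # Each ant adds features until it reaches the target dependency.
            while dependency(rows, labels, subset) < target:
                candidates = sorted(all_features - subset)
                # Pheromone times heuristic (dependency after adding f).
                weights = [
                    pheromone[f]
                    * (dependency(rows, labels, subset | {f}) + 1e-6)
                    for f in candidates
                ]
                subset.add(rng.choices(candidates, weights=weights)[0])
            tours.append(subset)
            if len(subset) < len(best):
                best = set(subset)
        # Evaporate, then reward features in proportion to subset compactness.
        pheromone = [(1 - rho) * p for p in pheromone]
        for subset in tours:
            for f in subset:
                pheromone[f] += 1.0 / len(subset)
    return best
```

Given rows of discrete feature values and a list of class labels, `aco_select(rows, labels)` returns a set of column indices. In a webpage setting the columns would presumably correspond to term features, but that mapping is outside this sketch.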


Keywords: Feature Selection, Feature Subset, Swarm Intelligence, Optimal Feature Subset, Good Feature Subset





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Richard Jensen (1)
  • Qiang Shen (1)
  1. Department of Computer Science, The University of Wales, Aberystwyth
