Web Page Classification with an Ant Colony Algorithm

  • Conference paper
Parallel Problem Solving from Nature - PPSN VIII (PPSN 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3242)

Abstract

This paper applies Ant-Miner, the first Ant Colony algorithm for discovering classification rules, to web content mining, and shows that it is more effective than C5.0 on the two sets of BBC and Yahoo web pages used in our experiments. It also investigates the benefits and dangers of several linguistics-based text preprocessing techniques for reducing the large number of attributes associated with web content mining.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holden, N., Freitas, A.A. (2004). Web Page Classification with an Ant Colony Algorithm. In: Yao, X., et al. Parallel Problem Solving from Nature - PPSN VIII. PPSN 2004. Lecture Notes in Computer Science, vol 3242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30217-9_110

  • DOI: https://doi.org/10.1007/978-3-540-30217-9_110

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23092-2

  • Online ISBN: 978-3-540-30217-9

  • eBook Packages: Springer Book Archive
