Abstract
This paper applies Ant-Miner, the first Ant Colony algorithm for discovering classification rules, to web content mining, and shows that it is more effective than C5.0 on the two sets of BBC and Yahoo web pages used in our experiments. It also investigates the benefits and dangers of several linguistics-based text preprocessing techniques for reducing the large number of attributes associated with web content mining.
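To illustrate why preprocessing matters for attribute reduction, here is a minimal sketch in which each distinct word in a set of pages becomes one attribute, and stop-word removal plus a crude suffix stripper shrink that attribute set. It is not the paper's method: the stop-word list, the `naive_stem` helper, and the sample pages are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "on"}


def naive_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vocabulary(pages, remove_stop_words=False, stem=False):
    """Return the set of distinct attributes (words) found across all pages."""
    vocab = set()
    for page in pages:
        for word in tokenize(page):
            if remove_stop_words and word in STOP_WORDS:
                continue
            vocab.add(naive_stem(word) if stem else word)
    return vocab


# Made-up sample pages, not taken from the BBC or Yahoo data sets.
pages = [
    "The markets are rising and traders expected the rise",
    "Traders in the market traded shares",
]

raw = vocabulary(pages)
reduced = vocabulary(pages, remove_stop_words=True, stem=True)
print(len(raw), len(reduced))
print(sorted(reduced))
```

The sketch also hints at the danger side: the naive stripper maps "rising" and "rise" to different stems, and "traders" and "traded" likewise, so related words are not always merged, while a more aggressive rule could merge unrelated ones.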
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Holden, N., Freitas, A.A. (2004). Web Page Classification with an Ant Colony Algorithm. In: Yao, X., et al. Parallel Problem Solving from Nature - PPSN VIII. PPSN 2004. Lecture Notes in Computer Science, vol 3242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30217-9_110
DOI: https://doi.org/10.1007/978-3-540-30217-9_110
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23092-2
Online ISBN: 978-3-540-30217-9
eBook Packages: Springer Book Archive