Abstract
We present a clustering algorithm that is based on a cellular automaton and is designed to display a map of web pages. We describe the main principles of methods that build such maps, and the main principles of cellular automata. We show how these principles can be applied to web page clustering: the cells, which are organized in a 2D grid, are either empty or contain a page. The local transition function of the cells favors the creation of groups of similar states (web pages) in neighbouring cells. We then present the visual results obtained with our method on standard data as well as on sets of documents. The documents are thus organized into a visual map that makes them easier to browse.
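The abstract does not give the transition function itself, so the following is only a minimal sketch of the general idea, not the authors' rule. It assumes a square grid that wraps at the edges, an 8-cell neighbourhood, pages represented as word sets, Jaccard similarity, and a fixed threshold. All names (`step`, `run`, `threshold`, `pool`) are hypothetical and chosen for illustration.

```python
import random


def jaccard(a, b):
    """Similarity of two word sets: shared words divided by all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbours(grid, r, c):
    """Items in the 8 cells around (r, c); the grid wraps, so use size >= 3."""
    n = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0):
                item = grid[(r + dr) % n][(c + dc) % n]
                if item is not None:
                    found.append(item)
    return found


def mean_similarity(item, others, sim):
    """Average similarity between one item and a non-empty list of items."""
    return sum(sim(item, o) for o in others) / len(others)


def step(grid, pool, sim, threshold, rng):
    """Assumed local rule, applied to one randomly chosen cell.

    Empty cell: try to place an unplaced item if the cell has no occupied
    neighbours or the item is similar enough to them.
    Occupied cell: send the item back to the pool if it is too dissimilar
    to its occupied neighbours.
    """
    n = len(grid)
    r, c = rng.randrange(n), rng.randrange(n)
    near = neighbours(grid, r, c)
    if grid[r][c] is None:
        if pool:
            item = pool[rng.randrange(len(pool))]
            if not near or mean_similarity(item, near, sim) >= threshold:
                pool.remove(item)
                grid[r][c] = item
    else:
        item = grid[r][c]
        if near and mean_similarity(item, near, sim) < threshold:
            grid[r][c] = None
            pool.append(item)


def run(docs, size=6, steps=2000, threshold=0.3, seed=0):
    """Run the automaton; cells hold indices into docs, or None if empty."""
    rng = random.Random(seed)
    grid = [[None] * size for _ in range(size)]
    pool = list(range(len(docs)))

    def sim(i, j):
        return jaccard(docs[i], docs[j])

    for _ in range(steps):
        step(grid, pool, sim, threshold, rng)
    return grid, pool


if __name__ == "__main__":
    texts = ["ant colony clustering", "ant clustering sorting",
             "facial skin dermatology", "facial skin classification"]
    docs = [frozenset(t.split()) for t in texts]
    grid, pool = run(docs, size=5, steps=500, seed=1)
    for row in grid:
        print(" ".join("." if x is None else str(x) for x in row))
```

In this sketch each page is either on the grid or in the pool at all times, so the map never loses or duplicates a document. The threshold controls how strict the grouping is: a higher value keeps dissimilar pages apart but leaves more pages unplaced.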
Copyright information
© 2006 International Federation for Information Processing
About this paper
Cite this paper
Azzag, H., Ratsimba, D., Da Costa, D., Guinot, C., Venturini, G. (2006). On Building Maps of Web Pages with a Cellular Automaton. In: Pan, Y., Rammig, F.J., Schmeck, H., Solar, M. (eds) Biologically Inspired Cooperative Computing. BICC 2006. IFIP International Federation for Information Processing, vol 216. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-34733-2_4
DOI: https://doi.org/10.1007/978-0-387-34733-2_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34632-8
Online ISBN: 978-0-387-34733-2
eBook Packages: Computer Science, Computer Science (R0)