Summary
The methodology for the knowledge discovery in databases (KDD) architecture outlines possible approaches that search engines can take to improve their information retrieval (IR) systems. The conventional approach returns query results that depend on the user’s knowledge of the respective IR system. This paper proposes an information sharing model, based on the information processing methodology of honeybees and on knowledge discovery in databases, as an alternative to the traditional IR models used by current search engines. The major limitation of IR-based systems is their dependency on human editors, which is reflected in static sets of query terms and the use of stemming. Experimental results are presented for the data clustering component (Web page indexer) of the Tocorime Apicu search engine, which is based on the information sharing model.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Walker, R.L. (2003). A Framework for High-Performance Web Mining in Dynamic Environments using Honeybee Search Strategies. In: Abraham, A., Franke, K., Köppen, M. (eds) Intelligent Systems Design and Applications. Advances in Soft Computing, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44999-7_19
DOI: https://doi.org/10.1007/978-3-540-44999-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40426-2
Online ISBN: 978-3-540-44999-7
eBook Packages: Springer Book Archive