Dynamic Load Balancing Model: Preliminary Results for Parallel Pseudo-search Engine Indexers/Crawler Mechanisms Using MPI and Genetic Programming

  • Reginald L. Walker
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1981)

Abstract

Methodologies derived from Genetic Programming (GP) and Knowledge Discovery in Databases (KDD) were used in the parallel implementation of the indexer simulator to emulate the current World Wide Web (WWW) search engine indexers. This indexer followed the indexing strategies that were employed by AltaVista and Inktomi that index each word in each Web document. The insights gained from the initial implementation of this simulator have resulted in the initial phase of the adaptation of a biological model. The biological model will offer a basis for future developments associated with an integrated Pseudo-Search Engine. The basic characteristics exhibited by the model will be translated so as to develop a model of an integrated search engine using GP. The evolutionary processes exhibited by this biological model will not only provide mechanisms for the storage, processing, and retrieval of valuable information but also for Web crawlers, as well as for an advanced communication system. The current Pseudo-Search Engine Indexer, capable of organizing limited subsets of Web documents, provides a foundation for the first simulator of this model. Adaptation of the model for the refinement of the Pseudo-Search Engine establishes order in the inherent interactions between the indexer, crawler and browser mechanisms by including the social (hierarchical) structure and simulated behavior of this complex system. The simulation of behavior will engender mechanisms that are controlled and coordinated in their various levels of complexity. This unique model will also provide a foundation for an evolutionary expansion of the search engine as WWW documents continue to grow. The simulator results were generated using Message Passing Interface (MPI) on a network of SUN workstations and an IBM SP2 computer system.
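
As a rough illustration of what indexing each word in each Web document can look like, the following Python sketch builds a word-to-document index and then repeats the work over document subsets that are merged afterwards. It is not the author's simulator: it uses no MPI or GP, the round-robin split is a static stand-in for the dynamic load balancing the paper studies, and all names (tokenize, build_index, partition, merge) and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def partition(doc_ids, n_workers):
    """Static round-robin split of document ids (assumption: one list per worker)."""
    return [doc_ids[i::n_workers] for i in range(n_workers)]


def merge(partial_indexes):
    """Combine per-worker indexes into a single index."""
    merged = defaultdict(set)
    for partial in partial_indexes:
        for word, doc_ids in partial.items():
            merged[word] |= doc_ids
    return merged


# Hypothetical sample documents, for illustration only.
docs = {
    "a.html": "Genetic Programming and MPI",
    "b.html": "MPI on an IBM SP2",
}

# Index everything in one pass.
index = build_index(docs)

# Index each subset separately, as separate workers might, then merge.
subsets = partition(sorted(docs), 2)
partials = [build_index({d: docs[d] for d in subset}) for subset in subsets]
merged = merge(partials)
```

In a message-passing setting each subset would be indexed by a separate process and the partial indexes gathered on one of them; here the same steps simply run sequentially.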

Keywords

Genetic Programming · Message Passing Interface · Internet Service Provider · Information Science Institute · Active Capsule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Reginald L. Walker
    • 1
  1. Computer Science Department, University of California at Los Angeles, California