Multi-modal Services for Web Information Collection Based on Multi-agent Techniques

  • Qing He
  • Xiurong Zhao
  • Sulan Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4088)


With the rapid growth of information on the Internet, web information collection is becoming increasingly important in many web applications, especially search engines. The performance of web information collectors greatly influences the quality of search engines, so discussions of web spiders usually focus on speed and accuracy. In this paper, we point out that customizability is also an important feature of a well-designed spider: spiders should be able to provide multi-modal services that satisfy users with different requirements and preferences. We have developed a parallel web spider system based on multi-agent techniques. It runs with high speed and high accuracy and, most importantly, it can provide its services from multiple perspectives, with good extensibility and personalized customizability.







Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Qing He (1)
  • Xiurong Zhao (1, 2)
  • Sulan Zhang (1, 2)

  1. The Key Laboratory of Intelligent Information Processing, Department of Intelligence Software, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
  2. Graduate University of Chinese Academy of Sciences
