Meta-search Based Web Resource Discovery for Object-Level Vertical Search

  • Ling Lin
  • Gang Li
  • Lizhu Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4255)

Abstract

Object-level vertical search engines have recently become a research focus, yet the problem of collecting resources for them remains open. Traditional link-based web crawlers are difficult to adapt to this task because the relevant resources are sparsely linked and their webpages are data-centered. In this paper, we propose a meta-search based method, enhanced with auxiliary crawling, to address the problem caused by the sparse linkage of the relevant resources. To retrieve data-centered webpages efficiently, a domain schema is defined to describe the target resource, and representative data instances are selected for composing meta-search queries. Moreover, evaluation criteria for surveying domain resources are proposed as guidelines for query composition and auxiliary crawling, which enables resource discovery to be performed automatically. Experimental results on real-world data show that our method is effective and efficient.
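As a rough illustration of the query-composing idea, the sketch below builds keyword queries from a domain schema and a few representative data instances. The abstract does not give the actual schema format, instance-selection criteria, or query syntax, so the names `DomainSchema` and `compose_queries`, the sample attributes, and the choice of pairing two quoted attribute values per query are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class DomainSchema:
    """Hypothetical domain schema: an object type and its attribute names."""
    name: str
    attributes: tuple


def compose_queries(schema, instances, values_per_query=2):
    """Combine attribute values of each instance into quoted keyword queries.

    Assumption: pages that list a known object tend to mention several of
    its attribute values together, so each query pairs values of one instance.
    """
    queries = []
    for instance in instances:
        # Keep only values for attributes the schema declares, in schema order.
        values = [instance[a] for a in schema.attributes if a in instance]
        for combo in combinations(values, values_per_query):
            query = " ".join(f'"{v}"' for v in combo)
            if query not in queries:  # skip duplicates across instances
                queries.append(query)
    return queries


# Hypothetical schema and instance for a research-paper domain.
schema = DomainSchema("paper", ("title", "author", "venue"))
instances = [
    {"title": "Focused Crawling", "author": "Chakrabarti", "venue": "WWW"},
]
queries = compose_queries(schema, instances)
```

Each resulting string could then be submitted to one or more general search engines, with the returned URLs serving as candidate resources for the auxiliary crawler to expand.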

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ling Lin (1)
  • Gang Li (1)
  • Lizhu Zhou (1)
  1. Tsinghua University, Beijing, PRC
