Abstract
Object-level vertical search engines have recently become a research focus, yet collecting the resources they index remains an open problem. Traditional link-based web crawlers are difficult to adapt to this task because the relevant resources are sparsely linked and their webpages are data-centered. In this paper, we propose a meta-search based method, enhanced with auxiliary crawling, to address the sparse linkage of the relevant resources. To retrieve data-centered webpages efficiently, we define a domain schema that describes the target resource and select representative data instances from which meta-search queries are composed. We also propose evaluation criteria for the domain resource survey as guidelines for query composition and auxiliary crawling, which allow resource discovery to be performed automatically. Experimental results on real-world data show that our method is effective and efficient.
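As a rough illustration of the query composition idea, the sketch below builds meta-search queries from a domain schema and a few representative data instances, then merges ranked result lists from several engines. It is not the implementation described in the paper: the schema attributes, the instance values, the function names `compose_queries` and `merge_results`, and the reciprocal-rank merging rule are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical domain schema: attribute names describing the target object type.
SCHEMA = ["title", "author", "venue"]

# Hypothetical representative data instances (placeholder values only).
INSTANCES = [
    {"title": "Example Title A", "author": "Author A", "venue": "Venue A"},
    {"title": "Example Title B", "author": "Author B", "venue": "Venue B"},
]


def compose_queries(schema, instances, k=2):
    """Build one quoted-phrase query per k-subset of schema attributes, per instance."""
    queries = []
    for inst in instances:
        for attrs in combinations(schema, k):
            # Skip attribute subsets the instance does not fully cover.
            if all(a in inst for a in attrs):
                queries.append(" ".join('"%s"' % inst[a] for a in attrs))
    return queries


def merge_results(result_lists):
    """Merge ranked URL lists from several engines by summed reciprocal rank.

    This merging rule is an assumption of the sketch, not the paper's criterion.
    """
    scores = {}
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

In such a setup, each composed query would be sent to several general-purpose search engines, and the merged list would seed the auxiliary crawling step; the engine calls themselves are left out because they depend on whichever search APIs are available.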
This work is supported in part by National Natural Science Foundation of China 60520130299.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, L., Li, G., Zhou, L. (2006). Meta-search Based Web Resource Discovery for Object-Level Vertical Search. In: Aberer, K., Peng, Z., Rundensteiner, E.A., Zhang, Y., Li, X. (eds) Web Information Systems – WISE 2006. WISE 2006. Lecture Notes in Computer Science, vol 4255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11912873_5
DOI: https://doi.org/10.1007/11912873_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48105-8
Online ISBN: 978-3-540-48107-2
eBook Packages: Computer Science, Computer Science (R0)