Finding Related Search Engine Queries by Web Community Based Query Enrichment

Li, Lin; Otsuka, Shingo; Kitsuregawa, Masaru

doi:10.1007/s11280-009-0077-1

Published: 17 November 2009

World Wide Web, Volume 13, pages 121–142 (2010)

Abstract

Conventional approaches to finding related search engine queries measure relatedness by the terms that two queries share. However, search engine queries are usually short, so the term overlap between two queries is very small, and query terms alone cannot serve as a feature space for estimating relatedness accurately. Alternative feature spaces are needed to enrich term-based search queries. In this paper, given a search query, we first extract the Web pages accessed by users from Japanese Web access logs, which record users' individual and collective behavior. These accessed Web pages usually yield two kinds of feature spaces, content-sensitive (e.g., nouns) and content-ignorant (e.g., URLs), which we use to enrich the expressions of search queries. The relatedness between search queries can then be estimated on their enriched expressions. Our experimental results show that the URL feature space produces much lower precision scores than the noun feature space. The noun feature space, however, is not applicable to non-text pages, dynamic pages and so on. Because the URL (content-ignorant) feature space is generally available for all types of Web pages, improving its quality is crucial. We propose a novel content-ignorant feature space, called Web community, which is created from a Japanese Web page archive by exploiting link analysis. Experimental results show that the proposed Web community feature space generates much better results than the URL feature space.
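The enrichment idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy access log, the page-to-community mapping, the helper names `enrich` and `cosine`, and the choice of cosine similarity as the relatedness measure are all assumptions made for illustration, since the abstract does not specify them. The sketch shows why two queries whose users visit different URLs can still look related once each page is mapped to a coarser content-ignorant feature such as a community identifier.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: pages users accessed after issuing each query.
ACCESS_LOG = {
    "cheap flights": ["a.example/1", "b.example/2"],
    "airline tickets": ["c.example/3", "b.example/9"],
    "sushi recipe": ["d.example/4"],
}

# Hypothetical page-to-community mapping (in the paper, communities come
# from link analysis over a Web page archive; here they are hard-coded).
PAGE_COMMUNITY = {
    "a.example/1": "travel",
    "b.example/2": "travel",
    "c.example/3": "travel",
    "b.example/9": "travel",
    "d.example/4": "cooking",
}


def enrich(query, feature_of_page):
    """Represent a query as counts of features of its accessed pages."""
    return Counter(feature_of_page(page) for page in ACCESS_LOG.get(query, []))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def url_feature(page):
    return page  # URL feature space: each page is its own feature


def community_feature(page):
    return PAGE_COMMUNITY[page]  # Web community feature space


url_sim = cosine(enrich("cheap flights", url_feature),
                 enrich("airline tickets", url_feature))
community_sim = cosine(enrich("cheap flights", community_feature),
                       enrich("airline tickets", community_feature))
unrelated_sim = cosine(enrich("cheap flights", community_feature),
                       enrich("sushi recipe", community_feature))
```

In this toy setting the two travel queries share no URLs, so the URL feature space gives them no overlap, while the community feature space places them in the same community; the cooking query stays unrelated under both.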

Author information

Authors and Affiliations

Department of Information and Communication Engineering, The University of Tokyo, Tokyo, Japan
Lin Li
School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
Lin Li
National Institute for Materials Science, Tsukuba, Japan
Shingo Otsuka
Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
Masaru Kitsuregawa

Corresponding author

Correspondence to Lin Li.

Additional information

This work was done while Lin Li was a Ph.D. student at the University of Tokyo; she now works at Wuhan University of Technology.

About this article

Cite this article

Li, L., Otsuka, S. & Kitsuregawa, M. Finding Related Search Engine Queries by Web Community Based Query Enrichment. World Wide Web 13, 121–142 (2010). https://doi.org/10.1007/s11280-009-0077-1

Received: 28 February 2009
Revised: 20 July 2009
Accepted: 04 November 2009
Published: 17 November 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11280-009-0077-1
