Exploiting Twitter for Spiking Query Classification

Yoshida, Mitsuo; Arase, Yuki

doi:10.1007/978-3-642-35341-3_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7675)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

We propose a method for classifying queries whose frequency spikes in a search engine into topical categories such as celebrities and sports. Previous methods rely on Web search results and query logs, which take a certain period of time to catch up with spiking queries. In contrast, we exploit Twitter and its massive amount of very fresh content to classify spiking queries in a timely manner. The proposed method leverages information unique to Twitter: not only tweets but also users and hashtags. We integrate this heterogeneous information in a graph and classify queries using a graph-based semi-supervised classification method. We design an experiment that replicates the situation in which queries spike. The results indicate that the proposed method works effectively and that combining the heterogeneous information in Twitter improves accuracy.
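
The abstract does not specify which graph-based semi-supervised algorithm is used or how the graph is weighted, so the following is only a minimal sketch of the general idea, assuming plain iterative label propagation with clamped seeds on an unweighted graph. The node names, the toy edges, and the functions `propagate_labels` and `classify` are hypothetical illustrations, not the authors' implementation.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, categories, iterations=200):
    """Generic label propagation (an assumed stand-in, not the paper's method).

    edges: iterable of (node, node) pairs linking queries, tweets, users, hashtags.
    seeds: dict mapping already-labeled nodes to their category.
    """
    # Build an undirected adjacency list over all node types.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for node in seeds:
        neighbors[node]  # make sure isolated seeds are registered too

    # Unlabeled nodes start at zero; seed nodes get a one-hot score vector.
    scores = {node: {c: 0.0 for c in categories} for node in neighbors}
    for node, label in seeds.items():
        scores[node] = {c: 1.0 if c == label else 0.0 for c in categories}

    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in seeds:
                updated[node] = scores[node]  # seeds stay clamped
            else:
                # Each unlabeled node takes the mean score of its neighbors.
                updated[node] = {
                    c: sum(scores[n][c] for n in adjacent) / len(adjacent)
                    for c in categories
                }
        scores = updated
    return scores


def classify(scores, node):
    """Return the highest-scoring category, or None if no label reached the node."""
    best = max(scores[node], key=scores[node].get)
    return best if scores[node][best] > 0 else None


# Hypothetical toy graph: two labeled queries and one unlabeled spiking query.
edges = [
    ("query:world cup", "tweet:t1"),
    ("tweet:t1", "hashtag:#football"),
    ("tweet:t1", "user:u1"),
    ("query:new spike", "tweet:t2"),
    ("tweet:t2", "hashtag:#football"),
    ("tweet:t2", "user:u1"),
    ("query:oscars", "tweet:t3"),
    ("tweet:t3", "hashtag:#redcarpet"),
    ("tweet:t3", "user:u2"),
    ("tweet:t3", "user:u1"),
]
seeds = {"query:world cup": "sports", "query:oscars": "celebrities"}
scores = propagate_labels(edges, seeds, ["celebrities", "sports"])
print(classify(scores, "query:new spike"))
```

In this toy graph the unlabeled query has no direct link to any labeled query. It receives its category only through the tweet, hashtag, and user nodes it shares with the labeled ones, which is the kind of heterogeneous evidence the abstract describes combining.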

Author information

Authors and Affiliations

Graduate School of Systems and Information Engineering, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
Mitsuo Yoshida
Microsoft Research Asia, Building 2, No.5 Dan Ling Street, Haidian District, Beijing, P.R. China
Yuki Arase

Editor information

Editors and Affiliations

School of Computer Science and Technology, Tianjin University, Tianjin, 300072, China
Yuexian Hou
DIRO, University of Montreal, CP. 6128, succursale Centre-ville, H3C 3J7, Montreal, QC, Canada
Jian-Yun Nie
Institute of Software, Storage & Information Retrieval Laboratory, Chinese Academy of Sciences, 100190, Beijing, China
Le Sun
School of Computer Science and Technology, Tianjin University, 300072, Tianjin, China
Bo Wang
School of Computing, Robert Gordon University, St Andrew Street, AB25 1HG, Aberdeen, UK
Peng Zhang

About this paper

Cite this paper

Yoshida, M., Arase, Y. (2012). Exploiting Twitter for Spiking Query Classification. In: Hou, Y., Nie, JY., Sun, L., Wang, B., Zhang, P. (eds) Information Retrieval Technology. AIRS 2012. Lecture Notes in Computer Science, vol 7675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35341-3_12

DOI: https://doi.org/10.1007/978-3-642-35341-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35340-6
Online ISBN: 978-3-642-35341-3
eBook Packages: Computer Science, Computer Science (R0)
