A Survey of Learning to Rank for Real-Time Twitter Search

Cheng, Fuxing; Zhang, Xin; He, Ben; Luo, Tiejian; Wang, Wenjie

doi:10.1007/978-3-642-37015-1_13

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 7719)

Abstract

Recently, learning to rank has been widely used in real-time Twitter search by integrating various sources of evidence, including relevance and recency features. In real-time Twitter search, where a user's information need is represented by a query issued at a specific time, users are interested in fresh messages. In this paper, we introduce a new ranking strategy that reranks tweets by incorporating multiple features. In addition, we conduct an empirical study of learning to rank for real-time Twitter search using state-of-the-art learning to rank approaches. Experiments on the standard TREC Tweets11 collection show that both the listwise and pairwise learning to rank methods outperform the baselines, namely content-based retrieval models.
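
The pairwise idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the ranking strategy or any of the learners evaluated in the paper: it is a simple perceptron-style pairwise learner, and the two features (`content_score`, `recency_score`), the toy values, and all function names are hypothetical, chosen only to show how relevance and recency evidence can be combined into one learned scoring function.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy data: each tweet is ([content_score, recency_score], label),
# where label 1 means relevant and 0 means non-relevant. One inner list per query.
Tweet = Tuple[Sequence[float], int]
TRAIN: List[List[Tweet]] = [
    [([0.9, 0.8], 1), ([0.7, 0.1], 0), ([0.2, 0.9], 0)],
    [([0.8, 0.6], 1), ([0.9, 0.0], 0), ([0.1, 0.7], 0)],
]


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear scoring function: weighted sum of the tweet's features."""
    return sum(wk * xk for wk, xk in zip(w, x))


def train_pairwise(queries: List[List[Tweet]],
                   epochs: int = 50, lr: float = 0.1) -> List[float]:
    """Perceptron updates on preference pairs drawn from the same query."""
    w = [0.0] * len(queries[0][0][0])
    for _ in range(epochs):
        for tweets in queries:
            for xi, yi in tweets:
                for xj, yj in tweets:
                    if yi <= yj:
                        continue  # keep only pairs where tweet i should outrank tweet j
                    if score(w, xi) - score(w, xj) <= 0:
                        # Misordered or tied pair: move w toward (xi - xj).
                        w = [wk + lr * (a - b) for wk, a, b in zip(w, xi, xj)]
    return w


def rerank(w: Sequence[float],
           features: List[Sequence[float]]) -> List[Sequence[float]]:
    """Sort tweets by learned score, highest first."""
    return sorted(features, key=lambda x: score(w, x), reverse=True)


weights = train_pairwise(TRAIN)
ranked = rerank(weights, [x for x, _ in TRAIN[0]])
```

In the second toy query, ranking by content score alone would place the stale tweet `[0.9, 0.0]` first; the learned weights instead put the relevant and fresher tweet `[0.8, 0.6]` on top, which is the kind of effect that combining relevance and recency features is meant to achieve.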

Author information

Authors and Affiliations

Information Dynamic and Engineering Applications Laboratory, University of Chinese Academy of Sciences, China
Fuxing Cheng, Xin Zhang, Ben He, Tiejian Luo & Wenjie Wang
Key Laboratory of Computational Geodynamics, University of Chinese Academy of Sciences, China
Fuxing Cheng, Xin Zhang, Ben He, Tiejian Luo & Wenjie Wang

Editor information

Editors and Affiliations

Wuhan University of Technology, Heping Road 1178, Wuchang District, 430081, Wuhan, Hubei, China
Qiaohong Zu
Hayes Park Central, Fujitsu Laboratories of Europe Ltd., Hayes End Road, UB4 8FE, Hayes, Middlesex, UK
Bo Hu
Department of Electrical and Electronics Engineering, Aksaray University, Merkez Kampüsü, 68100, Aksaray, Turkey
Atilla Elçi

About this paper

Cite this paper

Cheng, F., Zhang, X., He, B., Luo, T., Wang, W. (2013). A Survey of Learning to Rank for Real-Time Twitter Search. In: Zu, Q., Hu, B., Elçi, A. (eds) Pervasive Computing and the Networked World. ICPCA/SWS 2012. Lecture Notes in Computer Science, vol 7719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37015-1_13

DOI: https://doi.org/10.1007/978-3-642-37015-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37014-4
Online ISBN: 978-3-642-37015-1
eBook Packages: Computer Science, Computer Science (R0)