Time-Sensitive Language Modelling for Online Term Recurrence Prediction

Zhang, Dell; Lu, Jinsong; Mao, Robert; Nie, Jian-Yun

doi:10.1007/978-3-642-04417-5_12

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5766)

Abstract

We address the problem of online term recurrence prediction: for a stream of terms, at each time point predict what term is going to recur next in the stream given the term occurrence history so far. It has many applications, for example, in Web search and social tagging. In this paper, we propose a time-sensitive language modelling approach to this problem that effectively combines term frequency and term recency information, and describe how this approach can be implemented efficiently by an online learning algorithm. Our experiments on a real-world Web query log dataset show significant improvements over standard language modelling.
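
The abstract does not spell out the model, so the following is only a minimal sketch of one way term frequency and term recency could be combined in an online fashion. It assumes an exponentially decayed count per term; the class name DecayedTermModel, the half_life parameter, and the lazy-decay update are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


class DecayedTermModel:
    """Hypothetical time-sensitive unigram model (an assumption, not the paper's method)."""

    def __init__(self, half_life=100.0):
        # Assumed parameter: number of time steps after which an occurrence counts half.
        self.decay = math.log(2.0) / half_life
        self.weight = defaultdict(float)  # decayed count per term, valid at last_seen[term]
        self.last_seen = {}               # time step of each term's latest occurrence

    def _current_weight(self, term, now):
        # Lazily decay the stored weight from its last update to the current time.
        if term not in self.last_seen:
            return 0.0
        elapsed = now - self.last_seen[term]
        return self.weight[term] * math.exp(-self.decay * elapsed)

    def update(self, term, now):
        # Constant-time online update: decay the old weight, then add the new occurrence.
        self.weight[term] = self._current_weight(term, now) + 1.0
        self.last_seen[term] = now

    def predict(self, now, k=1):
        # Rank previously seen terms by decayed count; the top ones are the
        # predicted candidates for the next recurrence.
        ranked = sorted(
            self.last_seen,
            key=lambda t: self._current_weight(t, now),
            reverse=True,
        )
        return ranked[:k]
```

With a very large half_life the decayed counts approach plain term frequencies, while a very small one makes the ranking depend almost entirely on recency. Each update touches only the observed term, which is what makes the sketch suitable for a stream; predict sorts all seen terms and would need a smarter data structure at scale.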

Author information

Authors and Affiliations

Birkbeck, University of London, London, WC1E 7HX, UK
Dell Zhang & Jinsong Lu
Microsoft Corp., Dublin, Ireland
Robert Mao
University of Montreal, Quebec, H3C 3J7, Canada
Jian-Yun Nie

Editor information

Editors and Affiliations

Department of Computing Science, Sir Alwyn Williams Building, Lilybank Gardens, University of Glasgow, G12 8QQ, Glasgow, Scotland, UK
Leif Azzopardi
Microsoft Research Ltd, 7 JJ Thomson Avenue, CB3 0FB, Cambridge, UK
Gabriella Kazai & Stephen Robertson
Knowledge Media Institute, The Open University, MK7 6AA, Milton Keynes, UK
Stefan Rüger
Microsoft Research Ltd, 7 JJ Thomson Avenue, CB3 0FB, Cambridge, United Kingdom
Milad Shokouhi & Emine Yilmaz
School of Computing, The Robert Gordon University, St Andrew Street, AB25 1HG, Aberdeen, UK
Dawei Song

Cite this paper

Zhang, D., Lu, J., Mao, R., Nie, JY. (2009). Time-Sensitive Language Modelling for Online Term Recurrence Prediction. In: Azzopardi, L., et al. Advances in Information Retrieval Theory. ICTIR 2009. Lecture Notes in Computer Science, vol 5766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04417-5_12

DOI: https://doi.org/10.1007/978-3-642-04417-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04416-8
Online ISBN: 978-3-642-04417-5
eBook Packages: Computer Science, Computer Science (R0)