Silent Day Detection on Microblog Data

  • Kuang Lu
  • Hui Fang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10859)

Abstract

Microblogs have become an increasingly popular source of information for users who want updates about the world. Given the rapid growth of microblog data, users are often interested in getting daily (or even hourly) updates about a certain topic. Existing studies on microblog retrieval have mainly focused on how to rank results by relevance, and little attention has been paid to whether any results should be returned to search users at all. This paper studies the problem of silent day detection. Specifically, given a query and a set of tweets collected over a certain time period (such as a day), we need to determine whether the set contains any tweets relevant to the query. If it does not, the day is referred to as a silent day. Silent day detection lets us avoid overwhelming users with non-relevant tweets. We formulate the problem as a classification problem and propose two types of new features that use collective information from query terms. Experimental results on TREC collections show that these new features are more effective at detecting silent days than previously proposed ones.
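
The problem formulation can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Tweet` type, the `max_query_coverage` feature, and the fixed threshold are hypothetical stand-ins, not the features or the classifier proposed in the paper. The sketch only shows how a silent day label is derived from relevance judgments and how a query-term-based signal could feed a binary decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    # Hypothetical container for one tweet collected during the time period.
    tweet_id: str
    text: str


def is_silent_day(day_tweets, relevant_ids):
    """Ground-truth label: a day is silent when none of its tweets is judged relevant."""
    return not any(t.tweet_id in relevant_ids for t in day_tweets)


def max_query_coverage(query, day_tweets):
    """Hypothetical feature (not from the paper): the largest fraction of
    distinct query terms that co-occur in any single tweet of the day."""
    terms = set(query.lower().split())
    if not terms or not day_tweets:
        return 0.0
    return max(
        len(terms & set(t.text.lower().split())) / len(terms)
        for t in day_tweets
    )


def predict_silent(query, day_tweets, threshold=0.5):
    """Toy stand-in for a trained classifier: predict a silent day
    when the coverage feature falls below an assumed threshold."""
    return max_query_coverage(query, day_tweets) < threshold


if __name__ == "__main__":
    day = [Tweet("1", "storm hits the coast"), Tweet("2", "coffee time")]
    print(is_silent_day(day, {"1"}))
    print(predict_silent("coast storm warning", day))
    print(predict_silent("election results", day))
```

In practice the threshold rule would be replaced by a classifier trained on days labeled with `is_silent_day`, using whatever feature set is chosen.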

Keywords

Silent day detection, Microblog retrieval, Classification

Notes

Acknowledgements

This research was supported by the U.S. National Science Foundation under IIS-1423002.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. University of Delaware, Newark, USA
