World Wide Web, Volume 22, Issue 2, pp 499–515

MARES: multitask learning algorithm for Web-scale real-time event summarization

  • Min Yang
  • Wenting Tu
  • Qiang Qu
  • Kai Lei
  • Xiaojun Chen
  • Jia Zhu (corresponding author)
  • Ying Shen
Part of the following topical collections:
  1. Special Issue on Deep vs. Shallow: Learning for Emerging Web-scale Data Computing and Applications


Automatic real-time summarization of massive document streams on the Web has become an important tool for quickly transforming an overwhelming volume of documents into a novel, comprehensive, and concise overview of an event for users. Significant progress has been made in static text summarization. However, most previous work does not consider the temporal features of document streams, which are valuable in real-time event summarization. In this paper, we propose a novel Multitask learning Algorithm for Web-scale Real-time Event Summarization (MARES), which leverages the benefits of supervised deep neural networks as well as a reinforcement learning algorithm to strengthen the representation learning of documents. Specifically, MARES consists of two key components: (i) a relevance prediction classifier, in which a hierarchical LSTM model is used to learn the representations of queries and documents; and (ii) a document filtering model that learns to maximize long-term rewards with a reinforcement learning algorithm, working on a document encoding layer shared with the relevance prediction component. To verify the effectiveness of the proposed model, extensive experiments are conducted on two real-life document stream datasets: the TREC Real-Time Summarization Track data and the TREC Temporal Summarization Track data. The experimental results demonstrate that our model achieves significantly better results than state-of-the-art baseline methods.
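The two-component design summarized above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The hierarchical LSTM is replaced by a toy hashed bag-of-words encoder, and the shared encoding layer is reduced to a single vector of scales (`shared`) that both tasks update. The abstract does not name the reinforcement learning algorithm, so a plain REINFORCE-style policy-gradient update is assumed. The class `ToyMultitaskModel`, the reward values, the learning rate, and the discount factor `gamma` are all hypothetical choices made for illustration.

```python
import math
import random

DIM = 8  # hypothetical width of the hashed text features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(text):
    """Toy stand-in for the hierarchical LSTM: an L2-normalized hashed bag of words."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyMultitaskModel:
    def __init__(self):
        n = DIM + 1              # +1 for a constant bias feature
        self.shared = [1.0] * n  # shared layer, updated by BOTH tasks
        self.w_rel = [0.0] * n   # relevance-prediction head
        self.w_pol = [0.0] * n   # document-filtering (push/skip) head

    def features(self, query, doc):
        # Element-wise query/document overlap plus a bias term.
        q, d = encode(query), encode(doc)
        return [a * b for a, b in zip(q, d)] + [1.0]

    def _prob(self, head, x):
        return sigmoid(sum(w * s * xi for w, s, xi in zip(head, self.shared, x)))

    def relevance_prob(self, query, doc):
        return self._prob(self.w_rel, self.features(query, doc))

    def push_prob(self, query, doc):
        return self._prob(self.w_pol, self.features(query, doc))

    def _ascend(self, head, x, coeff, lr):
        """Move head and shared weights along coeff * d(logit)/d(params)."""
        for i, xi in enumerate(x):
            g_head = coeff * self.shared[i] * xi
            g_shared = coeff * head[i] * xi
            head[i] += lr * g_head
            self.shared[i] += lr * g_shared

    def supervised_step(self, query, doc, label, lr=0.1):
        """One cross-entropy gradient step for the relevance classifier."""
        x = self.features(query, doc)
        self._ascend(self.w_rel, x, label - self._prob(self.w_rel, x), lr)

    def reinforce_episode(self, query, stream, gamma=0.9, lr=0.1):
        """Sample push/skip over a labelled stream, then apply REINFORCE updates."""
        steps = []
        for doc, relevant in stream:
            x = self.features(query, doc)
            p = self._prob(self.w_pol, x)
            action = 1 if random.random() < p else 0
            # Hypothetical reward: +1 pushing relevant, -1 pushing irrelevant, 0 for skip.
            reward = (1.0 if relevant else -1.0) if action else 0.0
            steps.append((x, action - p, reward))
        ret = 0.0
        for x, score, reward in reversed(steps):
            ret = reward + gamma * ret  # discounted long-term return
            self._ascend(self.w_pol, x, ret * score, lr)


if __name__ == "__main__":
    random.seed(0)
    query = "earthquake nepal"
    stream = [("earthquake hits nepal capital", True),
              ("late final score tonight", False)]
    model = ToyMultitaskModel()
    for _ in range(300):
        for doc, relevant in stream:
            model.supervised_step(query, doc, 1.0 if relevant else 0.0)
        model.reinforce_episode(query, stream)
    for doc, _ in stream:
        print(doc, round(model.relevance_prob(query, doc), 3),
              round(model.push_prob(query, doc), 3))
```

In this sketch the supervised step and the policy-gradient step alternate, and each one adjusts `shared` as well as its own head. That alternation is the multitask coupling the abstract describes, here in its simplest form.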


Keywords: Multitask learning; Real-time event summarization; Relevance prediction; Document filtering



This work was partially supported by the National Science Foundation of China (No. 61750110516), the Shenzhen Key Fundamental Research Projects (Grant No. JCYJ20170412150946024), and the CAS Pioneer Hundred Talents Program.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Min Yang (1)
  • Wenting Tu (2)
  • Qiang Qu (1)
  • Kai Lei (3)
  • Xiaojun Chen (4)
  • Jia Zhu (5, corresponding author)
  • Ying Shen (6)
  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  2. Department of Computer Science, Shanghai University of Finance and Economics, Shanghai, China
  3. School of Electronics and Computer Engineering, Peking University, Shenzhen, China
  4. School of Computing Science, Shenzhen University, Shenzhen, China
  5. School of Computing Science, South China Normal University, Guangzhou, China
  6. School of Electronics and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, China
