World Wide Web, Volume 21, Issue 4, pp 1069–1092

Event phase oriented news summarization

  • Chengyu Wang
  • Xiaofeng He
  • Aoying Zhou
Article
Part of the following topical collections:
  1. Special Issue on Web Information Systems Engineering

Abstract

Event summarization is the task of generating a single, concise textual representation of an event. The task does not consider the multiple development phases of an event, yet news articles about long and complicated events often span several phases. Traditional event summarization approaches therefore have difficulty capturing event phases effectively. In this paper, we define the task of Event Phase Oriented News Summarization (EPONS), in which a summary is assumed to contain multiple timelines, each corresponding to an event phase. We model the semantic relations among news articles with a graph model called the Temporal Content Coherence Graph. A structural clustering algorithm, EPCluster, is designed to separate the news articles into groups corresponding to event phases. We then apply a vertex-reinforced random walk to rank the news articles, and the ranking results are used to create timelines. Extensive experiments on multiple datasets show the effectiveness of our approach.
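
The ranking step can be illustrated with a minimal sketch, assuming a DivRank-style approximation of a vertex-reinforced random walk in the spirit of [29] and [32]. The abstract does not state the paper's exact formulation, so the function name `vertex_reinforced_rank`, the parameters `damping` and `self_loop`, and the toy weight matrix are hypothetical and chosen only for illustration.

```python
def vertex_reinforced_rank(weights, damping=0.85, self_loop=0.25, iterations=100):
    """Rank vertices with a DivRank-style vertex-reinforced random walk.

    weights: square list of lists of non-negative edge weights
             (hypothetical stand-in for article-to-article coherence).
    Returns one score per vertex; the scores sum to 1.
    """
    n = len(weights)

    # Organic transition matrix p0: a fixed self-loop probability plus
    # row-normalised edge weights. Isolated vertices keep all mass on themselves.
    p0 = []
    for u in range(n):
        out = sum(weights[u][v] for v in range(n) if v != u)
        row = []
        for v in range(n):
            if v == u:
                row.append(self_loop if out > 0 else 1.0)
            elif out > 0:
                row.append((1 - self_loop) * weights[u][v] / out)
            else:
                row.append(0.0)
        p0.append(row)

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [0.0] * n
        for u in range(n):
            # Reinforcement: each transition is reweighted by the current
            # score of its target, so well-visited vertices attract more mass.
            norm = sum(p0[u][v] * scores[v] for v in range(n))
            for v in range(n):
                reinforced = p0[u][v] * scores[v] / norm
                step = (1 - damping) / n + damping * reinforced
                new_scores[v] += step * scores[u]
        total = sum(new_scores)
        scores = [s / total for s in new_scores]
    return scores


# Toy graph: vertex 0 is linked to vertices 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
ranking = vertex_reinforced_rank(star)
print(sorted(range(len(ranking)), key=lambda v: -ranking[v]))
```

Because transition probabilities are reinforced by the scores of their targets, central vertices accumulate mass while their close neighbours are suppressed, which is why walks of this kind are often used when a ranking should favour both prominence and diversity.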

Keywords

Event phase; News summarization; Structural clustering; Timeline generation; Vertex-reinforced random walk

Notes

Acknowledgments

This work is partially supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000904. Chengyu Wang is partially supported by the Outstanding Doctoral Dissertation Cultivation Plan of Action under Grant No. YB2016040. This manuscript is an extended version of the paper “Event Phase Extraction and Summarization” presented at WISE 2016 [44].

References

  1. Bansal, T., Kanti Das, M., Bhattacharyya, C.: Content driven user profiling for comment-worthy recommendations of news and blog articles. In: Proceedings of the 9th ACM Conference on Recommender Systems, pp. 195–202 (2015)
  2. Bauer, S., Teufel, S.: Unsupervised timeline generation for Wikipedia history articles. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 2343–2349 (2016)
  3. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
  4. Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. 30(1–7), 107–117 (1998)
  5. Cao, Z., Wei, F., Li, S., Li, W., Zhou, M., Wang, H.: Learning summary prior representation for extractive summarization. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pp. 829–833 (2015)
  6. Chang, L., Li, W., Lin, X., Qin, L., Zhang, W.: pSCAN: Fast and exact structural graph clustering. In: Proceedings of the 32nd IEEE International Conference on Data Engineering, pp. 253–264 (2016)
  7. Chen, C.C., Chen, Y.-T., Sun, Y.S., Chen, M.C.: Life cycle modeling of news events using aging theory. In: Proceedings of the 14th European Conference on Machine Learning, pp. 47–59 (2003)
  8. Chen, J., Niu, Z., Fu, H.: A multi-news timeline summarization algorithm based on aging theory. In: Web Technologies and Applications - 17th Asia-Pacific Web Conference, pp. 449–460 (2015)
  9. Chieu, H.L., Lee, Y.K.: Query based event extraction along a timeline. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 425–432 (2004)
  10. Chopra, S., Auli, M., Rush, A.M.: Abstractive sentence summarization with attentive recurrent neural networks. In: Human Language Technologies: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 93–98 (2016)
  11. Conroy, J.M., O’Leary, D.P.: Text summarization via hidden Markov models. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 406–407 (2001)
  12. Davis, J.V., Dhillon, I.S.: Estimating the global PageRank of Web communities. In: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 116–125 (2006)
  13. de Kretser, O., Moffat, A.: Effective document presentation with a locality-based similarity heuristic. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 113–120 (1999)
  14. Diao, Q., Shan, J.: A new Web page summarization method. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 639–640 (2006)
  15. Dolby, J., Fokoue, A., Kalyanpur, A., Kershenbaum, A., Schonberg, E., Srinivas, K., Ma, L.: Scalable semantic retrieval through summarization and refinement. In: Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, pp. 299–304 (2007)
  16. Erkan, G., Radev, D.R.: LexRank: Graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. (JAIR) 22, 457–479 (2004)
  17. Gong, Y., Liu, X.: Generic text summarization using relevance measure and latent semantic analysis. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 19–25 (2001)
  18. Gu, Y., Yang, Z., Xu, G., Nakano, M., Toyoda, M., Kitsuregawa, M.: Exploration on efficient similar sentences extraction. World Wide Web 17(4), 595–626 (2014)
  19. Hartigan, J.A., Wong, M.A.: Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Soc.: Ser. C: Appl. Stat. 28(1), 100–108 (1979)
  20. He, Z., Chen, C., Bu, J., Wang, C., Zhang, L., Cai, D., He, X.: Document summarization based on data reconstruction. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (2012)
  21. Hong, K., Nenkova, A.: Improving the estimation of word importance for news multi-document summarization. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 712–721 (2014)
  22. Jiang, L., Luo, P., Wang, J., Xiong, Y., Lin, B., Wang, M., An, N.: GRIAS: an entity-relation graph based framework for discovering entity aliases. In: Proceedings of the 2013 IEEE 13th International Conference on Data Mining, pp. 310–319 (2013)
  23. Kessler, R., Tannier, X., Hagége, C., Moriceau, V., Bittar, A.: Finding salient years for building thematic timelines. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 730–739 (2012)
  24. Khuller, S., Moss, A., Naor, J.: The budgeted maximum coverage problem. Inf. Process. Lett. 70(1), 39–45 (1999)
  25. Knights, D., Mozer, M.C., Nicolov, N.: Detecting topic drift with compound topic models. In: Proceedings of the Third International Conference on Weblogs and Social Media (2009)
  26. Li, J., Li, S.: Evolutionary hierarchical Dirichlet process for timeline summarization. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 556–560 (2013)
  27. Li, W., He, L., Zhuge, H.: Abstractive news summarization based on event semantic link network. In: Proceedings of the 26th International Conference on Computational Linguistics, pp. 236–246 (2016)
  28. Lin, C.-Y., Hovy, E.H.: Automatic evaluation of summaries using n-gram co-occurrence statistics. In: Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (2003)
  29. Mei, Q., Guo, J., Radev, D.R.: DivRank: the interplay of prestige and diversity in information networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1009–1018 (2010)
  30. Ng, J.-P., Chen, Y., Kan, M.-Y., Li, Z.: Exploiting timelines to enhance multi-document summarization. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pp. 923–933 (2014)
  31. Parveen, D., Ramsl, H.-M., Strube, M.: Topical coherence for graph-based extractive summarization. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1949–1954 (2015)
  32. Pemantle, R.: Vertex-reinforced random walk. Probab. Theory Relat. Fields 92(1), 117–136 (1992)
  33. Peng, M., Zhu, J., Li, X., Huang, J., Wang, H., Zhang, Y.: Central topic model for event-oriented topics mining in microblog stream. In: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pp. 1611–1620 (2015)
  34. Qian, X., Liu, Y.: Fast joint compression and summarization via graph cuts. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1492–1502 (2013)
  35. Ren, P., Wei, F., Chen, Z., Ma, J., Zhou, M.: A redundancy-aware sentence regression framework for extractive summarization. In: Proceedings of the 26th International Conference on Computational Linguistics, pp. 33–43 (2016)
  36. Seeland, M., Berger, S.A., Stamatakis, A., Kramer, S.: Parallel structural graph clustering. In: Proceedings of the 2011 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, pp. 256–272 (2011)
  37. Shen, W., Wang, J., Luo, P., Wang, M.: A hybrid framework for semantic relation extraction over enterprise data. Int. J. Semantic Web Inf. Syst. 11(3), 1–24 (2015)
  38. Tran, G.B., Alrifai, M., Herder, E.: Timeline summarization from relevant headlines. In: Advances in Information Retrieval - 37th European Conference on IR Research, pp. 245–256 (2015)
  39. Tran, G.B., Herder, E., Markert, K.: Joint graphical models for year selection in timeline summarization. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pp. 1598–1607 (2015)
  40. Unankard, S., Li, X., Sharaf, M.A.: Emerging event detection in social networks with location sensitivity. World Wide Web 18(5), 1393–1417 (2015)
  41. Wan, X., Yang, J.: Multi-document summarization using cluster-based link analysis. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 299–306 (2008)
  42. Wan, X., Zhang, J.: CTSUM: extracting more certain summaries for news articles. In: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 787–796 (2014)
  43. Wang, C., Zhang, R., He, X., Zhou, A.: NERank: Ranking named entities in document collections. In: Proceedings of the 25th International Conference on World Wide Web, pp. 123–124 (2016)
  44. Wang, C., Zhang, R., He, X., Zhou, G., Zhou, A.: Event phase extraction and summarization. In: Proceedings of the 17th International Conference on Web Information Systems Engineering, pp. 473–488 (2016)
  45. Wang, C., Zhang, R., He, X., Zhou, G., Zhou, A.: NERank: Bringing order to named entities from texts. In: Web Technologies and Applications - Proceedings of the 18th Asia-Pacific Web Conference, pp. 15–27 (2016)
  46. Xu, X., Yuruk, N., Feng, Z., Schweiger, T.A.J.: SCAN: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 824–833 (2007)
  47. Yan, R., Kong, L., Huang, C., Wan, X., Li, X., Zhang, Y.: Timeline generation through evolutionary trans-temporal summarization. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 433–443 (2011)
  48. Yan, J., Cheng, W., Wang, C., Liu, J., Gao, M., Zhou, A.: Optimizing word set coverage for multi-event summarization. J. Comb. Optim. 30(4), 996–1015 (2015)
  49. Yu, H., Hatzivassiloglou, V.: Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (2003)
  50. Zhao, W.X., Guo, Y., Yan, R., He, Y., Li, X.: Timeline generation with social attention. In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1061–1064 (2013)
  51. Zhou, E., Zhong, N., Li, Y.: Extracting news blog hot topics based on the W2T methodology. World Wide Web 17(3), 377–404 (2014)

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Shanghai Key Laboratory of Trustworthy Computing, School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
  2. School of Data Science and Engineering, East China Normal University, Shanghai, China
