News Timeline Generation: Accounting for Structural Aspects and Temporal Nature of News Stream

  • Mikhail Tikhomirov
  • Boris Dobrov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 822)

Abstract

More news articles are published each day than any person can read. Accurate summarization of this information makes it easier to find an event of interest. This research addresses the problem of constructing annotations of a news story. Standard multi-document summarization approaches cannot extract all the information relevant to an event, because they do not take into account how the context of the event varies over time. We have implemented a system that automatically builds a timeline summary. We investigated the impact of three factors: query extension, accounting for the temporal nature of the news stream, and accounting for the inverted-pyramid structure of news articles. The annotations we generate consist of sentences sorted in chronological order, which together contain the main details of the news story. The paper shows that taking these factors into account improves the quality of the generated annotations.
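As a rough illustration of the kind of output described above, the following minimal sketch picks the best query-matching sentence per day and returns the picks in chronological order. It is not the authors' system: the function names, the term-overlap score, and the position weight `1 / (1 + decay * position)` (a simple stand-in for the inverted-pyramid idea that earlier sentences matter more) are all assumptions made for this example, and query extension is not modeled.

```python
from collections import defaultdict
from datetime import date


def score_sentence(sentence, position, query_terms, decay=0.5):
    """Hypothetical score: query-term overlap, discounted by sentence position.

    The discount is an assumed stand-in for the inverted pyramid:
    sentences nearer the top of an article get more weight.
    """
    words = {w.strip(".,;:!?\"'()").lower() for w in sentence.split()}
    overlap = len(words & query_terms)
    return overlap / (1.0 + decay * position)


def build_timeline(articles, query, per_day=1):
    """Build a toy timeline from (publication_date, [sentences]) pairs.

    Returns a list of (date, sentence) tuples sorted chronologically,
    keeping at most `per_day` top-scoring sentences for each date.
    """
    query_terms = {w.lower() for w in query.split()}
    by_day = defaultdict(list)
    for day, sentences in articles:
        for position, sentence in enumerate(sentences):
            score = score_sentence(sentence, position, query_terms)
            if score > 0:  # drop sentences with no query overlap
                by_day[day].append((score, sentence))

    timeline = []
    for day in sorted(by_day):  # chronological order
        ranked = sorted(by_day[day], key=lambda pair: pair[0], reverse=True)
        timeline.extend((day, sentence) for _, sentence in ranked[:per_day])
    return timeline


# Toy input, deliberately out of date order (invented for illustration).
articles = [
    (date(2018, 1, 2), ["The court opened the hearing on the merger.",
                        "Weather was mild in the city."]),
    (date(2018, 1, 1), ["Regulators announced a review of the merger.",
                        "The merger review was widely expected."]),
]
timeline = build_timeline(articles, "merger review")
for day, sentence in timeline:
    print(day.isoformat(), sentence)
```

Selecting per day before sorting keeps each date represented, which is one simple way to reflect how the context of an event changes over time; a real system would use a stronger relevance model and redundancy control.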

Keywords

Timeline summarization; Extractive summarization; Multi-document summarization; Information retrieval

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Lomonosov Moscow State University, Moscow, Russia
