
Modeling and Summarizing News Events Using Semantic Triples

  • Radityo Eko Prasojo
  • Mouna Kacimi
  • Werner Nutt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10843)

Abstract

Summarizing news articles is becoming crucial for quick and concise access to information about daily events. The task is challenging when the same event is reported at various levels of detail or from diverse viewpoints. A well-established technique in news summarization models events as a set of semantic triples. These triples are weighted, mainly by frequency, and then fused to build summaries. Typically, triples are extracted from main clauses only, which can lead to information loss. Moreover, some crucial facets of news, such as reasons or consequences, are mostly reported in subordinate clauses and are therefore not properly handled. In this paper, we focus on an existing approach that models sentences with a graph structure, giving access to any triple independently of the clause it belongs to. Summary sentences are then generated by taking the top-ranked paths that contain many triples and are grammatically correct. We provide several improvements to that approach. First, we leverage node degrees to find the most important triples and facets shared among sentences. Second, we enhance triple fusion with more effective similarity measures that exploit entity linking and predicate similarity. Extensive experiments on the DUC’04 and DUC’07 datasets show that our approach outperforms baseline approaches by a large margin in terms of ROUGE and PYRAMID scores.
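
As a rough illustration of ranking triples by node degree, the following Python sketch builds a tiny entity graph from (subject, predicate, object) triples and orders the triples by the summed degree of their endpoints, using raw frequency only as a tie-breaker. The toy data, the function names `node_degrees` and `rank_triples`, and the scoring rule itself are assumptions made for illustration; the abstract does not specify the authors' actual weighting, so this should not be read as their method.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples; illustrative data, not from the paper.
triples = [
    ("storm", "hit", "coast"),
    ("storm", "destroyed", "homes"),
    ("storm", "hit", "coast"),          # repeated report of the same fact
    ("officials", "evacuated", "residents"),
    ("storm", "forced", "residents"),
]


def node_degrees(triples):
    """Count, for each entity, how many distinct triples (edges) touch it."""
    degree = defaultdict(int)
    for subj, _pred, obj in set(triples):  # duplicates collapse into one edge
        degree[subj] += 1
        degree[obj] += 1
    return dict(degree)


def rank_triples(triples):
    """Rank distinct triples by summed endpoint degree (assumed scoring rule).

    Ties are broken by how often the triple occurs, then alphabetically.
    """
    degree = node_degrees(triples)
    freq = defaultdict(int)
    for t in triples:
        freq[t] += 1
    return sorted(
        freq,
        key=lambda t: (-(degree[t[0]] + degree[t[2]]), -freq[t], t),
    )


ranked = rank_triples(triples)
for t in ranked:
    print(t)
```

In this toy input, a triple that links two well-connected entities can outrank a triple that is merely repeated, which is the intuition behind preferring degree over frequency alone.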

Notes

Acknowledgment

This work has been partially supported by the project TaDaQua, funded by the Free University of Bozen-Bolzano.


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Free University of Bozen-Bolzano, Bozen-Bolzano, Italy
