Exploring Summarization to Enhance Headline Stance Detection

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2021)

Abstract

The spread of fake news and misinformation is causing serious problems for society, partly because more and more people read only the headlines or highlights of news items, assuming that everything is reliable, instead of carefully analysing whether they may contain distorted or false information. Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item. Unfortunately, this is not always the case, since various interests, such as increasing the number of clicks or political motives, can be behind the generation of a headline that does not meet its original purpose. This paper analyses the use of automatic news summaries to determine the stance (i.e., position) of a headline with respect to the body of text associated with it. To this end, we propose a two-stage approach in which both classifiers take automatic summaries as input instead of the full text of the news body, thus reducing the amount of information that must be processed while retaining the important information. The experiments were carried out on the Fake News Challenge FNC-1 dataset and reached an accuracy of 94.13%, surpassing the state of the art. It is especially remarkable that the proposed approach, which uses only the relevant information provided by the automatic summaries instead of the full text, classifies the different stance categories with very competitive results. We therefore conclude that automatic extractive summaries have a positive impact on determining the stance of very short texts (i.e., headline, sentence) with respect to the whole content.
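The data flow described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the word-frequency summarizer, the overlap-based relatedness check, and the constant stance classifier are hypothetical stand-ins, and the split into a relatedness stage followed by a stance stage is an assumption about what the two stages do. The point is simply that both stages receive the summary rather than the full body.

```python
import re
from collections import Counter
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def extractive_summary(body: str, max_sentences: int = 2) -> str:
    # Stand-in extractive summarizer (assumption): score each sentence by the
    # average body-level frequency of its words and keep the top sentences
    # in their original order.
    sentences = split_sentences(body)
    freq = Counter(tokenize(body))

    def score(sentence: str) -> float:
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


def two_stage_stance(headline: str,
                     body: str,
                     is_related: Callable[[str, str], bool],
                     classify_stance: Callable[[str, str], str],
                     max_sentences: int = 2) -> str:
    # Both stages see the summary, never the full body.
    summary = extractive_summary(body, max_sentences)
    if not is_related(headline, summary):      # stage 1 (assumed): relatedness
        return "unrelated"
    return classify_stance(headline, summary)  # stage 2 (assumed): stance


def overlap_is_related(headline: str, summary: str,
                       threshold: float = 0.2) -> bool:
    # Hypothetical placeholder for a trained relatedness classifier.
    h, s = set(tokenize(headline)), set(tokenize(summary))
    return bool(h) and len(h & s) / len(h) >= threshold


def dummy_stance(headline: str, summary: str) -> str:
    # Hypothetical placeholder for a trained agree/disagree/discuss classifier.
    return "discuss"


body = ("The council approved the new budget on Monday. "
        "The budget raises school funding. Weather was mild.")
print(two_stage_stance("Council approves new budget", body,
                       overlap_is_related, dummy_stance))
print(two_stage_stance("Stock markets tumble", body,
                       overlap_is_related, dummy_stance))
```

In a real system the two callables would be trained models; keeping them as parameters makes it easy to swap the summarizer or either classifier independently.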

Notes

  1. http://www.fakenewschallenge.org/ (accessed online 18 March, 2021).

  2. Implementation available at https://github.com/rsepulveda911112/Headline-Stance-Detection.

  3. https://pypi.org/project/sumy/.

  4. This metric assigns higher weight to examples correctly classified, as long as they belonged to a different class from the unrelated one.

  5. This is computed as the mean of those per-class F scores.

  6. https://github.com/hanselowski/athene_system/ (accessed online 15 March, 2021).

  7. https://github.com/Cisco-Talos/fnc-1 (accessed online 15 March, 2021).
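The two metrics described in the notes above can be sketched as follows. The notes do not state exact weights, so the 0.25/0.75 weighting used here is an assumption taken from the publicly known FNC-1 scoring scheme (0.25 for a correct related/unrelated decision, plus 0.75 for the correct stance on a related pair), and the function names are illustrative. The macro-averaged F score is the plain mean of the per-class F1 values.

```python
from typing import List, Sequence

RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    # Weighted score (assumed FNC-1 scheme): correct "unrelated" earns 0.25,
    # a related pair predicted as related earns 0.25, and getting the exact
    # related stance right earns a further 0.75.
    score = 0.0
    for g, p in zip(gold, pred):
        if g == p:
            score += 0.25
            if g != "unrelated":
                score += 0.50
        if g in RELATED and p in RELATED:
            score += 0.25
    return score


def relative_fnc_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    # Normalise by the score a perfect prediction would obtain.
    best = fnc_score(gold, gold)
    return fnc_score(gold, pred) / best if best else 0.0


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str]) -> float:
    # Mean of the per-class F1 scores; a class with no true positives gets 0.
    f_scores: List[float] = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = (2 * precision * recall / (precision + recall)
             if precision + recall else 0.0)
        f_scores.append(f)
    return sum(f_scores) / len(f_scores) if f_scores else 0.0


gold = ["agree", "unrelated", "discuss", "disagree"]
pred = ["agree", "unrelated", "agree", "unrelated"]
labels = ["agree", "disagree", "discuss", "unrelated"]
print(relative_fnc_score(gold, pred))
print(macro_f1(gold, pred, labels))
```

Because the unrelated class dominates FNC-1, the weighted score rewards the harder stance decisions more, while the macro average prevents the majority class from masking poor performance on minority classes.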

Acknowledgements

This research work has been partially funded by Generalitat Valenciana through project “SIIA: Tecnologias del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” (PROMETEU/2018/089), and by the Spanish Government through projects “Modelang: Modeling the behavior of digital entities by Human Language Technologies” (RTI2018-094653-B-C22) and “INTEGER - Intelligent Text Generation” (RTI2018-094649-B-I00). This paper is also based upon work from COST Action CA18231 “Multi3Generation: Multi-task, Multilingual, Multi-modal Language Generation”.

Author information

Corresponding author

Correspondence to Robiert Sepúlveda-Torres.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Sepúlveda-Torres, R., Vicente, M., Saquete, E., Lloret, E., Palomar, M. (2021). Exploring Summarization to Enhance Headline Stance Detection. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_22

  • DOI: https://doi.org/10.1007/978-3-030-80599-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-80598-2

  • Online ISBN: 978-3-030-80599-9

  • eBook Packages: Computer Science, Computer Science (R0)
