Exploring Summarization to Enhance Headline Stance Detection

Sepúlveda-Torres, Robiert; Vicente, Marta; Saquete, Estela; Lloret, Elena; Palomar, Manuel

doi:10.1007/978-3-030-80599-9_22

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12801)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

The spread of fake news and misinformation is causing serious problems for society, partly because more and more people read only the headlines or highlights of news items and assume that everything is reliable, instead of carefully analysing whether the content may be distorted or false. Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item. Unfortunately, this is not always the case, since various interests, such as increasing the number of clicks, as well as political interests, can be behind the generation of a headline that does not meet its intended original purpose. This paper analyses the use of automatic news summaries to determine the stance (i.e., position) of a headline with respect to the body of text associated with it. To this end, we propose a two-stage approach that uses summarization techniques to produce the input for both classifiers instead of the full text of the news body, thus reducing the amount of information that must be processed while maintaining the important information. The experiments were carried out using the Fake News Challenge FNC-1 dataset, leading to a 94.13% accuracy, surpassing the state of the art. It is especially remarkable that the proposed approach, which uses only the relevant information provided by the automatic summaries instead of the full text, is able to classify the different stance categories with very competitive results. It can therefore be concluded that the use of automatic extractive summaries has a positive impact on determining the stance of very short information (i.e., headline, sentence) with respect to its whole content.
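The two-stage idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frequency-based summarizer is a simple stand-in for the summarization techniques the paper evaluates, the split into a relatedness stage followed by an agree/disagree/discuss stage is assumed from the FNC-1 label set, and all function names are hypothetical.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text: str) -> List[str]:
    # Lowercase word tokens; punctuation and digits are ignored.
    return re.findall(r"[a-z']+", text.lower())


def summarize(body: str, max_sentences: int = 3) -> str:
    """Frequency-based extractive summary (illustrative stand-in only)."""
    sentences = split_sentences(body)
    freq = {}
    for token in tokenize(body):
        freq[token] = freq.get(token, 0) + 1

    def score(sentence: str) -> float:
        tokens = tokenize(sentence)
        # Average corpus frequency of the sentence's tokens.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


def detect_stance(
    headline: str,
    body: str,
    is_related: Callable[[str, str], bool],        # stage 1 (assumed)
    classify_related: Callable[[str, str], str],   # stage 2 (assumed)
    max_sentences: int = 3,
) -> str:
    """Both stages receive the summary, never the full news body."""
    summary = summarize(body, max_sentences)
    if not is_related(headline, summary):
        return "unrelated"
    # Expected to return one of: "agree", "disagree", "discuss".
    return classify_related(headline, summary)
```

Any pair of trained models can be plugged in as `is_related` and `classify_related`; the point of the sketch is only that the summary replaces the full body as input to both stages.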

Notes

1. http://www.fakenewschallenge.org/ (accessed online 18 March, 2021).
2. Implementation available at https://github.com/rsepulveda911112/Headline-Stance-Detection.
3. https://pypi.org/project/sumy/.
4. This metric assigns higher weight to examples correctly classified, as long as they belonged to a different class from the unrelated one.
5. This is computed as the mean of those per-class F scores.
6. https://github.com/hanselowski/athene_system/ (accessed online 15 March, 2021).
7. https://github.com/Cisco-Talos/fnc-1 (accessed online 15 March, 2021).
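
Notes 4 and 5 describe two evaluation metrics: a weighted score that rewards correct classification of related examples more heavily, and a macro-averaged F score. The sketch below shows one way to compute them. The 0.25/0.75 weighting is an assumption taken from the publicly documented FNC-1 scoring scheme; the notes themselves do not state the weights, and the function names are hypothetical.

```python
from typing import Dict, Sequence

LABELS = ["agree", "disagree", "discuss", "unrelated"]
RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Weighted score: related examples count more than unrelated ones."""
    score = 0.0
    for g, p in zip(gold, pred):
        if g == p:
            score += 0.25          # exact match on any class
            if g != "unrelated":
                score += 0.50      # bonus for an exact related stance
        if g in RELATED and p in RELATED:
            score += 0.25          # credit for getting relatedness right
    return score


def relative_fnc_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Score normalised by the best achievable score on this gold set."""
    best = fnc_score(gold, gold)
    return fnc_score(gold, pred) / best if best else 0.0


def per_class_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """F1 for each label; a label with no support and no predictions gets 0."""
    scores = {}
    for label in LABELS:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores (note 5)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)
```

Because the macro average weights every class equally, it is less dominated by the frequent unrelated class than plain accuracy would be.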

Acknowledgements

This research work has been partially funded by Generalitat Valenciana through project “SIIA: Tecnologias del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” (PROMETEU/2018/089), and by the Spanish Government through project “Modelang: Modeling the behavior of digital entities by Human Language Technologies” (RTI2018-094653-B-C22) and project “INTEGER - Intelligent Text Generation” (RTI2018-094649-B-I00). This paper is also based upon work from COST Action CA18231 “Multi3Generation: Multi-task, Multilingual, Multi-modal Language Generation”.

Author information

Authors and Affiliations

Department of Software and Computing Systems, University of Alicante, Carretera de San Vicente s/n 03690, Alicante, Spain
Robiert Sepúlveda-Torres, Marta Vicente, Estela Saquete, Elena Lloret & Manuel Palomar

Corresponding author

Correspondence to Robiert Sepúlveda-Torres.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Derby, Derby, UK
Farid Meziane
German Research Center for Artificial Intelligence, Saarbrücken, Germany
Helmut Horacek
University of Hertfordshire, Hatfield, UK
Epaminondas Kapetanios

Cite this paper

Sepúlveda-Torres, R., Vicente, M., Saquete, E., Lloret, E., Palomar, M. (2021). Exploring Summarization to Enhance Headline Stance Detection. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_22

DOI: https://doi.org/10.1007/978-3-030-80599-9_22
Published: 20 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80598-2
Online ISBN: 978-3-030-80599-9
eBook Packages: Computer Science, Computer Science (R0)
