Abstract
Social networks are used every day to report events, but much of the information published on them turns out to be fake news. Detecting fake news has become a research topic that can be approached with deep learning. However, most current research on the topic covers only the English language. For fake news detection in other languages, such as Spanish, one of the main barriers is the scarcity of labeled datasets. Hence, we explore whether it is worthwhile to translate an English dataset into Spanish using Statistical Machine Translation. We use the translated dataset to evaluate the accuracy of several deep learning architectures, and we compare fake news classification results on the translated dataset with those on the original dataset. Our results suggest that the approach is feasible, although it requires high-quality translation techniques, such as neural machine translation models.
Keywords
- Fake news
- English-to-Spanish translation
- Statistical Machine Translation
- Deep learning
Acknowledgements
Mr. Mendoza acknowledges funding from the Millennium Institute for Foundational Research on Data. Mr. Mendoza was also funded by ANID PIA/APOYO AFB180002 and ANID FONDECYT 1200211.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Ruíz, S., Providel, E., Mendoza, M. (2021). Fake News Detection via English-to-Spanish Translation: Is It Really Useful? In: Meiselwitz, G. (eds) Social Computing and Social Media: Experience Design and Social Network Analysis. HCII 2021. Lecture Notes in Computer Science, vol 12774. Springer, Cham. https://doi.org/10.1007/978-3-030-77626-8_9
DOI: https://doi.org/10.1007/978-3-030-77626-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77625-1
Online ISBN: 978-3-030-77626-8
eBook Packages: Computer Science (R0)