Multilingual Detection of Check-Worthy Claims Using World Languages and Adapter Fusion

  • Conference paper
Advances in Information Retrieval (ECIR 2023)

Abstract

Check-worthiness detection is the task of identifying claims worthy of investigation by fact-checkers. Resource scarcity for non-world languages and model training costs remain major challenges for building models that support multilingual check-worthiness detection.

This paper proposes cross-training adapters on a subset of world languages, combined by adapter fusion, to detect claims emerging globally in multiple languages. (1) Because many annotators are available for world languages and adapter models are storage-efficient, this approach is more cost-efficient. Models can be updated more frequently and thus stay up to date. (2) Adapter fusion provides insights and allows for interpretation regarding the influence of each adapter model on a particular language.
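To make the fusion idea concrete, the following is a minimal sketch of attention-style fusion over per-language adapter outputs for a single token. It is not the authors' implementation: the function names, the dimension, the language codes, and the random weights are all assumptions chosen for illustration. In a trained model the projection matrices would be learned and the adapter outputs would come from adapters trained on individual world languages. The returned weights are the kind of per-adapter signal that point (2) refers to.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_adapters(hidden, adapter_outputs, w_query, w_key, w_value):
    """Fuse adapter outputs with attention (illustrative sketch).

    The query comes from the layer's hidden state; keys and values come
    from each adapter's output. Returns the fused vector and the
    attention weight assigned to each adapter.
    """
    query = matvec(w_query, hidden)
    names = list(adapter_outputs)
    scores = []
    for name in names:
        key = matvec(w_key, adapter_outputs[name])
        scores.append(sum(q * k for q, k in zip(query, key)))
    weights = softmax(scores)

    fused = [0.0] * len(hidden)
    for name, weight in zip(names, weights):
        value = matvec(w_value, adapter_outputs[name])
        fused = [f + weight * v for f, v in zip(fused, value)]
    return fused, dict(zip(names, weights))


def random_matrix(rng, size):
    """Square matrix of random values standing in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]


rng = random.Random(0)
dim = 4  # toy hidden size, an assumption for readability
hidden = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
# Hypothetical language adapters; the language codes are placeholders.
adapter_outputs = {
    lang: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for lang in ("en", "es", "ar")
}
w_query, w_key, w_value = (random_matrix(rng, dim) for _ in range(3))

fused, weights = fuse_adapters(hidden, adapter_outputs, w_query, w_key, w_value)
for lang, weight in weights.items():
    print(f"{lang}: {weight:.3f}")
```

Because the weights sum to one across adapters, averaging them over the inputs of a given target language gives a rough view of which source-language adapter the fusion layer relies on for that language.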

The proposed solution often outperformed the top multilingual approaches in our benchmark tasks.

This paper has been accepted at ECIR 2023.

Notes

  1. https://bit.ly/3eMIZ9q.

  2. We share our source code at https://bit.ly/3rH6yXu.

  3. We use the base version of the model, which consists of 12 layers.

  4. 2e-5 gives better results on the development set of CT22.

  5. BigIR is the state-of-the-art approach, but there is no associated paper or code describing the system.

Acknowledgements

We would like to thank the anonymous reviewers, Joan Plepi, Flora Sakketou, Akbar Karimi and Nico Para for their constructive feedback. The work of Ipek Schlicht was part of the KID2 project (led by DW Innovation and co-funded by BKM). The work of Lucie Flek was part of the BMBF projects DeFaktS and DynSoDA. The work of Paolo Rosso was carried out in the framework of IBERIFIER (INEA/CEF/ICT/A202072381931 n.2020-EU-IA-0252), XAI Disinfodemics (PLEC2021-007681) and MARTINI (PCI2022-134990–2).

Author information

Corresponding author

Correspondence to Ipek Baris Schlicht.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schlicht, I.B., Flek, L., Rosso, P. (2023). Multilingual Detection of Check-Worthy Claims Using World Languages and Adapter Fusion. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_8

  • DOI: https://doi.org/10.1007/978-3-031-28244-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28243-0

  • Online ISBN: 978-3-031-28244-7

  • eBook Packages: Computer Science, Computer Science (R0)
