Abstract
Check-worthiness detection is the task of identifying claims worthy of investigation by fact-checkers. Resource scarcity for non-world languages and the cost of model training remain major challenges for building models that support multilingual check-worthiness detection.
This paper proposes cross-training adapters on a subset of world languages, combined via adapter fusion, to detect claims emerging globally in multiple languages. (1) Given the large pool of annotators available for world languages and the storage efficiency of adapter models, this approach is more cost-efficient; models can be updated more frequently and thus stay up to date. (2) Adapter fusion provides insights into, and allows interpretation of, the influence of each adapter model on a particular language.
The proposed solution often outperformed the top multilingual approaches in our benchmark tasks.
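For intuition, adapter fusion can be viewed as an attention mechanism over the outputs of per-language bottleneck adapters, keyed on the transformer's hidden state; the softmax weights indicate how much each language adapter contributes, which is the interpretability signal mentioned above. The toy NumPy sketch below illustrates the idea only; all names, dimensions, and initializations are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 3  # hidden size; number of language adapters (e.g. en, ar, es)

# Hypothetical bottleneck adapter (down-project, ReLU, up-project, residual),
# one per world language.
def make_adapter(d, bottleneck=4):
    W_down = rng.normal(scale=0.1, size=(d, bottleneck))
    W_up = rng.normal(scale=0.1, size=(bottleneck, d))
    return lambda h: h + np.maximum(h @ W_down, 0.0) @ W_up

adapters = [make_adapter(d) for _ in range(k)]

# Fusion: attention over adapter outputs, queried by the hidden state h.
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))

def fuse(h):
    outs = np.stack([a(h) for a in adapters])  # (k, d) adapter outputs
    scores = (outs @ W_k) @ (h @ W_q)          # (k,) attention logits
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over adapters
    return weights, weights @ (outs @ W_v)     # per-adapter weights, fused state

h = rng.normal(size=d)
weights, fused = fuse(h)
```

Inspecting `weights` for inputs in a given language is the kind of per-adapter attribution the abstract alludes to.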
This paper is accepted at ECIR 2023.
Notes
- 1.
- 2. We share our source code at https://bit.ly/3rH6yXu.
- 3. We use the base version of the model, which consists of 12 layers.
- 4. 2e-5 gives better results on the development set of CT22.
- 5. BigIR is the state-of-the-art approach, but there is no associated paper/code describing the system.
References
Alam, F., et al.: Fighting the COVID-19 infodemic: modeling the perspective of journalists, fact-checkers, social media platforms, policy makers, and the society. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 611–649 (2021)
Barrón-Cedeño, A., et al.: Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media. In: Arampatzisa, A., et al. (eds.) CLEF 2020. LNCS, vol. 12260, pp. 215–236. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58219-7_17
Conneau, A., et al.: Unsupervised cross-lingual representation learning at scale. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 8440–8451 (2020)
Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein, J., Doran, C., Solorio, T. (eds.) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, (Long and Short Papers), vol. 1, pp. 4171–4186. Association for Computational Linguistics (2019)
Du, S.M., Gollapalli, S.D., Ng, S.K.: NUS-IDS at CheckThat! 2022: identifying check-worthiness of tweets using CheckthaT5. In: Working Notes of CLEF (2022)
Fuhr, N.: An information nutritional label for online documents. SIGIR Forum 51(3), 46–66 (2017)
Gencheva, P., Nakov, P., Màrquez, L., Barrón-Cedeño, A., Koychev, I.: A context-aware approach for detecting worth-checking claims in political debates. In: Mitkov, R., Angelova, G. (eds.) Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, Varna, Bulgaria, September 2–8, 2017, pp. 267–276. INCOMA Ltd. (2017)
Ginsborg, L., Gori, P.: Report on a survey for fact checkers on COVID-19 vaccines and disinformation. Tech. rep., European Digital Media Observatory (EDMO) (2021). https://cadmus.eui.eu/handle/1814/70917
Graves, L.: Understanding the promise and limits of automated fact-checking (2018)
Grootendorst, M.: BERTopic: neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794 (2022)
Hassan, N., Adair, B., Hamilton, J.T., Li, C., Tremayne, M., Yang, J., Yu, C.: The quest to automate fact-checking. In: Proceedings of the 2015 Computation Journalism Symposium (2015)
Hassan, N., Arslan, F., Li, C., Tremayne, M.: Toward automated fact-checking: Detecting check-worthy factual claims by claimbuster. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13–17, 2017, pp. 1803–1812. ACM (2017)
Hassan, N., Li, C., Tremayne, M.: Detecting check-worthy factual claims in presidential debates. In: Bailey, J., et al., (eds.) Proceedings of the 24th ACM International Conference on Information and Knowledge Management, CIKM 2015, Melbourne, VIC, Australia, October 19–23, 2015, pp. 1835–1838. ACM (2015)
Houlsby, N., et al.: Parameter-efficient transfer learning for NLP. In: ICML. In: Proceedings of Machine Learning Research, vol. 97, pp. 2790–2799. PMLR (2019)
Jaradat, I., Gencheva, P., Barrón-Cedeño, A., Màrquez, L., Nakov, P.: ClaimRank: detecting check-worthy claims in Arabic and English. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pp. 26–30. Association for Computational Linguistics, New Orleans, Louisiana (2018)
Kartal, Y.S., Kutlu, M.: Re-think before you share: a comprehensive study on prioritizing check-worthy claims. IEEE Trans. Comput. Soc. Syst. (2022)
Lespagnol, C., Mothe, J., Ullah, M.Z.: Information nutritional label and word embedding to estimate information check-worthiness. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR’19, Assoc. Comput. Mach., New York, NY, USA, pp. 941–944. (2019)
Nakov, P., et al.: The CLEF-2022 CheckThat! lab on fighting the COVID-19 infodemic and fake news detection. In: European Conference on Information Retrieval (2022)
Nakov, P., et al.: Automated fact-checking for assisting human fact-checkers. In: IJCAI, pp. 4551–4558 (2021). https://ijcai.org
Nakov, P., et al.: The CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In: Hiemstra, D., et al. (eds.) ECIR 2021. LNCS, vol. 12657, pp. 639–649. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72240-1_75
Nakov, P., et al.: Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In: Candan, K., et al. (eds.) CLEF 2021. LNCS, vol. 12880, pp. 264–291. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-85251-1_19
Pfeiffer, J., Kamath, A., Rücklé, A., Cho, K., Gurevych, I.: AdapterFusion: non-destructive task composition for transfer learning. In: Merlo, P., Tiedemann, J., Tsarfaty, R. (eds.) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021, Online, April 19–23, 2021, pp. 487–503. Association for Computational Linguistics (2021)
Pfeiffer, J., et al.: AdapterHub: a framework for adapting transformers. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 46–54. Association for Computational Linguistics, Online (2020)
Pfeiffer, J., Vulić, I., Gurevych, I., Ruder, S.: MAD-X: an Adapter-Based Framework for Multi-Task Cross-Lingual Transfer. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 7654–7673. Association for Computational Linguistics, (2020)
Pfeiffer, J., Vulić, I., Gurevych, I., Ruder, S.: UNKs everywhere: adapting multilingual language models to new scripts. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 10186–10203 (2021)
Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3982–3992 (2019)
Rücklé, A., et al.: AdapterDrop: On the efficiency of adapters in transformers. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 7930–7946. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic (2021)
Schlicht, I.B., de Paula, A.F.M., Rosso, P.: UPV at CheckThat! 2021: mitigating cultural differences for identifying multilingual check-worthy claims. In: CLEF (Working Notes). CEUR Workshop Proceedings, vol. 2936, pp. 465–475. CEUR-WS.org (2021)
Shaar, S., et al.: Overview of the CLEF-2021 CheckThat! lab task 1 on check-worthiness estimation in tweets and political debates. In: CLEF (Working Notes). CEUR Workshop Proceedings, vol. 2936, pp. 369–392. CEUR-WS.org (2021)
Singh, K., et al.: Misinformation, believability, and vaccine acceptance over 40 countries: takeaways from the initial phase of the COVID-19 infodemic. PLoS ONE 17(2), e0263381 (2022)
Stickland, A.C., Murray, I.: BERT and pals: Projected attention layers for efficient adaptation in multi-task learning. In: ICML. Proc. Mach. Learn. Res. 97, pp. 5986–5995. PMLR (2019)
Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: ICML. Proc. Mach. Learn. Res. vol. 70, pp. 3319–3328. PMLR (2017)
Üstün, A., Bisazza, A., Bouma, G., van Noord, G.: UDapter: language adaptation for truly universal dependency parsing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2302–2315. Association for Computational Linguistics (2020)
Uyangodage, L., Ranasinghe, T., Hettiarachchi, H.: Can multilingual transformers fight the COVID-19 infodemic? In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pp. 1432–1437. INCOMA Ltd., Held Online (2021)
Vasileva, S., Atanasova, P., Màrquez, L., Barrón-Cedeño, A., Nakov, P.: It takes nine to smell a rat: Neural multi-task learning for check-worthiness prediction. In: Mitkov, R., Angelova, G. (eds.) Proceedings of the International Conference on Recent Advances in Natural Language Processing, RANLP 2019, Varna, Bulgaria, September 2–4, 2019, pp. 1229–1239. INCOMA Ltd. (2019)
Wolf, T., et al.: Transformers: State-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics, Online (2020)
Xue, L., et al.: mT5: a massively multilingual pre-trained text-to-text transformer. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 483–498 (2021)
Acknowledgements
We would like to thank the anonymous reviewers, Joan Plepi, Flora Sakketou, Akbar Karimi and Nico Para for their constructive feedback. The work of Ipek Schlicht was part of the KID2 project (led by DW Innovation and co-funded by BKM). The work of Lucie Flek was part of the BMBF projects DeFaktS and DynSoDA. The work of Paolo Rosso was carried out in the framework of IBERIFIER (INEA/CEF/ICT/A202072381931 n.2020-EU-IA-0252), XAI Disinfodemics (PLEC2021-007681) and MARTINI (PCI2022-134990–2).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schlicht, I.B., Flek, L., Rosso, P. (2023). Multilingual Detection of Check-Worthy Claims Using World Languages and Adapter Fusion. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_8
DOI: https://doi.org/10.1007/978-3-031-28244-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28243-0
Online ISBN: 978-3-031-28244-7
eBook Packages: Computer Science, Computer Science (R0)