
Fake News Detection in Dravidian Languages Using Transformer Models

Conference paper, High Performance Computing, Smart Devices and Networks (CHSN 2022)

Abstract

Fake news now spreads rapidly online. Ample resources exist for fake news detection in high-resource languages such as English, but the lack of annotated data and corpora makes detection difficult in low-resource languages, so systems are needed for low-resource language families such as the Dravidian languages. In this research, we work with Telugu, Kannada, Tamil, and Malayalam and evaluate four transformer models: mBERT, XLM-RoBERTa, IndicBERT, and MuRIL. Among these models, MuRIL achieves the best accuracy.
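The paper itself does not reproduce its training code here. As a rough illustration of the evaluation setup the abstract describes, the four models could be loaded for sequence classification with the Hugging Face `transformers` library. The checkpoint identifiers below are the public Hugging Face names for the four models; the `clean` helper and the binary label set are illustrative assumptions, not the authors' actual preprocessing.

```python
import re

# Public Hugging Face checkpoint names for the four models compared in the paper.
MODELS = {
    "mBERT": "bert-base-multilingual-cased",
    "XLM-RoBERTa": "xlm-roberta-base",
    "IndicBERT": "ai4bharat/indic-bert",
    "MuRIL": "google/muril-base-cased",
}

LABELS = ["real", "fake"]  # assumed binary labelling, not specified by the abstract


def clean(text: str) -> str:
    """Illustrative preprocessing: strip URLs and collapse whitespace."""
    text = re.sub(r"https?://\S+", "", text)
    return " ".join(text.split())


def build_classifier(model_key: str):
    """Load a pretrained encoder with a 2-way classification head.

    Requires the `transformers` library; imported lazily so the rest of
    the sketch runs without it installed.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = MODELS[model_key]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=len(LABELS)
    )
    return tokenizer, model
```

Each model would then be fine-tuned on the language-specific training split and compared on held-out accuracy, which is how a ranking such as "MuRIL performs best" would be obtained.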


Notes

  1. https://github.com/NLP-Researcher/Indic-fake-news-datasets.



Corresponding author

Correspondence to Eduri Raja.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Raja, E., Soni, B., Borgohain, S.K. (2024). Fake News Detection in Dravidian Languages Using Transformer Models. In: Malhotra, R., Sumalatha, L., Yassin, S.M.W., Patgiri, R., Muppalaneni, N.B. (eds) High Performance Computing, Smart Devices and Networks. CHSN 2022. Lecture Notes in Electrical Engineering, vol 1087. Springer, Singapore. https://doi.org/10.1007/978-981-99-6690-5_39

Download citation

  • DOI: https://doi.org/10.1007/978-981-99-6690-5_39

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-6689-9

  • Online ISBN: 978-981-99-6690-5

  • eBook Packages: Computer Science, Computer Science (R0)
