Abstract
Fake news spreads rapidly today. Abundant annotated resources exist for fake news detection in high-resource languages such as English, but the scarcity of annotated data and corpora makes detection difficult in low-resource languages. Systems are therefore needed for fake news detection in low-resource languages such as the Dravidian languages. In this work, we study fake news detection in Telugu, Kannada, Tamil, and Malayalam, evaluating four transformer models: mBERT, XLM-RoBERTa, IndicBERT, and MuRIL. Among these models, MuRIL achieves the best accuracy.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Raja, E., Soni, B., Borgohain, S.K. (2024). Fake News Detection in Dravidian Languages Using Transformer Models. In: Malhotra, R., Sumalatha, L., Yassin, S.M.W., Patgiri, R., Muppalaneni, N.B. (eds) High Performance Computing, Smart Devices and Networks. CHSN 2022. Lecture Notes in Electrical Engineering, vol 1087. Springer, Singapore. https://doi.org/10.1007/978-981-99-6690-5_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6689-9
Online ISBN: 978-981-99-6690-5
eBook Packages: Computer Science (R0)