
Fake News Detection in Dravidian Languages Using Transformer Models

Conference paper, High Performance Computing, Smart Devices and Networks (CHSN 2022)

Abstract

Fake news now spreads rapidly online. Ample resources exist for fake news detection in high-resource languages such as English, but the lack of annotated data and corpora makes detection difficult in low-resource languages, so systems are needed for low-resource language families such as the Dravidian languages. In this research, we work with Telugu, Kannada, Tamil, and Malayalam and evaluate four transformer models: mBERT, XLM-RoBERTa, IndicBERT, and MuRIL. Among these models, MuRIL achieves the best accuracy.
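The paper itself does not reproduce its training code here. As a rough illustration of the evaluation setup the abstract describes, the four models could be loaded for sequence classification with the Hugging Face `transformers` library. The checkpoint identifiers below are the public Hugging Face names for the four models; the `clean` helper and the binary label set are illustrative assumptions, not the authors' actual preprocessing.

```python
import re

# Public Hugging Face checkpoint names for the four models compared in the paper.
MODELS = {
    "mBERT": "bert-base-multilingual-cased",
    "XLM-RoBERTa": "xlm-roberta-base",
    "IndicBERT": "ai4bharat/indic-bert",
    "MuRIL": "google/muril-base-cased",
}

LABELS = ["real", "fake"]  # assumed binary labelling, not specified by the abstract


def clean(text: str) -> str:
    """Illustrative preprocessing: strip URLs and collapse whitespace."""
    text = re.sub(r"https?://\S+", "", text)
    return " ".join(text.split())


def build_classifier(model_key: str):
    """Load a pretrained encoder with a 2-way classification head.

    Requires the `transformers` library; imported lazily so the rest of
    the sketch runs without it installed.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = MODELS[model_key]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=len(LABELS)
    )
    return tokenizer, model
```

Each model would then be fine-tuned on the language-specific training split and compared on held-out accuracy, which is how a ranking such as "MuRIL performs best" would be obtained.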


Notes

  1. https://github.com/NLP-Researcher/Indic-fake-news-datasets.



Corresponding author

Correspondence to Eduri Raja.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Raja, E., Soni, B., Borgohain, S.K. (2024). Fake News Detection in Dravidian Languages Using Transformer Models. In: Malhotra, R., Sumalatha, L., Yassin, S.M.W., Patgiri, R., Muppalaneni, N.B. (eds) High Performance Computing, Smart Devices and Networks. CHSN 2022. Lecture Notes in Electrical Engineering, vol 1087. Springer, Singapore. https://doi.org/10.1007/978-981-99-6690-5_39

Download citation

  • DOI: https://doi.org/10.1007/978-981-99-6690-5_39

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-6689-9

  • Online ISBN: 978-981-99-6690-5

  • eBook Packages: Computer Science, Computer Science (R0)
