I-S\(^2\)FND: a novel interpretable self-ensembled semi-supervised model based on transformers for fake news detection

  • Research
Journal of Intelligent Information Systems

Abstract

One serious consequence of social media use is the spread of fake information, which pushes society towards negativity. Existing solutions focus on supervised fake news detection models, which require extensive labelled data. In this paper, we address two problems in fake news detection: (1) detecting fake news with limited annotated data and (2) interpreting the predictions of the proposed model. We address these issues by designing an Interpretable Self-Ensembled Semi-Supervised Fake News Detection Model (I-S\(^2\)FND). In I-S\(^2\)FND, the model learns enhanced representations of labelled and unlabelled fake news by incorporating an adaptive pseudo-labelling mechanism on unlabelled data. Moreover, gradient-based interpretation of the model on text improves the identification of essential words in the content of fake news. The experimental findings show that the proposed model outperforms existing state-of-the-art models by approximately 5% in accuracy when trained with only a limited amount of labelled data across different datasets.
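As a rough illustration of what confidence-based pseudo-labelling with an adaptive threshold can look like, the sketch below selects unlabelled examples whose top predicted probability clears a threshold that relaxes over training. The abstract does not specify the actual mechanism, so the linear schedule, the function names `adaptive_threshold` and `pseudo_label`, the default values, and the toy probabilities are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence, Tuple


def adaptive_threshold(epoch: int, total_epochs: int,
                       start: float = 0.95, end: float = 0.70) -> float:
    """Linearly relax the confidence threshold from `start` to `end`.

    Hypothetical schedule: the abstract does not state the real rule.
    """
    if total_epochs <= 1:
        return end
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + frac * (end - start)


def pseudo_label(probs: Sequence[Sequence[float]],
                 threshold: float) -> List[Tuple[int, int]]:
    """Return (example_index, predicted_class) pairs whose top
    class probability is at or above `threshold`."""
    selected = []
    for i, p in enumerate(probs):
        best = max(range(len(p)), key=lambda k: p[k])
        if p[best] >= threshold:
            selected.append((i, best))
    return selected


# Toy class probabilities [real, fake] for four unlabelled articles
# (made-up numbers, not results from the paper).
unlabelled_probs = [[0.97, 0.03], [0.40, 0.60], [0.15, 0.85], [0.52, 0.48]]

# Early in training the threshold is strict; later it is relaxed.
early = pseudo_label(unlabelled_probs, adaptive_threshold(0, 10))
late = pseudo_label(unlabelled_probs, adaptive_threshold(9, 10))
print(early, late)
```

In this toy run only the most confident article receives a pseudo-label at the first epoch, while a second article is admitted once the threshold has relaxed. Other adaptive rules, such as per-class thresholds, would fit the same interface.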

Availability of supporting data

  • https://www.kaggle.com/jruvika/fake-news-detection

  • https://www.kaggle.com/c/fake-news/data

  • https://github.com/KaiDMML/FakeNewsNet

Funding

Not Applicable

Author information

Contributions

U. Shivani Sri Varshini implemented the proposed model. R. Praneetha Sree, M. Srinivas and R.B.V. Subramanyam wrote the manuscript. R. Praneetha Sree, M. Srinivas and R.B.V. Subramanyam contributed equally to the design and implementation of the proposed model. All authors contributed equally.

Corresponding author

Correspondence to Shivani Sri Varshini U.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Not Applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

U, S.S.V., R, P.S., M, S. et al. I-S\(^2\)FND: a novel interpretable self-ensembled semi-supervised model based on transformers for fake news detection. J Intell Inf Syst (2023). https://doi.org/10.1007/s10844-023-00821-0

  • DOI: https://doi.org/10.1007/s10844-023-00821-0
