Hoax news-inspector: a real-time prediction of fake news using content resemblance over web search results for authenticating the credibility of news articles

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

Social media is now one of the most important media for sharing individual thoughts and opinions because it is easy to access. That same ease gives malicious users an opportunity to post deliberately fabricated content to influence people, create controversies, and play on public emotions. The spread of contaminated information over the web, such as rumours, hoaxes, and accidental misinformation, is becoming an emergency that can seriously harm society and individuals. In this paper, we develop an automated system, “Hoax News-Inspector”, for detecting fake news that propagates through the web and social media in the form of text. To distinguish fake from real reports at an early stage, we identify prominent features by exploring two sets of attributes that drive information spread, article/post-content-based features and sentiment-based features, together with a mixture of both, called hybrid features. The proposed algorithm is trained and tested on a self-generated dataset as well as on the popular existing Liar dataset. It gives its best results with the Random Forest classifier, reaching an accuracy of 95% when all sets of features are considered. Detecting and verifying news has many practical applications for business markets, news consumers, and time-sensitive services, and generally helps to minimize the spread of false information. Hoax News-Inspector can automatically collect fabricated news data and classify it into the binary classes Fake or Real, which benefits further research on predicting and understanding fake news.
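
The idea of hybrid features described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not list the individual features, so the content features (word count, average word length, exclamation marks, upper-case ratio), the tiny sentiment lexicons, and all function names here are hypothetical. The classification step is left out; in practice the resulting vectors would be passed to a classifier such as a Random Forest.

```python
import re
from collections import Counter

# Hypothetical toy lexicons; the paper's sentiment resources are not given in the abstract.
POSITIVE = {"good", "great", "true", "safe", "win"}
NEGATIVE = {"bad", "fake", "shocking", "dead", "fraud"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def content_features(text):
    """Illustrative article/post-content features (assumed, not the paper's list)."""
    tokens = tokenize(text)
    n = len(tokens)
    return {
        "word_count": n,
        "avg_word_len": sum(len(t) for t in tokens) / n if n else 0.0,
        "exclamations": text.count("!"),
        "upper_ratio": sum(c.isupper() for c in text) / len(text) if text else 0.0,
    }


def sentiment_features(text):
    """Illustrative lexicon-based sentiment features (assumed)."""
    counts = Counter(tokenize(text))
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = pos + neg
    return {
        "pos_words": pos,
        "neg_words": neg,
        "polarity": (pos - neg) / total if total else 0.0,
    }


def hybrid_features(text):
    """Merge content and sentiment features into one feature dictionary."""
    return {**content_features(text), **sentiment_features(text)}


sample = "SHOCKING! Celebrity found dead, officials say report is fake"
features = hybrid_features(sample)
print(features)
```

Keeping the two feature groups in separate functions makes it straightforward to compare content-only, sentiment-only, and hybrid configurations, which is the kind of comparison the abstract reports.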

[Figs. 1–7 not reproduced. Fig. 5: Source of useful URLs]

Notes

  1. https://thelogicalindian.com/fact-check/delhi-police-jamia-students-protest-citizenship-amendment-act/

  2. https://www.youtube.com/watch?v=2v1juUg5ZM4

  3. https://www.cs.ucsb.edu/~william/data/liar_dataset.zip

  4. https://pypi.org/project/rake-nltk/

  5. https://www.snopes.com/fact-check/celebrity-death-hoaxes/

  6. https://ezinearticles.com/?Article-Writing---How-Many-Paragraphs-Should-My-Article-Be?&id=4019539

  7. https://en.wikipedia.org/wiki/Wikipedia:Database_reports/Recent_deaths

Author information

Corresponding author

Correspondence to Dinesh Kumar Vishwakarma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Varshney, D., Vishwakarma, D.K. Hoax news-inspector: a real-time prediction of fake news using content resemblance over web search results for authenticating the credibility of news articles. J Ambient Intell Human Comput 12, 8961–8974 (2021). https://doi.org/10.1007/s12652-020-02698-1

  • DOI: https://doi.org/10.1007/s12652-020-02698-1