Identifying click baits using various machine learning and deep learning techniques

Ghosh, Sohom

doi:10.1007/s41870-020-00473-1

Original Research
Published: 16 May 2020

Volume 13, pages 1235–1242, (2021)
International Journal of Information Technology

Sohom Ghosh ORCID: orcid.org/0000-0002-4113-0958¹

250 Accesses
4 Citations

Abstract

In today’s world, most readers prefer to read news online because it gives them instant access to what is happening right now. Furthermore, personalized recommendations help keep users engaged. Along with these virtues, online news has some vices as well. One such vice is the presence of alluring social media posts (tweets) relating to news articles whose sole purpose is to draw the attention of users rather than to direct them to read the actual content. Such posts are referred to as click baits. The objective of this paper is to develop a system capable of predicting how likely social media posts (tweets) relating to news articles are to be click baits. GloVe embeddings (Pennington et al. in: Empirical methods in natural language processing (EMNLP), pp 1532–1543, 2014) have been used to represent text data numerically. Various novel features (such as Word Mover’s Distances (Kusner et al. in: Proceedings of the 32nd international conference on machine learning, ICML’15, vol 37, pp 957–966, 2015), subjectivity and polarity of the tweets, among others) have been engineered. Several machine learning-based models such as Logistic Regression, Random Forest, XGBoost and LightGBM have been trained for classification. Moreover, we have also implemented a few deep learning-based models, such as Deep Neural Networks and Long Short-Term Memory networks, for developing this predictive system.
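
The abstract does not say how the distance features were computed, so the following is only an illustrative sketch, not the author's implementation. It computes the relaxed lower bound on the Word Mover's Distance described by Kusner et al., in which every word of one text sends all of its weight to the nearest word of the other text. The two-dimensional vectors in `TOY_VECTORS` are made-up stand-ins for pretrained GloVe vectors, and the names `nbow`, `one_sided_cost` and `relaxed_wmd` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical 2-D vectors standing in for pretrained GloVe embeddings.
TOY_VECTORS = {
    "shocking": (0.9, 0.1),
    "unbelievable": (0.8, 0.2),
    "secret": (0.7, 0.3),
    "budget": (0.1, 0.9),
    "parliament": (0.2, 0.8),
    "vote": (0.15, 0.85),
}


def nbow(tokens):
    """Normalized bag-of-words weights over tokens that have a vector."""
    counts = Counter(t for t in tokens if t in TOY_VECTORS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst):
    """Cost when each source word moves all its weight to its nearest target word."""
    return sum(
        weight * min(math.dist(TOY_VECTORS[w], TOY_VECTORS[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed Word Mover's Distance: the larger of the two one-sided costs.

    Returns None when either text has no in-vocabulary token.
    """
    a, b = nbow(tokens_a), nbow(tokens_b)
    if not a or not b:
        return None
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


post = ["shocking", "secret"]               # tokens of a social media post
article = ["parliament", "vote", "budget"]  # tokens of the linked article title
feature = relaxed_wmd(post, article)        # larger value = post drifts from the article
```

A large distance between a post and the article it links to could then be passed to a classifier as one numeric feature alongside others, such as subjectivity and polarity scores. The exact Word Mover's Distance requires solving a transportation problem and is not shown here.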

References

Chakraborty A, Sarkar R, Mrigen A, Ganguly N (2017) Tabloids in the era of social media? Understanding the production and consumption of clickbaits in twitter. In: PACMHCI, pp 1–21
Chakraborty A, Paranjape B, Kakarla S, Ganguly N (2016) Stop clickbait: detecting and preventing clickbaits in online news media. In: 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 9–16
Rony MMU, Hassan N, Yousuf M (2017) Diving deep into clickbaits: Who use them to what extents in which topics with what effects? In: ASONAM
Deudon M (2018) Learning semantic similarity in a continuous space. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31, Curran Associates, Inc., pp 986–997
Kusner MJ, Sun Y, Kolkin NI, Weinberger KQ (2015) From word embeddings to document distances. In: Proceedings of the 32nd international conference on machine learning, ICML’15, JMLR.org, vol 37, pp 957–966
Anand A, Chakraborty T, Park N (2016) We used neural networks to detect clickbaits: You won’t believe what happened next! CoRR. abs/1612.01340. arXiv:1612.01340v2
Biyani P, Tsioutsiouliklis K, Blackmer J (2016) “8 amazing secrets for getting more clicks”: detecting clickbaits in news streams using article informality. In: Proceedings of the thirtieth AAAI conference on artificial intelligence, AAAI’16, AAAI Press, pp 94–100
Cao X, Le T, Zhang J (2017) Machine learning based detection of clickbait posts in social media. CoRR. abs/1710.01977. arXiv:1710.01977v1
Elyashar A, Bendahan J, Puzis R (2017) Detecting clickbait in online social media: You won’t believe how we did it. CoRR. abs/1710.06699. arXiv:1710.06699v1
Glenski M, Ayton E, Arendt D, Volkova S (2017) Fishing for clickbaits in social images and texts with linguistically-infused neural network models. CoRR. abs/1710.06390. arXiv:1710.06390v1
Grigorev A (2017) Identifying clickbait posts on social media with an ensemble of linear models. CoRR. abs/1710.00399. arXiv:1710.00399v1
Hu B, Lu Z, Li H, Chen Q (2014) Convolutional neural network architectures for matching natural language sentences. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems, vol 27, Curran Associates, Inc., pp 2042–2050
Indurthi V, Oota SR (2017) Clickbait detection using word embeddings. CoRR. abs/1710.02861. arXiv:1710.02861v1
Omidvar A, Jiang H, An A (2018) Using neural network for identifying clickbait in online news media. CoRR. abs/1806.07713. arXiv:1806.07713v1
Papadopoulou O, Zampoglou M, Papadopoulos S, Kompatsiaris I (2017) A two-level classification approach for detecting clickbait posts using text-based features. CoRR. abs/1710.08528. arXiv:1710.08528v1
Thomas P (2017) Clickbait identification using neural networks. arXiv e-prints: arXiv:1710.08721. arXiv:1710.08721v1
Zhou Y (2017) Clickbait detection in tweets using self-attentive network. CoRR. abs/1710.05364. arXiv:1710.05364v1
Potthast M, Gollub T, Komlossy K, Schuster S, Wiegmann M, Fernandez EPG, Hagen M, Stein B (2018) Crowdsourcing a large corpus of clickbait on twitter. In: COLING
Pennington J, Socher R, Manning CD (2014) GloVe: global vectors for word representation. In: Empirical methods in natural language processing (EMNLP), pp 1532–1543
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD’16, ACM, New York, NY, USA, pp 785–794
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu TY (2017) LightGBM: a highly efficient gradient boosting decision tree. In: NIPS
Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems, NIPS’13, vol 2, Curran Associates Inc, USA, pp 3111–3119
Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Brew J (2019) HuggingFace’s transformers: state-of-the-art natural language processing. CoRR. abs/1910.03771 arXiv:1910.03771v3

Acknowledgements

We thank Mr Yashu Kant Gupta, Mr Asif Iquebal Ajazi, Mr Uttam Kumar Pandey and Dr Swati Agarwal for their valuable time, guidance and support.

Author information

Authors and Affiliations

Artificial Intelligence, Personal Investments, Centre of Excellence Fidelity Investments, Bangalore, India
Sohom Ghosh

Corresponding author

Correspondence to Sohom Ghosh.

Additional information

Part of this work was done while the author was associated with Times Internet and BITS, Pilani.

About this article

Cite this article

Ghosh, S. Identifying click baits using various machine learning and deep learning techniques. Int. j. inf. tecnol. 13, 1235–1242 (2021). https://doi.org/10.1007/s41870-020-00473-1

Received: 08 May 2019
Accepted: 09 May 2020
Published: 16 May 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s41870-020-00473-1