Clickbait Detection

  • Martin Potthast
  • Sebastian Köpsel
  • Benno Stein
  • Matthias Hagen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9626)


This paper proposes a new model for the detection of clickbait, i.e., short messages that lure readers to click a link. Clickbait is primarily used by online content publishers to increase their readership, whereas its automatic detection will give readers a way of filtering their news stream. We contribute by compiling the first clickbait corpus of 2992 tweets, 767 of which are clickbait, and by developing a clickbait model based on 215 features that enables a random forest classifier to achieve 0.79 ROC-AUC at 0.76 precision and 0.76 recall.
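The abstract reports three evaluation measures: ROC-AUC, precision, and recall. The following is a minimal sketch of how such measures can be computed from a classifier's output, using only the Python standard library. The labels, scores, and the 0.5 decision threshold are made-up toy values for illustration, not data from the paper, and the function names are hypothetical. How the paper averages precision and recall across classes is not stated in the abstract, so the sketch computes them for the positive (clickbait) class only.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (label 1 = clickbait)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Ties count as half a win. Quadratic in the number of examples,
    which is fine for a small illustration."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = clickbait, 0 = not clickbait.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1, 0.3, 0.5]
# Assumed decision threshold of 0.5 on the classifier score.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

print(precision_recall(toy_labels, toy_predictions))
print(roc_auc(toy_labels, toy_scores))
```

Precision and recall depend on the chosen threshold, while ROC-AUC is computed from the score ranking alone. That is why a single classifier is described by one AUC value together with precision and recall at a specific operating point.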


Clickbait Random Forest Content Publishing Twitter Tweets News Streams 



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Martin Potthast (1)
  • Sebastian Köpsel (1)
  • Benno Stein (1)
  • Matthias Hagen (1)
  1. Bauhaus-Universität Weimar, Weimar, Germany
