Credible or Incredible? Dissecting Urban Legends

  • Marco Guerini
  • Carlo Strapparava
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8404)


Urban legends are a genre of modern folklore consisting of stories about rare and exceptional events that are just plausible enough to be believed. In our view, while urban legends represent a form of “sticky” deceptive text, they are marked by a tension between the credible and the incredible: they should be credible like a news article and incredible like a fairy tale. In particular, we focus on the idea that urban legends should mimic the details of news (who, where, when) to be credible, while being emotional and readable like a fairy tale to be catchy and memorable. Using NLP tools, we provide a quantitative analysis of these prototypical characteristics. We also describe machine learning experiments showing that it is possible to recognize an urban legend using just these simple features.







Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Marco Guerini (1)
  • Carlo Strapparava (2)
  1. Trento-RISE, Trento, Italy
  2. FBK-Irst, Trento, Italy
