
Language agnostic meme-filtering for hashtag-based social network analysis

  • Original Article
Social Network Analysis and Mining


Users in social networks utilize hashtags for a variety of reasons. In many cases, hashtags serve retrieval purposes by labeling the content they accompany. More often than not, hashtags are used to promote content, ideas, or conversations producing viral memes. This paper addresses a specific case of hashtag classification: meme-filtering. We argue that hashtags that are correlated with memes may hinder many valuable social media algorithms like trend detection and event identification. We propose and evaluate a set of language-agnostic features that aid the separation of these two classes: meme-hashtags and event-hashtags. The proposed approach is evaluated on two large datasets of Twitter messages written in English and German. A proof-of-concept application of the meme-filtering approach to the problem of event detection is presented.



  1. Obviously, there are exceptions to this rule. Nowadays, information about earthquakes appears on social media first. However, such information is still related to a real-world event.




The authors would like to thank the data annotators. This work has been co-financed by the EU and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) - Research Funding Programs: Heraclitus II fellowship, THALIS - GeomComp, THALIS - DISFER, and ARISTEIA - MMD, and by the EU-funded project INSIGHT.

Author information


Corresponding author

Correspondence to Dimitrios Kotsakos.


About this article


Cite this article

Kotsakos, D., Sakkos, P., Katakis, I. et al. Language agnostic meme-filtering for hashtag-based social network analysis. Soc. Netw. Anal. Min. 5, 28 (2015).
