Computational Framework for Generating Visual Summaries of Topical Clusters in Twitter Streams

  • Miray Kas
  • Bongwon Suh
Part of the Studies in Computational Intelligence book series (SCI, volume 526)


As a huge number of tweets becomes available online, extracting useful information from them for various purposes is both an opportunity and a challenge. This chapter proposes a novel way to extract topical structure from a large set of tweets and to generate a usable summary along with related topical keywords. Our system covers the full span of topical analytics for tweets: collecting the tweets, processing and preparing them for text analysis, forming clusters of relevant words, and generating visual summaries of the most relevant keywords along with their topical context. We evaluate the system in a user study; the results suggest that users detect relevant information and infer relationships between keywords better with our summarization method than with the commonly used word cloud visualizations.
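
The abstract does not specify the clustering algorithm, so the following is only a minimal sketch of the "processing" and "forming clusters of relevant words" stages, not the chapter's actual method. It assumes a simple regex tokenizer, a small hypothetical stopword list, Sørensen-Dice similarity over the sets of tweets in which two words co-occur, and single-link grouping at a fixed threshold. The names `tokenize`, `cluster_keywords`, `STOPWORDS`, and `threshold` are illustrative.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical stopword list for illustration; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "for", "on", "at", "rt"}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, return its set of non-stopword terms."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return {w for w in re.findall(r"[a-z][a-z0-9']+", text) if w not in STOPWORDS}


def cluster_keywords(tweets, threshold=0.5):
    """Group words whose tweet co-occurrence (Dice similarity) reaches the threshold.

    Words are linked single-link style: two words land in the same cluster
    if a chain of sufficiently similar word pairs connects them.
    """
    # Map each word to the indices of the tweets that contain it.
    postings = defaultdict(set)
    for i, tweet in enumerate(tweets):
        for word in tokenize(tweet):
            postings[word].add(i)

    # Union-find structure that tracks which cluster each word belongs to.
    parent = {w: w for w in postings}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    # Merge every pair of words whose Dice similarity meets the threshold.
    for a, b in combinations(sorted(postings), 2):
        shared = len(postings[a] & postings[b])
        dice = 2 * shared / (len(postings[a]) + len(postings[b]))
        if dice >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for w in postings:
        clusters[find(w)].append(w)
    return sorted(sorted(words) for words in clusters.values())


if __name__ == "__main__":
    sample = [
        "Earthquake hits Tokyo",
        "Tokyo earthquake damage",
        "New phone launch today",
        "phone launch event",
    ]
    for cluster in cluster_keywords(sample):
        print(cluster)
```

The pairwise comparison is quadratic in vocabulary size, which is acceptable for a sketch but would need pruning (for example, keeping only the most frequent terms) on a real tweet stream.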


Automated summarization, Clustering, Data mining, Twitter, Social networks, Keyword extraction, Topic modeling



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, USA
  2. Advanced Technology Labs, Adobe Systems Inc., San Jose, USA
