TweeProfiles: Detection of Spatio-temporal Patterns on Twitter

  • Tiago Cunha
  • Carlos Soares
  • Eduarda Mendes Rodrigues
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8933)

Abstract

Online social networks are valuable sources of information about their users and those users' behaviours and interests. Many data mining researchers have analysed this kind of data in search of interesting patterns. This paper addresses the problem of identifying and displaying tweet profiles by analysing multiple types of data: spatial, temporal, social and content. The data mining process that extracts the patterns consists of manipulating the dissimilarity matrices for each type of data and feeding them to a clustering algorithm to obtain the desired patterns. The paper studies appropriate distance functions for the different types of data, the normalization and combination methods available for the different dimensions, and the existing clustering algorithms. The visualization platform is designed for dynamic and intuitive use, and aims to reveal the extracted profiles in an understandable and interactive manner. To accomplish this, various visualization patterns were studied and widgets were chosen to best represent the information. The use of the project is illustrated with data from the Portuguese twittosphere.
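The pipeline described in the abstract (one dissimilarity matrix per dimension, normalization, combination, clustering) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the distance functions, the normalization, the combination rule or the clustering algorithm, so the haversine, absolute time difference and Jaccard distances, the min-max scaling, the weighted average, the small DBSCAN routine, the sample tweets and the `eps`/`min_pts` values are all hypothetical choices. The social dimension is left out for brevity.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over the word sets of two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0

def dissimilarity_matrix(items, dist):
    """Full pairwise matrix for one dimension."""
    n = len(items)
    return [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]

def min_max_normalize(m):
    """Rescale all entries to [0, 1] so that dimensions become comparable."""
    flat = [v for row in m for v in row]
    lo, span = min(flat), max(flat) - min(flat)
    return [[(v - lo) / span if span else 0.0 for v in row] for row in m]

def combine(matrices, weights):
    """Weighted average of normalized matrices (assumed combination rule)."""
    total, n = sum(weights), len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
             for j in range(n)] for i in range(n)]

def dbscan(dist, eps, min_pts):
    """Minimal DBSCAN over a precomputed matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbours = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbours) < min_pts:
            labels[p] = -1          # provisionally noise
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbours if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # noise reached from a core point: border
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbours = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbours) >= min_pts:
                queue.extend(q_neighbours)  # q is a core point, keep expanding
    return labels

# Hypothetical tweets: ((lat, lon), seconds since start, text).
tweets = [
    ((41.15, -8.61), 0, "festa no porto hoje"),
    ((41.16, -8.62), 600, "festa no porto logo"),
    ((41.14, -8.60), 1200, "hoje festa no porto"),
    ((38.72, -9.14), 86400, "chuva em lisboa"),
    ((38.73, -9.13), 87000, "lisboa com chuva"),
    ((38.71, -9.15), 87600, "chuva forte em lisboa"),
]

spatial = dissimilarity_matrix([t[0] for t in tweets], haversine_km)
temporal = dissimilarity_matrix([t[1] for t in tweets], lambda a, b: abs(a - b))
content = dissimilarity_matrix([t[2] for t in tweets], jaccard_distance)

combined = combine(
    [min_max_normalize(m) for m in (spatial, temporal, content)],
    weights=[1.0, 1.0, 1.0],
)
labels = dbscan(combined, eps=0.3, min_pts=2)
print(labels)
```

With equal weights, the three tweets posted near Porto and the three posted near Lisbon a day later fall into two separate clusters. Changing the weights shifts the emphasis between where, when and what was tweeted, which is the kind of trade-off the combination step controls.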

Keywords

Data Mining; Clustering; Spatio-temporal patterns; Visualization

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Tiago Cunha (1)
  • Carlos Soares (1)
  • Eduarda Mendes Rodrigues (1)
  1. Faculdade de Engenharia da Universidade do Porto, Porto, Portugal