Abstract
Online social networks are valuable sources of information about their users and their behaviours and interests. Many data mining researchers have analysed such data in search of interesting patterns. This paper addresses the problem of identifying and displaying tweet profiles by analysing multiple types of data: spatial, temporal, social and content. The data mining process that extracts the patterns consists of manipulating the dissimilarity matrices computed for each type of data and feeding them to a clustering algorithm. The paper studies appropriate distance functions for the different types of data, the normalization and combination methods available for the different dimensions, and existing clustering algorithms. The visualization platform is designed for dynamic and intuitive use, and aims to reveal the extracted profiles in an understandable and interactive manner. To this end, various visualization patterns were studied and widgets were chosen to best represent the information. The use of the system is illustrated with data from the Portuguese twittosphere.
An earlier version of this work was presented at the Encontro Nacional de Inteligência Artificial e Computacional (Brazilian AI meeting - ENIAC).
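The pipeline outlined in the abstract (one dissimilarity matrix per dimension, normalization, combination, clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance functions (haversine, absolute time difference, Jaccard), min-max normalization, the equal-weight combination, the hand-written DBSCAN and its parameters, and the toy tweets are all hypothetical choices made for the example. The social dimension is omitted for brevity; it would contribute one more matrix in the same way.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over token sets; 0.0 when both sets are empty."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def dissimilarity_matrix(items, dist):
    """Full pairwise dissimilarity matrix for one dimension."""
    n = len(items)
    return [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]

def min_max_normalize(m):
    """Rescale all entries to [0, 1] so dimensions become comparable."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in m]

def combine(matrices, weights):
    """Weighted average of normalized matrices (weights are an assumption)."""
    n = len(matrices[0])
    total = sum(weights)
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
             for j in range(n)] for i in range(n)]

def dbscan(dist, eps, min_pts):
    """Minimal DBSCAN on a precomputed matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # noise reached from a core point: border
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return labels

# Hypothetical toy tweets: position (lat, lon), time in hours, content tokens.
tweets = [
    {"pos": (41.15, -8.61), "t": 0.0, "tokens": {"porto", "football", "goal"}},
    {"pos": (41.16, -8.62), "t": 1.0, "tokens": {"porto", "football", "match"}},
    {"pos": (38.72, -9.14), "t": 48.0, "tokens": {"lisbon", "concert", "music"}},
    {"pos": (38.73, -9.15), "t": 49.0, "tokens": {"lisbon", "concert", "tickets"}},
]

spatial = dissimilarity_matrix([t["pos"] for t in tweets], haversine_km)
temporal = dissimilarity_matrix([t["t"] for t in tweets], lambda a, b: abs(a - b))
content = dissimilarity_matrix([t["tokens"] for t in tweets], jaccard_distance)

combined = combine([min_max_normalize(m) for m in (spatial, temporal, content)],
                   weights=[1.0, 1.0, 1.0])
labels = dbscan(combined, eps=0.3, min_pts=2)
print(labels)
```

Changing the weights passed to `combine` shifts the emphasis between dimensions; for example, a higher content weight would group tweets by topic even when they are far apart in space or time.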
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Cunha, T., Soares, C., Mendes Rodrigues, E. (2014). TweeProfiles: Detection of Spatio-temporal Patterns on Twitter. In: Luo, X., Yu, J.X., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2014. Lecture Notes in Computer Science, vol 8933. Springer, Cham. https://doi.org/10.1007/978-3-319-14717-8_10
DOI: https://doi.org/10.1007/978-3-319-14717-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14716-1
Online ISBN: 978-3-319-14717-8
eBook Packages: Computer Science (R0)