Abstract
Local exceptionality detection on social interaction networks includes the analysis of resources created by humans (e. g., social media) as well as those generated by sensor devices in the context of (complex) interactions. This paper provides a structured overview of a line of work comprising a set of papers that focus on data-driven exploration and modeling in the context of social network analysis, community detection and pattern mining.
1 Introduction
In ubiquitous and social environments, a variety of heterogeneous multi-relational data is generated, e. g., by sensors and social media. From these data, a set of complex networks can be derived in the form of social interaction networks [2], capturing distinct facets of the interaction space [19]. In that context, local exceptionality detection – based on subgroup discovery and exceptional model mining – provides flexible approaches for data exploration, assessment, and the detection of unexpected and interesting phenomena.
Subgroup discovery [3, 15, 23] is an approach for discovering interesting subgroups – as an instance of local pattern detection [20]. The interestingness is usually defined by a certain property of interest formalized by a quality function. In the simplest case, a binary target variable is considered, where the share in a subgroup can be compared to the share in the dataset in order to detect (exceptional) deviations. More complex target concepts consider sets of target variables. In particular, exceptional model mining [3, 12] focuses on more complex quality functions. In the context of ubiquitous data and social media, interesting target concepts are given, e. g., by densely connected graph structures (communities) [5], unexpected spatio-semantic distributions [8], or exceptional matches between online-offline relations [13] for behavioral characterization.
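The binary-target case can be illustrated with a minimal sketch. It assumes weighted relative accuracy (WRAcc) as the quality function, which is one common choice and is not prescribed by the text above; the function name `wracc`, the toy records, and the attribute names are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def wracc(records: List[Record],
          in_subgroup: Callable[[Record], bool],
          is_target: Callable[[Record], bool]) -> float:
    """Weighted relative accuracy: (n_sg / n) * (p_sg - p_0).

    p_sg is the target share in the subgroup, p_0 the target share in
    the whole dataset. WRAcc is an assumed, illustrative quality function.
    """
    n = len(records)
    covered = [r for r in records if in_subgroup(r)]
    if n == 0 or not covered:
        return 0.0
    p_0 = sum(1 for r in records if is_target(r)) / n
    p_sg = sum(1 for r in covered if is_target(r)) / len(covered)
    return (len(covered) / n) * (p_sg - p_0)


# Hypothetical toy data: which users are "active", described by device type.
records = [
    {"device": "mobile", "active": True},
    {"device": "mobile", "active": True},
    {"device": "mobile", "active": False},
    {"device": "desktop", "active": False},
    {"device": "desktop", "active": False},
    {"device": "desktop", "active": True},
    {"device": "desktop", "active": False},
    {"device": "desktop", "active": False},
]

quality = wracc(records,
                in_subgroup=lambda r: r["device"] == "mobile",
                is_target=lambda r: bool(r["active"]))
print(round(quality, 6))
```

A positive value indicates that the target share in the subgroup exceeds the share in the dataset; the size factor keeps very small subgroups from dominating the ranking.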
This paper focuses on formalizations and applications of subgroup discovery and exceptional model mining in the context of social interaction networks. We summarize recent work on community detection, behavior characterization and spatio-temporal analysis, and efficient implementation (comprising the papers [1, 2, 4–8, 10, 13]). In that way, we provide a compact and structured overview of recent scientific advances in this field, covering specific methods and their applications for analyzing social interactions.
2 Methods
Social interaction networks [2, 17, 18] focus on user-related social networks in social media capturing social relations inherent in social interactions, social activities and other social phenomena which act as proxies for social user-relatedness. Therefore, according to the categorization of Wasserman and Faust [22, p. 37 ff.], social interaction networks focus on interaction relations between people as the corresponding actors. This also includes interaction data from sensors and mobile devices, as long as the data is created by real users [1, 2].
In such contexts, exploratory data analysis is an important approach, e. g., for getting first insights into the data. In particular, descriptive data mining aims to uncover certain patterns for characterization and description of the data and the captured relations. Typically, the goal of the methods is not only an actionable model, but also a human-interpretable set of patterns [16].
Subgroup discovery and exceptional model mining are prominent methods for local exceptionality detection that can be configured and adapted to various analytical tasks. Local exceptionality detection especially supports the goal of explanation-aware data mining [9], due to its more interpretable results, e. g., for characterizing a set of data, for concept description, for providing regularities and associations between elements in general, and for detecting and characterizing unexpected situations, e. g., events or episodes. In the following, we summarize approaches and methods for local exceptionality detection on attributed graphs, for behavioral characterization, and spatio-temporal analysis. Furthermore, we address issues of scalability and large-scale data processing.
2.1 Description-Oriented Community Detection
Communities can intuitively be defined as subsets of nodes of a graph with a dense structure in the corresponding subgraph. However, for mining such communities usually only structural aspects are taken into account. Typically, neither a concise nor an easily interpretable community description is provided.
In [5], we focus on description-oriented community detection using subgroup discovery. For providing both structurally valid and interpretable communities we utilize the graph structure as well as additional descriptive features of the graph’s nodes. We aim at identifying communities according to standard community quality measures, while providing characteristic descriptions at the same time. We propose several optimistic estimates of standard community quality functions to be used for efficient pruning of the search space in an exhaustive branch-and-bound algorithm. We present examples of an evaluation using five real-world data sets, obtained from three different social media applications, showing runtime improvements of several orders of magnitude. The results also indicate significant semantic structures compared to the baselines. A further application of this method to the exploratory analysis of social media using geo-references is demonstrated in [2, 6]. A scalable implementation of the described description-oriented community detection approach, i. e., the COMODO algorithm [5], is described in [7], which is also suited for large-scale data processing utilizing the Map/Reduce framework [11]. With that, we can apply the same method for in-memory datasets as well as for large-scale datasets supporting efficient processing.
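The interplay of descriptions, community quality, and pruning can be sketched as follows. This is an illustration under stated assumptions and not the COMODO algorithm of [5]: the quality function is the modularity contribution of a single community, and the optimistic estimate is a simple bound derived only for this sketch (a refinement cannot gain inside edges, and its volume is at least twice its number of inside edges). All function names and the toy graph are hypothetical, and the graph is assumed to have at least one edge.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]


def cover(desc: FrozenSet[str], attrs: Dict[str, Set[str]]) -> Set[str]:
    """Nodes whose attribute set contains every selector of the description."""
    return {node for node, a in attrs.items() if desc <= a}


def inside_edges(nodes: Set[str], edges: List[Edge]) -> int:
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def local_modularity(nodes: Set[str], edges: List[Edge],
                     degree: Dict[str, int]) -> float:
    """Modularity contribution e_S / m - (vol_S / 2m)^2 (assumed measure)."""
    m = len(edges)
    vol = sum(degree[n] for n in nodes)
    return inside_edges(nodes, edges) / m - (vol / (2 * m)) ** 2


def optimistic_estimate(nodes: Set[str], edges: List[Edge]) -> float:
    """Upper bound on the quality of every refinement (sketch-only bound).

    With x = e_S' / m <= e_S / m and vol_S' >= 2 * e_S', the quality of a
    refinement is at most x - x^2, which is maximal at min(e_S / m, 0.5).
    """
    x = min(inside_edges(nodes, edges) / len(edges), 0.5)
    return x - x * x


def best_community(attrs: Dict[str, Set[str]],
                   edges: List[Edge]) -> Tuple[float, FrozenSet[str]]:
    degree = {n: 0 for n in attrs}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    selectors = sorted(set().union(*attrs.values()))
    best: Tuple[float, FrozenSet[str]] = (float("-inf"), frozenset())

    def search(desc: FrozenSet[str], start: int) -> None:
        nonlocal best
        for i in range(start, len(selectors)):
            new_desc = desc | {selectors[i]}
            nodes = cover(new_desc, attrs)
            if not nodes:
                continue
            q = local_modularity(nodes, edges, degree)
            if q > best[0]:
                best = (q, new_desc)
            # Branch only if some refinement could still beat the best result.
            if optimistic_estimate(nodes, edges) > best[0]:
                search(new_desc, i + 1)

    search(frozenset(), 0)
    return best


# Hypothetical toy graph: a triangle of "ml" users plus a short tail.
attrs = {"a": {"ml", "kdd"}, "b": {"ml", "kdd"}, "c": {"ml"},
         "d": {"web"}, "e": {"kdd"}}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
quality, description = best_community(attrs, edges)
print(sorted(description), round(quality, 4))
```

The search returns a description together with its quality, so each community found this way comes with an interpretable characterization in terms of node attributes.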
2.2 Behavioral Characterization on Social Interaction Networks
Important structures that emerge in social interaction networks are given by subgroups. As outlined above, we can apply community detection in order to mine both the graph structure and descriptive features in order to obtain description-oriented communities. However, we can also analyze subgroups in a social interaction network from a compositional perspective, i. e., neglecting the graph structure. Then, we focus on the attributes of subsets of nodes or on derived parameters of these, e. g., corresponding to roles, centrality scores, etc. In addition, we can also consider sequential data, e. g., for characterization of exceptional link trails, i. e., sequential transitions, as presented in [4].
In [1], we discuss a number of exemplary analysis results of social behavior in mobile social networks, focusing on the characterization of links and roles. For that, we describe the configuration, adaptation and extension of the subgroup discovery methodology in that context. In addition, we can analyze multiplex networks by considering the match between different networks, and deviations between the networks, respectively. A description of characteristic (mis-)matches in a multiplex network, for example, is presented in [13] regarding relations between online and offline social interaction networks. Outlining these examples, we demonstrate that local exceptionality detection is a flexible approach for compositional analysis in social interaction networks.
2.3 Exceptional Model Mining for Spatio-Temporal Analysis
Exploratory analysis on ubiquitous data needs to handle different heterogeneous and complex data types. In [2, 8], we present an adaptation of subgroup discovery using exceptional model mining formalizations on ubiquitous social interaction networks. With this adaptation, we can detect locally exceptional patterns, e. g., corresponding to bursts or special events in a dynamic network. Furthermore, we propose subgroup discovery and assessment approaches for obtaining interesting descriptive patterns and provide a novel graph-based analysis approach for assessing the relations among the subgroups in the obtained set. This exploratory visualization approach allows for the comparison of subgroups according to their relations to other subgroups and for the inclusion of further parameters, e. g., geo-spatial distribution indicators. We present and discuss analysis results utilizing a real-world ubiquitous social media dataset.
3 Conclusions and Outlook
Subgroup discovery and exceptional model mining provide powerful and comprehensive methods for knowledge discovery and exploratory analysis in the context of local exceptionality detection. In this paper, we presented corresponding approaches and methods, specifically targeting social interaction networks, and showed how to implement local exceptionality detection on both a methodological and practical level.
Interesting future directions for adapting and extending local exceptionality detection in social contexts include extended postprocessing and presentation options, e. g., [3]. In addition, extensions to predictive modeling, e. g., link prediction [2, 21], are interesting options to explore. Further options for future work include extending the analysis of sequential data in online or offline social contexts, e. g., based on Markov chains as exceptional models [4, 10], and the analysis of network dynamics [14].
References
1. Atzmueller, M.: Mining social media: key players, sentiments, and communities. WIREs Data Min. Knowl. Discovery (DMKD) 2(5), 411–419 (2012)
2. Atzmueller, M.: Data mining on social interaction networks. JDMDH 29, 1–21 (2014)
3. Atzmueller, M.: Subgroup discovery. WIREs DMKD 5(1), 35–49 (2015)
4. Atzmueller, M.: Detecting community patterns capturing exceptional link trails. In: Proceedings IEEE/ACM ASONAM. IEEE Press, Boston, MA, USA (2016)
5. Atzmueller, M., Doerfel, S., Mitzlaff, F.: Description-oriented community detection using exhaustive subgroup discovery. Inf. Sci. 329, 965–984 (2016)
6. Atzmueller, M., Lemmerich, F.: Exploratory pattern mining on social media using geo-references and social tagging information. IJWS 2(1/2), 80–112 (2013)
7. Atzmueller, M., Mollenhauer, D., Schmidt, A.: Big Data analytics using local exceptionality detection. In: Enterprise Big Data Engineering, Analytics, and Management. IGI Global, Hershey, PA, USA (2016)
8. Atzmueller, M., Mueller, J., Becker, M.: Exploratory subgroup analytics on ubiquitous data. In: Atzmueller, M., Chin, A., Scholz, C., Trattner, C. (eds.) MUSE/MSM 2013, LNAI 8940. LNCS, vol. 8940, pp. 1–20. Springer, Heidelberg (2015)
9. Atzmueller, M., Roth-Berghofer, T.: The mining and analysis continuum of explaining uncovered. In: Proceedings 30th SGAI International Conference on Artificial Intelligence (2010)
10. Atzmueller, M., Schmidt, A., Kibanov, M.: DASHTrails: an approach for modeling and analysis of distribution-adapted sequential hypotheses and trails. In: Proceedings WWW 2016 (Companion). IW3C2/ACM (2016)
11. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)
12. Duivesteijn, W., Feelders, A.J., Knobbe, A.: Exceptional model mining. Data Min. Knowl. Discovery 30(1), 47–98 (2016)
13. Kibanov, M., Atzmueller, M., Illig, J., Scholz, C., Barrat, A., Cattuto, C., Stumme, G.: Is web content a good proxy for real-life interaction? a case study considering online and offline interactions of computer scientists. In: Proceedings of the IEEE/ACM ASONAM. ACM (2015)
14. Kibanov, M., Atzmueller, M., Scholz, C., Stumme, G.: Temporal evolution of contacts and communities in networks of face-to-face human interactions. Sci. China Inf. Sci. 57, 32103 (2014)
15. Klösgen, W.: Explora: a multipattern and multistrategy discovery assistant. In: Advances in Knowledge Discovery and Data Mining, pp. 249–271. AAAI Press (1996)
16. Mannila, H.: Theoretical frameworks for data mining. SIGKDD Explor. 1(2), 30–32 (2000)
17. Mitzlaff, F., Atzmueller, M., Benz, D., Hotho, A., Stumme, G.: Community assessment using evidence networks. In: Atzmueller, M., Hotho, A., Strohmaier, M., Chin, A. (eds.) MUSE/MSM 2010. LNCS, vol. 6904, pp. 79–98. Springer, Heidelberg (2011)
18. Mitzlaff, F., Atzmueller, M., Benz, D., Hotho, A., Stumme, G.: User-Relatedness and Community Structure in Social Interaction Networks. CoRR/abs 1309.3888 (2013)
19. Mitzlaff, F., Atzmueller, M., Hotho, A., Stumme, G.: The social distributional hypothesis. J. Soc. Netw. Anal. Min. 4(216), 1–14 (2014)
20. Morik, K.: Detecting interesting instances. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, pp. 13–23. Springer, Heidelberg (2002). doi:10.1007/3-540-45728-3_2
21. Scholz, C., Atzmueller, M., Barrat, A., Cattuto, C., Stumme, G.: New insights and methods for predicting face-to-face contacts. In: Proceedings ICWSM. AAAI, Palo Alto, CA, USA (2013)
22. Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. No. 8 in Structural Analysis in the Social Sciences, 1st edn. Cambridge University Press, New York (1994)
23. Wrobel, S.: An algorithm for multi-relational discovery of subgroups. In: Komorowski, J., Zytkow, J. (eds.) PKDD 1997. LNCS, vol. 1263, pp. 78–87. Springer, Heidelberg (1997). doi:10.1007/3-540-63223-9_108
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Atzmueller, M. (2016). Local Exceptionality Detection on Social Interaction Networks. In: Berendt, B., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2016. Lecture Notes in Computer Science(), vol 9853. Springer, Cham. https://doi.org/10.1007/978-3-319-46131-1_39
DOI: https://doi.org/10.1007/978-3-319-46131-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46130-4
Online ISBN: 978-3-319-46131-1
eBook Packages: Computer Science, Computer Science (R0)