Abstract
Knowledge Discovery in Databases (KDD) is a data analysis process that, in contrast to conventional data analysis, automatically generates and evaluates a very large number of hypotheses; deals with complex data, i.e. large, high-dimensional, multi-relational, dynamic, or heterogeneous data; and produces results that are understandable to those who “own the data”. With these objectives, subgroup mining searches for hypotheses that can be supported or confirmed by the given data and that are represented as a specialization of one of three general hypothesis types: deviating subgroups, associations between two subgroups, and partially ordered sets of subgroups, where the partial ordering usually relates to time. This paper gives a short introduction to the methods of subgroup mining. In particular, the main preprocessing, data mining, and postprocessing steps are discussed in more detail for two applications. We conclude with some open problems in the current state of the art of subgroup mining.
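To make the first hypothesis type (deviating subgroups) more concrete, the following is a minimal sketch and not the method of the paper. It assumes a flat table of categorical attributes, a boolean target, subgroup descriptions that are conjunctions of attribute-value pairs, and weighted relative accuracy as the quality function; all of these choices, the function name `find_deviating_subgroups`, and the toy data are illustrative assumptions.

```python
from itertools import combinations


def find_deviating_subgroups(rows, target, max_depth=2, min_size=2):
    """Rank conjunctive subgroups by how strongly their target share deviates.

    Assumption: quality is weighted relative accuracy,
    (subgroup size / total size) * (subgroup target rate - overall target rate).
    Assumption: every row has the same attributes with comparable values.
    """
    total = len(rows)
    base_rate = sum(1 for r in rows if r[target]) / total

    # Candidate selectors: every (attribute, value) pair seen in the data.
    selectors = sorted({(k, v) for r in rows for k, v in r.items() if k != target})

    results = []
    for depth in range(1, max_depth + 1):
        for description in combinations(selectors, depth):
            # Skip descriptions that constrain the same attribute twice.
            if len({attr for attr, _ in description}) < depth:
                continue
            members = [r for r in rows if all(r[a] == v for a, v in description)]
            if len(members) < min_size:
                continue
            rate = sum(1 for r in members if r[target]) / len(members)
            quality = (len(members) / total) * (rate - base_rate)
            results.append((quality, description, len(members), rate))

    # Highest positive deviation first; ties keep enumeration order.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


# Hypothetical toy data, for illustration only.
rows = [
    {"region": "north", "age": "young", "buys": True},
    {"region": "north", "age": "young", "buys": True},
    {"region": "north", "age": "old", "buys": True},
    {"region": "south", "age": "young", "buys": False},
    {"region": "south", "age": "old", "buys": False},
    {"region": "south", "age": "old", "buys": True},
]

ranked = find_deviating_subgroups(rows, target="buys")
for quality, description, size, rate in ranked[:3]:
    print(description, size, round(rate, 3), round(quality, 3))
```

The exhaustive enumeration is only workable for small attribute sets. The "very many hypotheses" mentioned in the abstract are why practical systems rely on pruning or heuristic search, which this sketch omits.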
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klösgen, W. (1999). Applications and research problems of subgroup mining. In: Raś, Z.W., Skowron, A. (eds) Foundations of Intelligent Systems. ISMIS 1999. Lecture Notes in Computer Science, vol 1609. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095086
DOI: https://doi.org/10.1007/BFb0095086
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65965-5
Online ISBN: 978-3-540-48828-6
eBook Packages: Springer Book Archive