Applications and research problems of subgroup mining

  • Willi Klösgen
Invited Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1609)

Abstract

Knowledge Discovery in Databases (KDD) is a data analysis process which, in contrast to conventional data analysis, automatically generates and evaluates a very large number of hypotheses, deals with complex (i.e. large, high-dimensional, multi-relational, dynamic, or heterogeneous) data, and produces understandable results for those who “own the data”. With these objectives, subgroup mining searches for hypotheses that can be supported or confirmed by the given data and that are represented as a specialization of one of three general hypothesis types: deviating subgroups, associations between two subgroups, and partially ordered sets of subgroups, where the partial ordering usually relates to time. This paper gives a short introduction to the methods of subgroup mining. The main preprocessing, data mining, and postprocessing steps are discussed in more detail for two applications. We conclude with some open problems in the current state of the art of subgroup mining.
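
As an illustration of the first hypothesis type, deviating subgroups, the following minimal sketch enumerates subgroups described by conjunctions of attribute values and ranks them by how far their target share deviates from the overall share. It is not taken from the paper: the toy records, the attribute names, the function names, and the exhaustive search are all assumptions made for this example, and the score sqrt(n) * (p - p0) is just one commonly used quality function for a binary target.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data, invented for illustration only.
RECORDS = [
    {"region": "north", "age": "young", "buys": True},
    {"region": "north", "age": "young", "buys": True},
    {"region": "north", "age": "old", "buys": True},
    {"region": "north", "age": "old", "buys": False},
    {"region": "south", "age": "young", "buys": False},
    {"region": "south", "age": "young", "buys": True},
    {"region": "south", "age": "old", "buys": False},
    {"region": "south", "age": "old", "buys": False},
]


def quality(n, p, p0):
    """Size-weighted deviation of the subgroup share p from the overall share p0."""
    return sqrt(n) * (p - p0)


def find_deviating_subgroups(records, target, max_conditions=2):
    """Exhaustively score every conjunction of up to max_conditions attribute values."""
    attributes = sorted(key for key in records[0] if key != target)
    p0 = sum(r[target] for r in records) / len(records)  # overall target share
    results = []
    for depth in range(1, max_conditions + 1):
        for attrs in combinations(attributes, depth):
            # Each value combination that occurs in the data defines one subgroup.
            seen = {tuple(r[a] for a in attrs) for r in records}
            for values in sorted(seen):
                members = [
                    r for r in records
                    if all(r[a] == v for a, v in zip(attrs, values))
                ]
                n = len(members)
                p = sum(r[target] for r in members) / n
                results.append((quality(n, p, p0), dict(zip(attrs, values)), n, p))
    # Highest positive deviation first.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


if __name__ == "__main__":
    for score, description, size, share in find_deviating_subgroups(RECORDS, "buys")[:3]:
        print(f"{description}: n={size}, share={share:.2f}, quality={score:.3f}")
```

Exhaustive enumeration is only feasible for small description spaces such as this one; for larger data some form of pruning or heuristic search would be needed, and that choice is outside the scope of this sketch.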

Copyright information

© Springer-Verlag 1999

Authors and Affiliations

  • Willi Klösgen, German National Research Center for Information Technology (GMD), St. Augustin, Germany
