UMiner: A Data Mining System Handling Uncertainty and Quality

  • Michalis Vazirgiannis
  • Maria Halkidi
  • Dimitrios Gunopulos
Part of the Advanced Information and Knowledge Processing book series (AI&KP)

Abstract

The explosive growth of data collections in science and business applications, together with the need to analyze this data and extract useful knowledge from it, has led to a new generation of tools and techniques grouped under the term data mining [FU96]. Their objective is to deal with large volumes of data and to automate data mining and knowledge discovery from large data repositories. The majority of data mining systems produce a particular enumeration of patterns over data sets, accomplishing a limited set of tasks such as clustering, classification and rule extraction [BL96, FPSU96]. However, some aspects of the data mining process are under-addressed by current approaches in database and data mining applications. These aspects are:
  i) The revealing and handling of uncertainty in the context of data mining tasks. In traditional data mining systems, database values are not overlapping and are treated equally in the classification process. The different values in the database are classified into the available categories in a crisp manner, i.e. each value may be classified into at most one cluster. Moreover, all the values that are classified into a cluster belong to it with the same degree of belief. Thus, there is significant information included in the classification results that is not exploited by the traditional classification approaches.

  ii) The evaluation of data mining results based on well-established quality criteria. Most clustering algorithms depend on assumptions and initial guesses in order to define the subgroups present in a data set [TK99]. As a consequence, in most applications the final clustering scheme requires some sort of evaluation.
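The contrast drawn in item i) between crisp assignment and assignment with a degree of belief can be illustrated with a minimal sketch. It assumes trapezoidal membership functions over a single numeric attribute; the function names, the category names and the breakpoints are hypothetical and chosen only for illustration, and nothing here should be read as the mechanism UMiner itself uses.

```python
def trapezoid(x, a, b, c, d):
    """Degree of membership of x in a trapezoidal fuzzy set (requires a < b <= c < d).

    Membership is 0 outside (a, d), 1 on [b, c], and linear on the two slopes.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical overlapping categories for one attribute with values in 0..100.
CATEGORIES = {
    "low": (-1.0, 0.0, 20.0, 40.0),
    "medium": (20.0, 40.0, 60.0, 80.0),
    "high": (60.0, 80.0, 100.0, 101.0),
}


def fuzzy_memberships(x):
    """Return every category x belongs to, with its degree of belief."""
    degrees = {name: trapezoid(x, *params) for name, params in CATEGORIES.items()}
    return {name: deg for name, deg in degrees.items() if deg > 0.0}


def crisp_label(x):
    """Crisp assignment: keep only the single best category, discard the degree."""
    degrees = fuzzy_memberships(x)
    return max(degrees, key=degrees.get) if degrees else None


if __name__ == "__main__":
    for value in (35.0, 50.0):
        print(value, crisp_label(value), fuzzy_memberships(value))
```

Under these assumed breakpoints, the values 35 and 50 both receive the crisp label "medium", while the graded view keeps the information that 35 also partly belongs to "low". This is the kind of information the abstract describes as lost in a crisp classification.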

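To make the need for evaluation in item ii) concrete, the following sketch compares two candidate clustering schemes of the same one-dimensional data using a simple compactness-to-separation ratio. The index, the function name and the sample data are assumptions made for illustration; this is a generic quality criterion, not one of the validity indices proposed in the cited work.

```python
from statistics import mean


def scatter_separation_ratio(clusters):
    """Mean within-cluster spread divided by the smallest centroid distance.

    `clusters` is a list of at least two non-empty lists of numbers.
    Lower values indicate compact, well-separated clusters.
    """
    centroids = [mean(c) for c in clusters]
    # Average absolute deviation from the centroid, averaged over clusters.
    spread = mean(
        mean(abs(x - m) for x in c) for c, m in zip(clusters, centroids)
    )
    # Distance between the two closest centroids.
    separation = min(
        abs(p - q) for i, p in enumerate(centroids) for q in centroids[i + 1:]
    )
    return spread / separation


# Two hypothetical partitionings of the same six values.
scheme_a = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
scheme_b = [[1.0, 2.0], [3.0, 11.0], [12.0, 13.0]]

if __name__ == "__main__":
    print("scheme_a:", scatter_separation_ratio(scheme_a))
    print("scheme_b:", scatter_separation_ratio(scheme_b))
```

Here the second scheme, which could result from a poor initial guess for the number of clusters, scores worse (a higher ratio) than the first. Comparing such scores across schemes is one simple way to evaluate the output of a clustering algorithm.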
Keywords

Data Mining; Cluster Algorithm; Validity Index; Cluster Scheme; Cluster Validity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. [AHV02]
    C. Amanatidis, M. Halkidi, M. Vazirgiannis. “UMiner: A Data mining system that handles uncertainty and quality”. Demo paper in the Proceedings of EDBT Conference, Prague, March 2002.
  2. [BL96]
    M. Berry, G. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support. John Wiley & Sons, Inc, 1996.
  3. [FPSU96]
    U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI Press, 1996.
  4. [FU96]
    U. Fayyad, R. Uthurusamy. “Data Mining and Knowledge Discovery in Databases”, Communications of the ACM, Vol. 39, No. 11, November 1996.
  5. [HVB00]
    M. Halkidi, M. Vazirgiannis, Y. Batistakis. “Quality scheme assessment in the clustering process”, in Proceedings of PKDD, Lyon, France, 2000.
  6. [HV01a]
    M. Halkidi, M. Vazirgiannis. “Clustering Validity Assessment: Finding the optimal partitioning of a data set”, in Proceedings of the IEEE International Conference on Data Mining (ICDM), California, USA, November 2001.
  7. [HV01b]
    M. Halkidi, M. Vazirgiannis. “A data set oriented approach for clustering algorithm selection”, in Proceedings of PKDD (Principles and Practice of Knowledge Discovery in Databases), Freiburg, Germany, 2001.
  8. [HV02a]
    M. Halkidi, M. Vazirgiannis. “Managing uncertainty and quality in the classification process”, in Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.
  9. [HV02b]
    M. Halkidi, M. Vazirgiannis. “Clustering validity assessment using multi representatives”. Poster paper in the Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.
  10. [KP96]
    W. Kelly, J. Painter. “Hypertrapezoidal Fuzzy Membership Functions”, in Fifth IEEE International Conference on Fuzzy Systems, New Orleans, September 8, pp. 1279–1284, 1996.
  11. [TK99]
    S. Theodoridis, K. Koutroumbas. Pattern Recognition. Academic Press, 1999.
  12. [V98]
    M. Vazirgiannis. “A classification and relationship extraction scheme for relational databases based on fuzzy logic”, in Proceedings of the Pacific-Asian KDD ’98 Conference, Melbourne, Australia, 1998.
  13. [VH00]
    M. Vazirgiannis, M. Halkidi. “Uncertainty handling in the data mining process with fuzzy logic”, in Proceedings of the IEEE-FUZZY Conference, Texas, May 2000.

Copyright information

© Springer-Verlag London 2003

Authors and Affiliations

  • Michalis Vazirgiannis (1)
  • Maria Halkidi (1)
  • Dimitrios Gunopulos (2)
  1. Department of Informatics, Athens University of Economics and Business, Greece
  2. Department of Computer Science and Engineering, University of California, Riverside, USA