Abstract
The explosive growth of data collections in science and business applications, together with the need to analyze this data and extract useful knowledge from it, has led to a new generation of tools and techniques grouped under the term data mining [FU96]. Their objective is to deal with large volumes of data and to automate data mining and knowledge discovery from large data repositories. The majority of data mining systems produce a particular enumeration of patterns over data sets, accomplishing a limited set of tasks such as clustering, classification and rule extraction [BL96, FPSU96]. However, some aspects of the data mining process are under-addressed by current approaches in database and data mining applications. These aspects are:
i) the revealing and handling of uncertainty in the context of data mining tasks. In traditional data mining systems, database values are treated as non-overlapping and are handled equally in the classification process. The values in the database are assigned to the available categories in a crisp manner, i.e. each value is classified into at most one cluster. Moreover, all the values classified in a cluster belong to it with the same degree of belief. Thus, significant information contained in the classification results is not exploited by traditional classification approaches.
ii) the evaluation of data mining results based on well-established quality criteria. Most clustering algorithms depend on assumptions and initial guesses in order to define the subgroups present in a data set [TK99]. As a consequence, in most applications the final clustering scheme requires some sort of evaluation.
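The two aspects above can be illustrated with a minimal Python sketch. It is not UMiner's actual method: the abstract does not specify the membership functions or the quality criteria, so the triangular membership function, the `CATEGORIES` table for a hypothetical "age" attribute, and the `scatter_separation` score are all assumptions chosen for illustration. The first part contrasts a crisp assignment with graded degrees of belief; the second shows one simple way two clustering schemes of the same data might be compared.

```python
# Illustrative sketch only: the category boundaries, the triangular shape and
# the scatter/separation score are assumptions, not taken from the chapter.

def triangular(x, left, peak, right):
    """Degree of belief in [0, 1] that x belongs to a category."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical overlapping categories for an "age" attribute: (left, peak, right).
CATEGORIES = {
    "young": (0, 20, 40),
    "middle": (30, 45, 60),
    "old": (50, 70, 120),
}

def fuzzy_classify(x):
    """Return the degree of belief for every category (categories may overlap)."""
    return {name: triangular(x, *params) for name, params in CATEGORIES.items()}

def crisp_classify(x):
    """Traditional assignment: keep only the single best category."""
    degrees = fuzzy_classify(x)
    return max(degrees, key=degrees.get)

def scatter_separation(clusters):
    """Mean within-cluster variance divided by the smallest squared distance
    between cluster centres (1-D data). Lower suggests a better scheme.
    Assumes at least two non-empty clusters with distinct centres."""
    centres = [sum(c) / len(c) for c in clusters]
    scatter = sum(
        sum((v - m) ** 2 for v in c) / len(c) for c, m in zip(clusters, centres)
    ) / len(clusters)
    separation = min(
        (a - b) ** 2 for i, a in enumerate(centres) for b in centres[i + 1:]
    )
    return scatter / separation

if __name__ == "__main__":
    # A value of 35 partly belongs to both "young" and "middle";
    # the crisp view keeps only one label and discards the degrees.
    print(fuzzy_classify(35))
    print(crisp_classify(35))

    # Two candidate partitions of the same six values.
    scheme_a = [[1, 2, 3], [10, 11, 12]]
    scheme_b = [[1, 2, 10], [3, 11, 12]]
    print(scatter_separation(scheme_a), scatter_separation(scheme_b))
```

In this sketch the crisp label discards the fact that the value 35 also belongs to a neighbouring category with a non-zero degree, which is the kind of information item i) describes as unexploited. The score in the second part is one of many possible compactness and separation measures; any well-established validity index could take its place.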
References
C. Amanatidis, M. Halkidi, M. Vazirgiannis. “UMiner: A Data mining system that handles uncertainty and quality”. Demo paper in the Proceedings of EDBT Conference, Prague, March 2002.
M. Berry, G. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support. John Wiley & Sons, Inc, 1996.
U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI Press, 1996.
U. Fayyad, R. Uthurusamy. “Data Mining and Knowledge Discovery in Databases”, Communications of the ACM, Vol. 39, No. 11, November 1996.
M. Halkidi, M. Vazirgiannis, Y. Batistakis. “Quality scheme assessment in the clustering process”, in Proceedings of PKDD, Lyon, France, 2000.
M. Halkidi, M. Vazirgiannis. “Clustering Validity Assessment: Finding the optimal partitioning of a data set”, in Proceedings of the IEEE International Conference on Data Mining (ICDM), California, USA, November 2001.
M. Halkidi, M. Vazirgiannis. “A data set oriented approach for clustering algorithm selection”, in Proceedings of PKDD (Principles and Practice of Knowledge Discovery in Databases), Freiburg, Germany, 2001.
M. Halkidi, M. Vazirgiannis. “Managing uncertainty and quality in the classification process”, in Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.
M. Halkidi, M. Vazirgiannis. “Clustering validity assessment using multi representatives”. Poster paper in the Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.
W. Kelly, J. Painter. “Hypertrapezoidal Fuzzy Membership Functions”, in Fifth IEEE International Conference on Fuzzy Systems, New Orleans, September 8, pp. 1279–1284, 1996.
S. Theodoridis, K. Koutroumbas. Pattern Recognition. Academic Press, 1999.
M. Vazirgiannis. “A classification and relationship extraction scheme for relational databases based on fuzzy logic”, in Proceedings of the Pacific-Asian KDD ’98 Conference, Melbourne, Australia, 1998.
M. Vazirgiannis, M. Halkidi. “Uncertainty handling in the datamining process with fuzzy logic”, in Proceedings of the IEEE-FUZZY Conference, Texas, May, 2000.
© 2003 Springer-Verlag London
About this chapter
Cite this chapter
Vazirgiannis, M., Halkidi, M., Gunopulos, D. (2003). UMiner: A Data Mining System Handling Uncertainty and Quality. In: Uncertainty Handling and Quality Assessment in Data Mining. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-0031-7_5
DOI: https://doi.org/10.1007/978-1-4471-0031-7_5
Publisher Name: Springer, London
Print ISBN: 978-1-4471-1119-1
Online ISBN: 978-1-4471-0031-7
eBook Packages: Springer Book Archive