UMiner: A Data Mining System Handling Uncertainty and Quality

Uncertainty Handling and Quality Assessment in Data Mining

Abstract

The explosive growth of data collections in science and business applications, and the need to analyze this data and extract useful knowledge from it, have led to a new generation of tools and techniques grouped under the term data mining [FU96]. Their objective is to deal with large volumes of data and to automate data mining and knowledge discovery from large data repositories. The majority of data mining systems produce a particular enumeration of patterns over data sets, accomplishing a limited set of tasks such as clustering, classification and rule extraction [BL96, FPSU96]. However, some aspects of the data mining process are under-addressed by current approaches in database and data mining applications. These aspects are:

  i) the revealing and handling of uncertainty in the context of data mining tasks. In traditional data mining systems, database values are treated as non-overlapping and are handled equally in the classification process. The values in the database are classified into the available categories in a crisp manner, i.e. each value may be classified into at most one cluster. Also, all the values classified into a cluster belong to it with the same degree of belief. Thus, there is significant information included in classification results that is not exploited by the traditional classification approaches.

  ii) the evaluation of data mining results based on well-established quality criteria. Most clustering algorithms depend on assumptions and initial guesses in order to define the subgroups present in a data set [TK99]. As a consequence, in most applications the final clustering scheme requires some sort of evaluation.
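
To make the two points above concrete, the following is a minimal sketch and not the UMiner implementation. It assumes one-dimensional values and two hypothetical cluster centers, and it uses the standard fuzzy c-means membership formula and Bezdek's partition coefficient as stand-ins for the uncertainty and quality measures, which this abstract does not specify. All function and variable names are illustrative.

```python
def crisp_assign(value, centers):
    """Crisp classification: the value lands in exactly one cluster (the nearest)."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def fuzzy_memberships(value, centers, m=2.0):
    """Fuzzy c-means style degrees of belief; they sum to 1 over all clusters.

    m > 1 is the fuzzifier (assumed value 2.0, a common default).
    """
    dists = [abs(value - c) for c in centers]
    # A value sitting exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]


def partition_coefficient(membership_rows):
    """Bezdek's partition coefficient, one simple quality index for a fuzzy partition.

    It equals 1.0 for a fully crisp partition and 1/c when every value is
    spread evenly over c clusters.
    """
    n = len(membership_rows)
    return sum(u * u for row in membership_rows for u in row) / n


if __name__ == "__main__":
    centers = [20.0, 60.0]       # hypothetical cluster centers
    values = [22.0, 40.0, 58.0]  # hypothetical database values
    rows = [fuzzy_memberships(v, centers) for v in values]
    for v, row in zip(values, rows):
        print(v, "crisp:", crisp_assign(v, centers), "fuzzy:", row)
    print("partition coefficient:", partition_coefficient(rows))
```

The crisp assignment gives the value 40.0 to a single cluster even though it lies midway between both centers, whereas the membership degrees keep that ambiguity visible. The partition coefficient then summarizes how crisp the overall scheme is, which is one possible basis for comparing clustering schemes.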

References

  1. C. Amanatidis, M. Halkidi, M. Vazirgiannis. “UMiner: A Data mining system that handles uncertainty and quality”. Demo paper in the Proceedings of EDBT Conference, Prague, March 2002.

  2. M. Berry, G. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support. John Wiley & Sons, Inc, 1996.

  3. U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI Press, 1996.

  4. U. Fayyad, R. Uthurusamy. “Data Mining and Knowledge Discovery in Databases”, Communications of the ACM, Vol. 39, No. 11, November 1996.

  5. M. Halkidi, M. Vazirgiannis, Y. Batistakis. “Quality scheme assessment in the clustering process”, in Proceedings of PKDD, Lyon, France, 2000.

  6. M. Halkidi, M. Vazirgiannis. “Clustering Validity Assessment: Finding the optimal partitioning of a data set”, in Proceedings of the IEEE International Conference on Data Mining (ICDM), California, USA, November 2001.

  7. M. Halkidi, M. Vazirgiannis. “A data set oriented approach for clustering algorithm selection”, in Proceedings of PKDD (Principles and Practice of Knowledge Discovery in Databases), Freiburg, Germany, 2001.

  8. M. Halkidi, M. Vazirgiannis. “Managing uncertainty and quality in the classification process”, in Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.

  9. M. Halkidi, M. Vazirgiannis. “Clustering validity assessment using multi representatives”. Poster paper in the Proceedings of SETN Conference, Thessaloniki, Greece, April 2002.

  10. W. Kelly, J. Painter. “Hypertrapezoidal Fuzzy Membership Functions”, in Fifth IEEE International Conference on Fuzzy Systems, New Orleans, September 8, pp. 1279–1284, 1996.

  11. S. Theodoridis, K. Koutroumbas. Pattern Recognition, Academic Press, 1999.

  12. M. Vazirgiannis. “A classification and relationship extraction scheme for relational databases based on fuzzy logic”, in Proceedings of the Pacific-Asian KDD’98 Conference, Melbourne, Australia, 1998.

  13. M. Vazirgiannis, M. Halkidi. “Uncertainty handling in the datamining process with fuzzy logic”, in Proceedings of the IEEE-FUZZY Conference, Texas, May, 2000.

Copyright information

© 2003 Springer-Verlag London

About this chapter

Cite this chapter

Vazirgiannis, M., Halkidi, M., Gunopulos, D. (2003). UMiner: A Data Mining System Handling Uncertainty and Quality. In: Uncertainty Handling and Quality Assessment in Data Mining. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-0031-7_5

  • DOI: https://doi.org/10.1007/978-1-4471-0031-7_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-1119-1

  • Online ISBN: 978-1-4471-0031-7

  • eBook Packages: Springer Book Archive
