Abstract
The analysis of complex systems is now a major challenge in Statistics, Artificial Intelligence, Information Systems, Data Visualization, and other fields.
Describing the structure of complex systems, or extracting knowledge from them, is known to be a difficult task. Combining Data Analysis techniques (including clustering), Inductive Learning (knowledge-based systems), Database Management and Multidimensional Graphical Representation should benefit this field.
Clustering based on rules (CBR) is a methodology developed to find the structure of complex domains; it performs better than traditional clustering algorithms or knowledge-based system approaches. In our proposal, a combination of clustering and inductive learning is focused on the problem of finding and interpreting special patterns (or concepts) in large databases, in order to extract useful knowledge for representing real-world domains. This methodology and its behaviour as a Knowledge Discovery (KD) system have, in fact, been presented in previous papers ([3],
The aim of this paper is to emphasize the reporting phase. Some tools for interpreting the clusters are presented; automatic rule generation is presented and applied to a real research case. Indeed, in a KD system, data preparation and interpretation of the results are as important as the analysis itself. In this paper, the treatment of missing data is analysed, and a statistical test based on non-parametric techniques is presented for comparing several classifications. A method for finding characteristic values of the classes, based on the prototype of each class, is also presented. Finally, these characterizations allow the automatic generation of decision rules as a predictive tool for future items.
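As a rough illustration of the last two steps (characteristic values, then decision rules), the sketch below shows one simple way such a pipeline could look. It is not the method of the paper: the abstract does not give the prototype-based criterion, so the sketch swaps in a simpler assumption, namely that a value is "characteristic" of a class when it is observed in that class and in no other. The function names, the toy data format (one dict per item) and the choice to skip missing values (None) are all hypothetical.

```python
from collections import defaultdict


def characteristic_values(rows, labels):
    """Return {class: {variable: set of values observed only in that class}}.

    Assumption (not from the paper): a value is characteristic of a class
    if no item of any other class takes it. Missing values (None) are
    skipped. Values of one variable are assumed to be mutually comparable
    (e.g. all strings) so that they can be sorted later.
    """
    # variable -> value -> set of classes in which the value was observed
    seen = defaultdict(lambda: defaultdict(set))
    for row, label in zip(rows, labels):
        for var, value in row.items():
            if value is None:
                continue
            seen[var][value].add(label)

    result = defaultdict(dict)
    for var, by_value in seen.items():
        for value, classes in by_value.items():
            if len(classes) == 1:
                (cls,) = classes
                result[cls].setdefault(var, set()).add(value)
    return dict(result)


def generate_rules(char_values):
    """Turn characteristic values into (variable, value, class) rules,
    read as: IF variable == value THEN class."""
    rules = []
    for cls in sorted(char_values):
        for var in sorted(char_values[cls]):
            for value in sorted(char_values[cls][var]):
                rules.append((var, value, cls))
    return rules


def predict(rules, item):
    """Return the class of the first rule that fires, or None if none does."""
    for var, value, cls in rules:
        if item.get(var) == value:
            return cls
    return None
```

Under this assumption a rule can only fire on a value that was exclusive to one class in the classified data, so a new item with no such value is left unassigned rather than forced into a class.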
This research has been partially financed by the project TIC’96-0878.
References
Fayyad, U., et al. From Data Mining to Knowledge Discovery: An overview. In: Fayyad, U., et al. (Eds.), Advances in KD and DM. AAAI/MIT, 1996.
Gibert, K., Cortés, U. Clustering based on rules and knowledge discovery in ill-structured domains. Computación y Sistemas, México, 1998 (in press).
Gibert, K., Cortés, U. Weighing quantitative and qualitative variables in clustering methods. MATH-WARE 10(4), January 1997.
Gibert, K., Cortés, U. Combining a knowledge based system with a clustering method for an inductive construction of models. In: P. Cheeseman et al. (Eds.), Selecting Models from Data: AI and Statistics IV, LNS no. 89. Springer-Verlag, New York, 1994, pp. 351–360.
Gibert, K., Sonicki, Z. Classification based on rules and medical research. In: Lauro et al. (Eds.), Proc. Applied Stochastic Models and Data Analysis, Napoli, 1997, pp. 181–186.
Gower, J. C. A general coefficient for similarity. Biometrics 27, pp. 857–872.
Lebart, L., et al. Traitement statistique des données. Dunod, Paris.
Nakhaeizadeh, G. Classification as a subtask of Data Mining: experiences from some industrial projects. In: IFCS’96, Kobe, Japan (in press), pp. 17–20.
Sonicki, Z., et al. The use of induction in routine laboratory diagnostics of thyroid. Lijecnicki Vjesnik 115, 1993, pp. 306–309 (in Croatian).
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Gibert, K., Aluja, T., Cortés, U. (1998). Knowledge discovery with clustering based on rules. Interpreting results. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094808
DOI: https://doi.org/10.1007/BFb0094808
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
eBook Packages: Springer Book Archive