A procedure to compute prototypes for data mining in non-structured domains

  • J. Méndez
  • M. Hernández
  • J. Lorenzo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1510)


This paper describes a technique for associating a set of symbols with an event in the context of knowledge discovery in databases, or data mining. The set of symbols is related to the keywords in a database, which is used as an implicit knowledge source. The aim of this approach is to discover the significant keyword groups that best represent the event. A significant contribution of this work is a procedure that obtains the representative prototype of a group of symbolic data. It can be used both for unsupervised learning, to describe classes, and for supervised learning, to compute prototypes. The procedure involves defining an objective function and an associated hypothesis-exploring system, and it yields a method that is advantageous in terms of computational cost.
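The abstract does not give the objective function or the search strategy, so the following is only an illustrative sketch of the general idea of computing a prototype for a group of keyword sets. It assumes, purely for illustration, that the objective is the mean Jaccard similarity between a candidate prototype and each member of the group, and that hypotheses are explored greedily by adding one keyword at a time. The names `jaccard`, `objective`, and `greedy_prototype`, as well as the sample keywords, are hypothetical and not taken from the paper.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two keyword sets (defined as 1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def objective(candidate: Set[str], group: List[Set[str]]) -> float:
    """Assumed objective: mean similarity between the candidate and every group member."""
    return sum(jaccard(candidate, member) for member in group) / len(group)


def greedy_prototype(group: List[Set[str]]) -> Set[str]:
    """Grow a prototype one keyword at a time while the objective strictly improves."""
    if not group:
        raise ValueError("group must contain at least one keyword set")

    # Only keywords that occur in the group are considered as hypotheses.
    vocabulary = sorted(set().union(*group))
    prototype: Set[str] = set()
    best_score = objective(prototype, group)

    while True:
        best_keyword = None
        for keyword in vocabulary:
            if keyword in prototype:
                continue
            score = objective(prototype | {keyword}, group)
            if score > best_score:
                best_score = score
                best_keyword = keyword
        if best_keyword is None:
            # No single addition improves the objective: stop.
            return prototype
        prototype.add(best_keyword)


if __name__ == "__main__":
    # Hypothetical keyword sets attached to four records of one class.
    records = [
        {"kinase", "atp-binding", "transferase"},
        {"kinase", "atp-binding"},
        {"kinase", "atp-binding", "membrane"},
        {"kinase", "transferase"},
    ]
    print(sorted(greedy_prototype(records)))
```

A greedy search of this kind evaluates at most a quadratic number of candidates in the vocabulary size, rather than every subset of keywords, which is one simple way to keep the computational cost low. It is not guaranteed to find the global optimum of the objective.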

Key words

learning, data mining, knowledge discovery, symbolic clustering



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • J. Méndez (1)
  • M. Hernández (1)
  • J. Lorenzo (1)

  1. Dpto. de Informática y Sistemas, Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain
