OntoExtractor: A Fuzzy-Based Approach in Clustering Semi-structured Data Sources and Metadata Generation

  • Zhan Cui
  • Ernesto Damiani
  • Marcello Leida
  • Marco Viviani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3681)

Abstract

This paper presents a theoretical approach to data mining and information classification, together with an overview of our OntoExtractor application, which analyses an incoming data flow and generates metadata structures.

To help the user classify a large and heterogeneous collection of data, we propose using fuzzy-based techniques to compare and classify the data.

Before the elements can be compared, the incoming flow of information has to be converted into a common structured format such as XML.

Once the documents are in this structured form, we can compare and cluster them and generate a metadata structure describing the data repository.
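
The compare-and-cluster step can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the clustering algorithm, so everything below is an assumption made for illustration: each XML document is reduced to a fuzzy set of tag names, the membership degree of a tag is taken to be 1/(depth + 1), documents are compared with a fuzzy Jaccard ratio, and clusters are formed greedily against a hypothetical threshold. The function names `tag_memberships`, `fuzzy_jaccard` and `cluster` are not part of OntoExtractor.

```python
import xml.etree.ElementTree as ET


def tag_memberships(xml_text):
    """Reduce an XML document to a fuzzy set of tag names.

    Assumption: a tag at nesting depth d gets membership 1 / (d + 1),
    so tags near the root count more than deeply nested ones.
    """
    root = ET.fromstring(xml_text)
    memberships = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        degree = 1.0 / (depth + 1)
        # Keep the highest degree if a tag occurs at several depths.
        memberships[node.tag] = max(memberships.get(node.tag, 0.0), degree)
        stack.extend((child, depth + 1) for child in node)
    return memberships


def fuzzy_jaccard(a, b):
    """Fuzzy Jaccard similarity: sum of minima over sum of maxima."""
    tags = set(a) | set(b)
    intersection = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    return intersection / union if union else 1.0


def cluster(docs, threshold=0.5):
    """Greedy single-pass clustering (an illustrative choice).

    Each document joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new one.
    Returns lists of document indices.
    """
    clusters = []  # pairs of (representative fuzzy set, member indices)
    for index, doc in enumerate(docs):
        fuzzy_set = tag_memberships(doc)
        for representative, members in clusters:
            if fuzzy_jaccard(representative, fuzzy_set) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((fuzzy_set, [index]))
    return [members for _, members in clusters]


docs = [
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
    "<invoice><total/></invoice>",
]
groups = cluster(docs)
```

With these three sample documents, the two `book` documents share their root and one child tag and fall into the same group, while the `invoice` document shares no tags with them and forms its own group. A metadata structure for the repository could then be derived from each group's common tags; that step is not sketched here.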

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Zhan Cui (2)
  • Ernesto Damiani (1)
  • Marcello Leida (1)
  • Marco Viviani (1)

  1. Dipartimento di Tecnologie dell’Informazione, Università degli Studi di Milano, Crema, Italy
  2. Intelligent Systems Research Centre, BT Group, Ipswich, Suffolk, UK
