An Ontology-Based Method to Link Database Integration and Data Mining within a Biomedical Distributed KDD
In recent years, collaborative research has grown steadily in many scientific areas, such as biomedicine. However, traditional Knowledge Discovery in Databases (KDD) processes generally adopt centralized approaches that do not fully address many research needs in these distributed environments. This paper presents a method to improve traditional centralized KDD by adopting an ontology-based distributed model. Ontologies are used within this model in two ways: (i) as Virtual Schemas (VS), to solve structural heterogeneities in databases, and (ii) as Preprocessing Ontologies (PO), frameworks that guide automatic transformations when users retrieve data. Both types of ontologies aim to facilitate data gathering and preprocessing while keeping the data sources decentralized. This ontology-based approach makes it possible to link database integration and data mining, improving final results, reusability and interoperability. The results show improved outcome performance and new capabilities compared with traditional KDD processes.
Keywords: Database Integration, Distributed KDD, Ontologies, Preprocessing, Data Mining
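The two roles the abstract assigns to ontologies can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python dictionaries stand in for the Virtual Schema and the Preprocessing Ontology, and the source names (`hospital_a`, `hospital_b`), field names and transformation rules are all hypothetical, chosen only to show the idea of mapping heterogeneous sources to shared concepts and transforming values at retrieval time.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]

# Hypothetical stand-in for a Virtual Schema: for each source, map its
# local field names to shared concepts (structural heterogeneity).
VIRTUAL_SCHEMA: Dict[str, Dict[str, str]] = {
    "hospital_a": {"pat_age": "age", "sex": "gender"},
    "hospital_b": {"age_years": "age", "gender_code": "gender"},
}

# Hypothetical gender codes used by the two illustrative sources.
GENDER_CODES = {"m": "male", "f": "female", "1": "male", "2": "female"}

# Hypothetical stand-in for a Preprocessing Ontology: one transformation
# per shared concept, applied automatically when data is retrieved.
PREPROCESSING_RULES: Dict[str, Callable[[object], object]] = {
    "age": lambda v: int(v),
    "gender": lambda v: GENDER_CODES.get(str(v).strip().lower(), "unknown"),
}


def retrieve(source: str, rows: Iterable[Record]) -> List[Record]:
    """Rename source fields to shared concepts and normalize their values."""
    mapping = VIRTUAL_SCHEMA[source]
    unified: List[Record] = []
    for row in rows:
        record: Record = {}
        for local_name, value in row.items():
            concept = mapping.get(local_name)
            if concept is None:
                continue  # field not covered by the virtual schema
            transform = PREPROCESSING_RULES.get(concept, lambda v: v)
            record[concept] = transform(value)
        unified.append(record)
    return unified


# The sources stay separate; only the retrieved view is unified.
rows_a = [{"pat_age": "54", "sex": "F"}]
rows_b = [{"age_years": 61, "gender_code": 1}]
dataset = retrieve("hospital_a", rows_a) + retrieve("hospital_b", rows_b)
```

In this sketch both sources yield records with the same `age` and `gender` fields, so a downstream mining step could consume `dataset` without knowing how each source stores its data.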