OntoDM-KDD: Ontology for Representing the Knowledge Discovery Process

  • Panče Panov
  • Larisa Soldatova
  • Sašo Džeroski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8140)


In this article, we present OntoDM-KDD, an ontology for representing the knowledge discovery (KD) process based on the CRISP-DM process model. OntoDM-KDD defines the most essential entities for describing data mining (DM) investigations in the context of KD in a two-layered ontological structure. The ontology is aligned with and reuses state-of-the-art resources for representing scientific investigations, such as the Information Artifact Ontology (IAO) and the Ontology for Biomedical Investigations (OBI). It provides a taxonomy of KD-specific actions, processes, and specifications of inputs and outputs. OntoDM-KDD supports the annotation of DM investigations in application domains. The ontology has been thoroughly assessed following best practices in ontology engineering, is fully interoperable with many domain resources, and is easily extensible. OntoDM-KDD is available at .


Knowledge Discovery in Databases; CRISP-DM; Data Mining Investigation; Data Mining; Domain Ontology





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Panče Panov (1)
  • Larisa Soldatova (2)
  • Sašo Džeroski (1)
  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. Brunel University, London, United Kingdom
