Towards an Ontology of Data Mining Investigations

  • Panče Panov
  • Larisa N. Soldatova
  • Sašo Džeroski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5808)


Motivated by the need to unify the domain of data mining and by the demand for a formalized representation of the outcomes of data mining investigations, we address the task of constructing an ontology of data mining. In this paper we present an updated version of the OntoDM ontology, which is based on a recent proposal for a general framework for data mining and is aligned with the ontology of biomedical investigations (OBI). The ontology aims to describe and formalize entities from the domain of data mining and knowledge discovery. It includes definitions of basic data mining entities (e.g., datatype, dataset, data mining task, and data mining algorithm) and allows extensions with more complex data mining entities (e.g., constraints, data mining scenarios, and data mining experiments). Unlike most existing approaches to constructing ontologies of data mining, OntoDM complies with best practices for engineering ontologies that describe scientific investigations (e.g., OBI) and is a step towards an ontology of data mining investigations. OntoDM is available at: .





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Panče Panov (1)
  • Larisa N. Soldatova (2)
  • Sašo Džeroski (1)

  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. Aberystwyth University, Penglais, Aberystwyth, UK
