Towards an Ontology of Data Mining Investigations

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 5808)

Abstract

Motivated by the need to unify the domain of data mining and by the demand for a formalized representation of the outcomes of data mining investigations, we address the task of constructing an ontology of data mining. In this paper, we present an updated version of the OntoDM ontology, which is based on a recent proposal for a general framework for data mining and is aligned with the Ontology of Biomedical Investigations (OBI). The ontology aims to describe and formalize entities from the domain of data mining and knowledge discovery. It includes definitions of basic data mining entities (e.g., datatype, dataset, data mining task, and data mining algorithm) and allows extensions with more complex data mining entities (e.g., constraints, data mining scenarios, and data mining experiments). Unlike most existing approaches to constructing ontologies of data mining, OntoDM complies with best practices for engineering ontologies that describe scientific investigations (e.g., OBI) and is a step towards an ontology of data mining investigations. OntoDM is available at http://kt.ijs.si/panovp/OntoDM/.

  • Chapter length: 15 pages

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Panov, P., Soldatova, L.N., Džeroski, S. (2009). Towards an Ontology of Data Mining Investigations. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_21

  • DOI: https://doi.org/10.1007/978-3-642-04747-3_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04746-6

  • Online ISBN: 978-3-642-04747-3

  • eBook Packages: Computer Science, Computer Science (R0)