Using Ontologies in Semantic Data Mining with SEGS and g-SEGS

  • Nada Lavrač
  • Anže Vavpetič
  • Larisa Soldatova
  • Igor Trajkovski
  • Petra Kralj Novak
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6926)

Abstract

With the expansion of the Semantic Web and the availability of numerous ontologies that provide domain background knowledge and semantic descriptors for the data, the amount of semantic data is growing rapidly. The data mining community is faced with a paradigm shift: instead of mining the abundance of empirical data supported by the background knowledge, the new challenge is to mine the abundance of knowledge encoded in domain ontologies, constrained by the heuristics computed from the empirical data collection. We address this challenge with an approach called semantic data mining, in which domain ontologies define the hypothesis search space and the data is used as a means of constraining and guiding the process of hypothesis search and evaluation. The use of the prototype semantic data mining systems SEGS and g-SEGS is demonstrated in a simple semantic data mining scenario and in two real-life functional genomics scenarios of mining biological ontologies with the support of experimental microarray data.
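
The idea that an ontology defines the hypothesis space while the data constrains the search can be illustrated with a minimal sketch. This is not the SEGS or g-SEGS algorithm: the toy ontology, the example annotations, the `min_support` threshold and the function names are all hypothetical, and weighted relative accuracy (WRAcc) is swapped in here simply as one familiar subgroup-discovery heuristic.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: each term maps to its direct subterms (is-a children).
ONTOLOGY: Dict[str, List[str]] = {
    "thing": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "vehicle": ["car"],
    "dog": [], "cat": [], "car": [],
}

# Hypothetical data: example id -> (most specific ontology annotation, class label).
EXAMPLES: Dict[str, Tuple[str, bool]] = {
    "e1": ("dog", True), "e2": ("cat", True), "e3": ("dog", True),
    "e4": ("car", False), "e5": ("car", False), "e6": ("cat", True),
}


def descendants(term: str) -> Set[str]:
    """Return the term itself plus every term below it in the ontology."""
    found = {term}
    for child in ONTOLOGY[term]:
        found |= descendants(child)
    return found


def covered(term: str) -> Set[str]:
    """Examples annotated with the term or with any of its subterms."""
    below = descendants(term)
    return {ex for ex, (annotation, _) in EXAMPLES.items() if annotation in below}


def wracc(cov: Set[str]) -> float:
    """Weighted relative accuracy: p(cov) * (p(pos | cov) - p(pos))."""
    if not cov:
        return 0.0
    n = len(EXAMPLES)
    positives = sum(1 for _, label in EXAMPLES.values() if label)
    covered_pos = sum(1 for ex in cov if EXAMPLES[ex][1])
    return (len(cov) / n) * (covered_pos / len(cov) - positives / n)


def search(root: str, min_support: int) -> List[Tuple[str, float]]:
    """Top-down search over ontology terms, pruned by data coverage."""
    results: List[Tuple[str, float]] = []
    stack, seen = [root], set()
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.add(term)
        cov = covered(term)
        # Subterms cover subsets of this term's examples, so a term that is
        # too small can be pruned together with everything below it.
        if len(cov) < min_support:
            continue
        results.append((term, wracc(cov)))
        stack.extend(ONTOLOGY[term])
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    for term, score in search("thing", min_support=3):
        print(f"{term}: {score:.3f}")
```

In this sketch the ontology alone determines which hypotheses (terms) can be considered, while the examples decide which branches are worth expanding and how each surviving term is scored.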

Keywords

Semantic data mining; ontologies; background knowledge; relational data mining

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nada Lavrač (1, 2)
  • Anže Vavpetič (1)
  • Larisa Soldatova (3)
  • Igor Trajkovski (4)
  • Petra Kralj Novak (1)

  1. Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
  2. University of Nova Gorica, Nova Gorica, Slovenia
  3. Aberystwyth University, Wales, United Kingdom
  4. Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and Methodius University, Skopje, Macedonia
