SINDBAD and SiQL: An Inductive Database and Query Language in the Relational Model

  • Jörg Wicker
  • Lothar Richter
  • Kristina Kessler
  • Stefan Kramer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5212)

Abstract

In this demonstration, we will present the concepts and an implementation of an inductive database – as proposed by Imielinski and Mannila – in the relational model. The goal is to support all steps of the knowledge discovery process on the basis of queries to a database system. The query language SiQL (structured inductive query language), an SQL extension, offers query primitives for feature selection, discretization, pattern mining, clustering, instance-based learning and rule induction. A prototype system processing such queries was implemented as part of the SINDBAD (structured inductive database development) project. To support the analysis of multi-relational data, we incorporated multi-relational distance measures based on set distances and recursive descent. The inclusion of rule-based classification models made it necessary to extend the data model and software architecture significantly. The prototype is applied to three different data sets: gene expression analysis, gene regulation prediction and structure-activity relationships (SARs) of small molecules.
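The abstract does not spell out how the multi-relational distance measures are computed. The following is only a minimal sketch of the general idea of combining a set distance with recursive descent, not the SINDBAD implementation. The function names, the averaged minimum-distance set measure, and the assumption that numeric attributes are pre-scaled to [0, 1] are all illustrative choices that do not come from the paper.

```python
# Illustrative sketch only: a recursive distance over nested records.
# All names and the specific set distance are assumptions, not SINDBAD's API.

def value_distance(a, b):
    """Distance in [0, 1] between two attribute values.

    Descends recursively into nested sets (related tuples from another
    relation) and nested tuples (rows).
    """
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        return set_distance(a, b)
    if isinstance(a, tuple) and isinstance(b, tuple):
        return tuple_distance(a, b)
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        # Assumption: numeric attributes are already scaled to [0, 1].
        return min(abs(a - b), 1.0)
    # Nominal values (or mismatched types): 0 if equal, 1 otherwise.
    return 0.0 if a == b else 1.0


def tuple_distance(a, b):
    """Mean attribute-wise distance between two rows of equal arity."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of attributes")
    if not a:
        return 0.0
    return sum(value_distance(x, y) for x, y in zip(a, b)) / len(a)


def set_distance(A, B):
    """Averaged minimum distance between two sets of rows (one simple
    choice of set distance; others could be substituted)."""
    if not A and not B:
        return 0.0
    if not A or not B:
        return 1.0
    a_to_b = sum(min(value_distance(x, y) for y in B) for x in A)
    b_to_a = sum(min(value_distance(x, y) for x in A) for y in B)
    return (a_to_b + b_to_a) / (len(A) + len(B))


# Hypothetical example: a molecule row holding a scaled activity value and
# the set of its atom rows (element, scaled partial charge).
mol_a = (0.2, frozenset({("C", 0.1), ("O", 0.5)}))
mol_b = (0.4, frozenset({("C", 0.1)}))

dist_ab = tuple_distance(mol_a, mol_b)
```

In this sketch, the set-valued attribute stands in for the rows of a related table that would be reached by following a foreign key; a distance of this kind could then serve clustering or instance-based learning over multi-relational data.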

Keywords

Data Mining; Query Language; Pattern Mining; Inductive Logic Programming; Knowledge Discovery Process

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jörg Wicker (1)
  • Lothar Richter (1)
  • Kristina Kessler (1)
  • Stefan Kramer (1)
  1. Institut für Informatik I12, Technische Universität München, Garching b. München, Germany