Auto-experimentation of KDD Workflows Based on Ontological Planning

  • Floarea Serban
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6497)


A central problem in Knowledge Discovery in Databases (KDD) is the lack of user support for solving KDD problems. Current Data Mining (DM) systems let the user design workflows manually, but this becomes difficult when there are too many operators to choose from or when the workflow is too large. We therefore propose auto-experimentation based on ontological planning to provide users with automatically generated workflows, together with rankings of those workflows by several criteria (execution time, accuracy, etc.). Auto-experimentation will also help to validate the generated workflows and to prune them, reducing their number. Furthermore, we will use mixed-initiative planning to let users set parameters and criteria that limit the planning search space and guide the planner towards better workflows.
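As a rough illustration of the ranking and pruning idea described above, the following Python sketch filters candidate workflows with user-supplied constraints and then orders the survivors by a weighted combination of accuracy and execution time. It is not the planner or the ranking scheme of the paper: the `WorkflowResult` type, the `prune` and `rank` functions, the weights, the operator names, and the example measurements are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class WorkflowResult:
    """Hypothetical record of one executed candidate workflow."""
    name: str
    operators: Tuple[str, ...]
    accuracy: float       # higher is better, in [0, 1]
    exec_seconds: float   # lower is better


def prune(results: List[WorkflowResult],
          max_operators: Optional[int] = None,
          banned_operators: FrozenSet[str] = frozenset()) -> List[WorkflowResult]:
    """Drop workflows that violate user-set constraints (assumed constraint types)."""
    kept = []
    for r in results:
        if max_operators is not None and len(r.operators) > max_operators:
            continue
        if banned_operators.intersection(r.operators):
            continue
        kept.append(r)
    return kept


def rank(results: List[WorkflowResult],
         accuracy_weight: float = 0.7,
         time_weight: float = 0.3) -> List[WorkflowResult]:
    """Order workflows by a weighted score; the weights are illustrative assumptions."""
    if not results:
        return []
    slowest = max(r.exec_seconds for r in results)

    def score(r: WorkflowResult) -> float:
        # Map execution time to [0, 1], where the slowest workflow scores 0.
        time_score = 1.0 - r.exec_seconds / slowest if slowest > 0 else 0.0
        return accuracy_weight * r.accuracy + time_weight * time_score

    return sorted(results, key=score, reverse=True)


# Made-up measurements for three made-up workflows.
candidates = [
    WorkflowResult("wf_tree", ("ReplaceMissing", "DecisionTree"), 0.82, 2.0),
    WorkflowResult("wf_svm", ("ReplaceMissing", "Normalize", "SVM"), 0.90, 10.0),
    WorkflowResult("wf_long", ("ReplaceMissing", "Normalize", "PCA", "SVM"), 0.91, 20.0),
]

# The user limits workflows to at most three operators, then asks for a ranking.
ranked = rank(prune(candidates, max_operators=3))
for r in ranked:
    print(r.name, r.accuracy, r.exec_seconds)
```

In this sketch the user constraint plays the role of the mixed-initiative input: it removes candidates before ranking, so the ordering is computed only over workflows the user would accept.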



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Floarea Serban, Dynamic and Distributed Information Systems Group, Department of Informatics, University of Zurich, Zurich, Switzerland
