An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations

Part of the Lecture Notes in Computer Science book series (LNAI, volume 10024)

Abstract

Workflow formalisations often focus on representing a process with the primary objective of supporting its execution. However, in some scenarios what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating the workflow with the semantic relations that occur between these data artefacts. However, manually producing such annotations is difficult and time-consuming. In this paper we introduce a recommendation-based method to support users in this task. Our approach is centred on an incremental association rule mining technique that compensates for the cold-start problem caused by the lack of a training set of annotated workflows. We discuss the implementation of a tool relying on this approach and show how its application to an existing repository of workflows effectively enables the generation of such annotations.
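
To make the idea of incremental, recommendation-driven annotation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each item to annotate is described by a set of string features and that each annotation is a set of relation names. The class `IncrementalRecommender`, the feature strings, and the relation names are all hypothetical placeholders. The sketch only counts single-feature co-occurrences and ranks relations by rule confidence; it does not perform the closed-itemset mining suggested by the keywords.

```python
from collections import Counter, defaultdict


class IncrementalRecommender:
    """Toy recommender: learns 'feature -> relation' rules one annotation at a time."""

    def __init__(self):
        # How many annotated items contained each feature.
        self.feature_count = Counter()
        # For each feature, how often each relation was chosen alongside it.
        self.pair_count = defaultdict(Counter)

    def learn(self, features, relations):
        """Update the counts with one newly annotated item (no retraining)."""
        for feature in set(features):
            self.feature_count[feature] += 1
            for relation in set(relations):
                self.pair_count[feature][relation] += 1

    def recommend(self, features, min_confidence=0.5):
        """Return (relation, confidence) pairs, best first.

        Confidence of the rule feature -> relation is
        count(feature and relation) / count(feature).
        """
        best = {}
        for feature in set(features):
            seen = self.feature_count.get(feature, 0)
            if seen == 0:
                continue  # unseen feature: nothing to recommend from it
            for relation, together in self.pair_count[feature].items():
                confidence = together / seen
                if confidence >= min_confidence:
                    best[relation] = max(best.get(relation, 0.0), confidence)
        return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical feature and relation names, for illustration only.
rec = IncrementalRecommender()
cold_start = rec.recommend({"service:blast", "in:sequence"})  # nothing learned yet

rec.learn({"service:blast", "in:sequence"}, {"hasDerivation"})
rec.learn({"service:blast", "in:id"}, {"hasDerivation"})
rec.learn({"service:split", "in:sequence"}, {"hasPart"})

warm = rec.recommend({"service:blast", "in:sequence"})
```

Before any annotation exists, `cold_start` is an empty list; after three annotations, `warm` ranks `hasDerivation` (confidence 1.0) ahead of `hasPart` (confidence 0.5). Each accepted annotation updates the counts immediately, so recommendations can improve without a pre-existing training set.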

Keywords

  • Association Rules
  • DataNode
  • Port Pair
  • Annotated Items
  • Closed Itemsets

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  • DOI: 10.1007/978-3-319-49004-5_9
  • Chapter length: 16 pages

Notes

  1. myExperiment: http://www.myexperiment.org/.

  2. W3C PROV: https://www.w3.org/TR/prov-overview/.

  3. OPMW: http://www.opmw.org/.

  4. PWO: http://purl.org/spar/pwo.

  5. Wings: http://www.wings-workflows.org/.

  6. myExperiment: http://www.myexperiment.org/.

  7. SHIWA: http://www.shiwa-workflow.eu/wiki/-/wiki/Main/SHIWA+Repository.

  8. SCUFL2: https://taverna.incubator.apache.org/documentation/scufl2/.

  9. Datanode: http://purl.org/datanode/ns/.

  10. In this paper we use the terminology of the SCUFL2 specification. However, the basic structure is a common one. In the W3C PROV-O model this concept maps to the class Activity, in PWO to Step, and in OPMW to WorkflowExecutionProcess, to mention just a few examples.

  11. “LipidMaps Query” workflow from myExperiment: http://www.myexperiment.org/workflows/1052.html.

  12. Dinowolf: http://github.com/enridaga/dinowolf.

  13. SCUFL2 Specification: https://taverna.incubator.apache.org/documentation/scufl2/.

  14. Apache Taverna: https://taverna.incubator.apache.org/.

  15. Apache Lucene: https://lucene.apache.org/core/.

  16. DBPedia Spotlight: http://spotlight.dbpedia.org/.

  17. DBPedia: http://dbpedia.org/.

  18. myExperiment: http://www.myexperiments.org.


Author information

Corresponding author

Correspondence to Enrico Daga.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Daga, E., d’Aquin, M., Gangemi, A., Motta, E. (2016). An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_9

  • DOI: https://doi.org/10.1007/978-3-319-49004-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49003-8

  • Online ISBN: 978-3-319-49004-5

  • eBook Packages: Computer Science, Computer Science (R0)