Using Provenance to Improve Workflow Design
With the growing popularity of scientific workflow management systems (WfMS), workflow specifications are becoming increasingly available. Provenance support in WfMS can help users reuse third-party code, since available programs can be browsed through provenance queries rather than ad hoc searches on the Web. However, finding dependencies among programs or services through provenance queries is not a trivial task without tool support. Given the large number of program versions available, each with its own configuration parameters, the task is error-prone and can be counterproductive. In this work we propose a recommendation service that suggests frequent combinations of scientific programs for reuse. The service is designed to work over WfMS that provide provenance on workflow specifications and execution logs. It is based on software component reuse and data mining techniques, and we have implemented a prototype with the VisTrails WfMS.
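To make the idea of suggesting frequent combinations concrete, the following is a minimal sketch, not the service described in this work. It assumes that each workflow specification has already been extracted from provenance as a list of (producer, consumer) program pairs, and it uses simple support counting in place of whatever data mining technique the prototype applies. The function names, the module names, and the `min_support` threshold are all hypothetical and are not part of any VisTrails API.

```python
from collections import Counter


def frequent_connections(workflows, min_support):
    """Return the (producer, consumer) pairs found in at least min_support workflows.

    `workflows` is a hypothetical representation: one list of
    (producer, consumer) pairs per workflow specification.
    """
    counts = Counter()
    for connections in workflows:
        # Use a set so a pair is counted at most once per workflow.
        counts.update(set(connections))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def recommend(frequent, module):
    """Rank the programs that frequently consume the output of `module`."""
    followers = [
        (target, n) for (source, target), n in frequent.items() if source == module
    ]
    # Most frequent first; ties are broken alphabetically for a stable order.
    return [target for target, _ in sorted(followers, key=lambda x: (-x[1], x[0]))]


# Illustrative data only; the module names are invented for this example.
workflows = [
    [("ReadFile", "Filter"), ("Filter", "Plot")],
    [("ReadFile", "Filter"), ("Filter", "Export")],
    [("ReadFile", "Normalize"), ("Normalize", "Plot")],
    [("ReadFile", "Filter"), ("Filter", "Plot")],
]

frequent = frequent_connections(workflows, min_support=2)
print(recommend(frequent, "ReadFile"))
print(recommend(frequent, "Filter"))
```

In this sketch, a designer who has just placed a given program in a workflow would be shown the programs that most often follow it in existing specifications. A fuller treatment would also need to account for program versions and configuration parameters.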