Similarity assessment for scientific workflow clustering and recommendation



This article proposes a technique for identifying and recommending scientific workflows for reuse and repurposing. A scientific workflow is represented as a layer hierarchy that specifies the hierarchical relations among the workflow, its sub-workflows, and its activities. Semantic similarity is then calculated between the layer hierarchies of workflows, and a graph-skeleton-based clustering technique groups the layer hierarchies into clusters. The barycenters of each cluster are identified and serve as its core workflows, which facilitates cluster identification as well as workflow ranking and recommendation with respect to scientists' requirements.



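This article aims to enable the reuse and repurposing of scientific workflows, and to this end proposes an effective technique for identifying and recommending them. First, scientific workflows are represented with a hierarchical model that clearly describes the layered relations between a scientific workflow and its internal sub-workflows and activities. On this basis, a strategy is proposed for assessing the similarity between two hierarchical models, and existing hierarchical models are clustered with a graph-skeleton-based clustering algorithm. Finally, the barycenters of each cluster are identified to represent its core workflows, which improves the speed and quality of cluster identification and of workflow ranking and recommendation, thereby meeting users' needs.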
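The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Node` class, the per-layer Jaccard overlap used in place of the article's semantic similarity, and the reading of a barycenter as the cluster member with the highest mean similarity to the others. The abstract does not specify any of these details, and the graph-skeleton clustering step is omitted, so the cluster is simply given as a list.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A workflow, sub-workflow, or activity in a layer hierarchy (hypothetical model)."""
    name: str
    children: list["Node"] = field(default_factory=list)


def layers(root: Node) -> list[set[str]]:
    """Return the set of node names found at each depth, root layer first."""
    result, frontier = [], [root]
    while frontier:
        result.append({n.name for n in frontier})
        frontier = [c for n in frontier for c in n.children]
    return result


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two name sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity(w1: Node, w2: Node) -> float:
    """Mean per-layer Jaccard overlap; a layer missing in one hierarchy is empty.

    Assumption: a stand-in for the article's semantic similarity measure.
    """
    l1, l2 = layers(w1), layers(w2)
    depth = max(len(l1), len(l2))
    l1 += [set()] * (depth - len(l1))
    l2 += [set()] * (depth - len(l2))
    return sum(jaccard(a, b) for a, b in zip(l1, l2)) / depth


def barycenter(cluster: list[Node]) -> Node:
    """Return the member with the highest mean similarity to the other members."""
    def mean_sim(w: Node) -> float:
        others = [o for o in cluster if o is not w]
        if not others:
            return 1.0
        return sum(similarity(w, o) for o in others) / len(others)
    return max(cluster, key=mean_sim)


def rank(query: Node, candidates: list[Node]) -> list[Node]:
    """Order candidate workflows by decreasing similarity to a requirement."""
    return sorted(candidates, key=lambda w: similarity(query, w), reverse=True)


# Three toy workflows that share a root but differ in their activities.
wf_a = Node("align", [Node("fetch"), Node("blast")])
wf_b = Node("align", [Node("fetch"), Node("blast"), Node("plot")])
wf_c = Node("align", [Node("fetch"), Node("plot")])
core = barycenter([wf_a, wf_b, wf_c])
```

In this toy cluster `wf_b` overlaps most with both neighbours, so it is selected as the core workflow. A requirement could then be compared against core workflows first and against cluster members second, which is one plausible way a barycenter could speed up recommendation.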




Author information



Corresponding author

Correspondence to Zhangbing Zhou.


Cite this article

Zhou, Z., Cheng, Z. & Zhu, Y. Similarity assessment for scientific workflow clustering and recommendation. Sci. China Inf. Sci. 59, 113101 (2016).



  • scientific workflow
  • reuse and repurposing
  • layer hierarchy
  • graph-skeleton
  • recommendation

