A Calculus for Propagating Semantic Annotations Through Scientific Workflow Queries

  • Shawn Bowers
  • Bertram Ludäscher
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4254)


Scientific workflows facilitate automation, reuse, and reproducibility of scientific data management and analysis tasks. Scientific workflows are often modeled as dataflow networks, chaining together processing components (called actors) that query, transform, analyse, and visualize scientific datasets. Semantic annotations relate data and actor schemas with conceptual information from a shared ontology, to support scientific workflow design, discovery, reuse, and validation in the presence of thousands of potentially useful actors and datasets. However, the creation of semantic annotations is complex and time-consuming. We present a calculus and two inference algorithms to automatically propagate semantic annotations through workflow actors described by relational queries. Given an input annotation α and a query q, forward propagation computes an output annotation α′; conversely, backward propagation infers α from q and α′.
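The two propagation directions can be illustrated with a deliberately small sketch. It is not the calculus or the inference algorithms of the paper. It assumes a much simpler setting in which the query q only projects and renames attributes, and an annotation maps each attribute name to a set of ontology concepts. The function names `forward` and `backward`, the attribute names, and the concept names are all hypothetical.

```python
def forward(alpha, q):
    """Forward propagation sketch: compute an output annotation from alpha and q.

    alpha: dict mapping input attribute -> set of ontology concepts (assumed encoding)
    q:     dict mapping output attribute -> the input attribute it copies
           (assumed query form: projection with renaming only)
    """
    return {out: set(alpha[src]) for out, src in q.items() if src in alpha}


def backward(alpha_prime, q):
    """Backward propagation sketch: infer an input annotation from q and alpha_prime.

    Only input attributes that q actually uses can be annotated this way;
    attributes that q projects away remain unannotated.
    """
    alpha = {}
    for out, src in q.items():
        if out in alpha_prime:
            alpha.setdefault(src, set()).update(alpha_prime[out])
    return alpha


# Hypothetical input annotation over attributes site, biomass, notes.
alpha = {
    "site": {"Location"},
    "biomass": {"Productivity"},
    "notes": {"Comment"},
}

# Hypothetical query: keep site (renamed plot) and biomass (renamed prod).
q = {"plot": "site", "prod": "biomass"}

alpha_prime = forward(alpha, q)
alpha_back = backward(alpha_prime, q)
```

In this restricted setting the backward direction recovers annotations only for `site` and `biomass`, since `notes` never reaches the output. Richer relational queries (selections, joins, unions) need more than this attribute-level copying.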


Keywords: Inference Rule, Inference Algorithm, Semantic Annotation, Relational Query, Forward Propagation
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Shawn Bowers (1)
  • Bertram Ludäscher (1, 2)

  1. UC Davis Genome Center
  2. Department of Computer Science, University of California, Davis
