An Ontology-Driven Framework for Data Transformation in Scientific Workflows

  • Shawn Bowers
  • Bertram Ludäscher
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2994)

Abstract

Ecologists spend considerable effort integrating heterogeneous data for statistical analyses and simulations, for example, to run and test predictive models. Our research is focused on reducing this effort by providing data integration and transformation tools, allowing researchers to focus on “real science,” that is, discovering new knowledge through analysis and modeling. This paper defines a generic framework for transforming heterogeneous data within scientific workflows. Our approach relies on a formalized ontology, which serves as a simple, unstructured global schema. In the framework, inputs and outputs of services within scientific workflows can have structural types and separate semantic types (expressions of the target ontology). In addition, a registration mapping can be defined to relate input and output structural types to their corresponding semantic types. Using registration mappings, appropriate data transformations can then be generated for each desired service composition. Here, we describe our proposed framework and an initial implementation for services that consume and produce XML data.
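
The role of registration mappings can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a flat XML layout of repeated records, a hypothetical `Registration` class that maps ontology concept names to element names, and concept matching by exact name (a simplification of reasoning over a formal ontology). The concept names `Species` and `Abundance` and all element names are invented for illustration.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Registration:
    """Hypothetical registration mapping for one service port.

    Relates a simple structural type (root element, repeating record
    element, child fields) to semantic types (ontology concept names).
    """
    root: str     # name of the document root element
    record: str   # name of the repeating record element
    fields: dict  # ontology concept name -> child element name


def generate_transformation(source: Registration, target: Registration):
    """Build a function that rewrites source-shaped XML into target-shaped XML.

    Fields are paired through the ontology concept they are registered to.
    Concepts are compared by exact name only (an assumption of this sketch).
    """
    missing = [c for c in target.fields if c not in source.fields]
    if missing:
        raise ValueError(f"source does not provide concepts: {missing}")

    def transform(xml_text: str) -> str:
        src_root = ET.fromstring(xml_text)
        out_root = ET.Element(target.root)
        for rec in src_root.iter(source.record):
            out_rec = ET.SubElement(out_root, target.record)
            for concept, target_name in target.fields.items():
                child = ET.SubElement(out_rec, target_name)
                # Look up the source field registered to the same concept.
                child.text = rec.findtext(source.fields[concept])
        return ET.tostring(out_root, encoding="unicode")

    return transform


# Output port of an upstream service and input port of a downstream one.
source = Registration("observations", "obs",
                      {"Species": "taxon", "Abundance": "count"})
target = Registration("dataset", "record",
                      {"Species": "species", "Abundance": "abundance"})

to_target = generate_transformation(source, target)
result = to_target(
    "<observations><obs><taxon>Quercus alba</taxon>"
    "<count>12</count></obs></observations>"
)
print(result)
```

In this sketch the two services never reference each other's element names; each is registered against the shared concepts once, and the transformation for a given composition is derived from the pair of registrations.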

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Shawn Bowers (1)
  • Bertram Ludäscher (1)
  1. San Diego Supercomputer Center, University of California, San Diego, La Jolla, USA
