Context-Sensitive Clinical Data Integration

  • James F. Terwilliger
  • Lois M. L. Delcambre
  • Judith Logan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4254)


Current methods for data integration are as difficult to use as they are powerful. Motivated by our work with clinical data and the people who analyze it, we present two components that allow non-technical users who are domain experts to create and reuse complex data integration processes. The GUAVA (GUI As View Apparatus) component enables data analysts to make informed data integration decisions based on detailed accounts of the user interface that was used to generate the data. The MultiClass component allows analysts to revisit decisions made for prior studies and choose whether to reuse them each time the data is used. We describe these two components with examples in which a warehouse of clinical data is used to support research studies. We describe the state of our implementation and why we believe the two components can be automatically translated into ETL workflows.


Keywords: Data Integration, Study Schema, Conjunctive Query, Data Analyst, Reporting Tool





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • James F. Terwilliger (1)
  • Lois M. L. Delcambre (1)
  • Judith Logan (2)

  1. Department of Computer Science, Portland State University, Portland, USA
  2. Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health and Science University, Portland, USA
