Merging Sets of Taxonomically Organized Data Using Concept Mappings under Uncertainty

  • David Thau
  • Shawn Bowers
  • Bertram Ludäscher
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5871)


We present a method for using aligned ontologies to merge taxonomically organized data sets that have apparently compatible schemas, but potentially different semantics for corresponding domains. We restrict the relationships involved in the alignment to basic set relations and disjunctions of these relations. A merged data set combines the domains of the source data set attributes, conforms to the observations reported in both data sets, and minimizes uncertainty introduced by ontology alignments. We find that even in very simple cases, merging data sets under this scenario is non-trivial. Reducing uncertainty introduced by the ontology alignments in combination with the data set observations often results in many possible merged data sets, which are managed using a possible worlds semantics. The primary contributions of this paper are a framework for representing aligned data sets and algorithms for merging data sets that report the presence and absence of taxonomically organized entities, including an efficient algorithm for a common data set merging scenario.
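As a rough illustration of how observations can interact with an uncertain alignment, the following minimal sketch enumerates possible worlds for a single pair of aligned concepts. It is a toy model, not the algorithm of the paper: the five relation names, the encoding of each relation as a set of permitted Venn regions, and the function `possible_worlds` are all assumptions introduced here for illustration.

```python
from itertools import product

# Venn regions for concept A (from source 1) and concept B (from source 2).
REGIONS = ("A_only", "both", "B_only")

# Assumed encoding: the regions in which an entity may be observed
# under each basic set relation between A and B.
ALLOWED = {
    "equals": {"both"},
    "A_inside_B": {"both", "B_only"},
    "B_inside_A": {"A_only", "both"},
    "overlap": {"A_only", "both", "B_only"},
    "disjoint": {"A_only", "B_only"},
}


def possible_worlds(alignment, a_present, b_present):
    """Enumerate (relation, occupied regions) pairs consistent with the data.

    alignment: set of relation names, read as a disjunction.
    a_present: whether source 1 reports A as present.
    b_present: whether source 2 reports B as present.
    """
    worlds = []
    for relation in sorted(alignment):
        for bits in product((False, True), repeat=len(REGIONS)):
            occupied = {r for r, bit in zip(REGIONS, bits) if bit}
            # The relation rules out entities in regions it does not permit.
            if not occupied <= ALLOWED[relation]:
                continue
            sees_a = bool(occupied & {"A_only", "both"})
            sees_b = bool(occupied & {"both", "B_only"})
            # Keep only worlds that agree with both reported observations.
            if sees_a == a_present and sees_b == b_present:
                worlds.append((relation, frozenset(occupied)))
    return worlds
```

Under these assumptions, calling `possible_worlds({"equals", "overlap"}, True, False)` leaves a single world in which only the `A_only` region is occupied and the relation is `overlap`: the observations eliminate `equals` from the disjunction. By contrast, `possible_worlds({"overlap"}, True, True)` yields five worlds, which hints at why the number of possible merged data sets can grow quickly even in simple cases.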


Keywords (added by machine, not by the authors): Compress Function; Ontology Concept; Context Attribute; Naive Algorithm; Ontology Alignment





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • David Thau (1)
  • Shawn Bowers (2)
  • Bertram Ludäscher (1, 2)

  1. Dept. of Computer Science, University of California Davis
  2. Genome Center, University of California Davis
