A Discovery-Based Approach to Database Ontology Design

  • Silvana Castano
  • Valeria De Antonellis


In this paper, we introduce an approach to task-driven ontology design that is based on information discovery from database schemas. We propose techniques for semi-automatically discovering the terms and relationships used in the information space, which denote concepts, their properties, and links. The techniques are applied in two stages. The first stage focuses on discovering heterogeneity and ambiguity of data representations in different schemas. For this purpose, schema elements are compared according to defined comparison features, and similarity coefficients are evaluated. This stage produces a set of candidates for unification into ontology concepts. In the second stage, decisions are made on which candidates to unify into concepts and on how to relate concepts by semantic links. Ontology concepts and links can be accessed according to different perspectives, so that the ontology can serve different purposes, such as providing a search space for powerful concept-location mechanisms, setting a basis for query formulation and processing, and establishing a reference for recognizing terminological relationships between elements in different schemas.
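The first stage described above can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the comparison features or the coefficient, so this example assumes that each schema element is described only by its set of attribute names, uses the Dice coefficient as a stand-in similarity measure, and groups elements whose similarity reaches an arbitrary threshold. The schemas, the names `dice` and `candidate_clusters`, and the threshold value are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy schemas: element name -> set of attribute names.
SCHEMAS = {
    "hr.Employee": {"name", "salary", "department", "hire_date"},
    "payroll.Worker": {"name", "salary", "department", "tax_code"},
    "sales.Customer": {"name", "address", "credit_limit"},
}


def dice(a, b):
    """Dice coefficient between two attribute sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def candidate_clusters(elements, threshold):
    """Group elements linked by a pairwise similarity >= threshold.

    Uses a small union-find, so groups are the connected components of
    the "similar enough" graph (single-link grouping).
    """
    parent = {e: e for e in elements}

    def find(x):
        # Follow parent pointers to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(elements, 2):
        if dice(elements[a], elements[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for e in elements:
        groups.setdefault(find(e), set()).add(e)
    # Sort for a deterministic result.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    for cluster in candidate_clusters(SCHEMAS, threshold=0.5):
        print(cluster)
```

With the assumed threshold of 0.5, `hr.Employee` and `payroll.Worker` share three of their four attributes and fall into one candidate group, while `sales.Customer` stays alone. Deciding whether such a group should actually be unified into a concept is the second, decision-making stage, which this sketch does not attempt to model.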


Ontology design; Similarity techniques; Schema analysis and clustering; Distributed and heterogeneous databases





Copyright information

© Springer Science+Business Media New York 1999

Authors and Affiliations

  • Silvana Castano (1)
  • Valeria De Antonellis (2)

  1. Dipartimento di Scienze dell’Informazione, University of Milano, Milano, Italy
  2. Dipartimento di Elettronica per l’Automazione, University of Brescia, Brescia, Italy