Towards Intelligent Integration of Heterogeneous Information Sources

  • Shamkant B. Navathe
  • Michael J. Donahoo

Abstract

Current methodologies for information integration are inadequate for integrating large-scale, distributed information sources (e.g., databases, free-form text, simulations). The existing approaches are either too restrictive and complicated, as in the “federated” (global model) approach, or do not provide the necessary functionality, as in the “multidatabase” approach. We propose a hybrid approach combining the advantages of both the federated and multidatabase techniques, which we believe provides the most feasible avenue for large-scale integration. Under our architecture, the individual data site administrators provide an augmented export schema specifying knowledge about the sources of data (where data exists), their structure (underlying data model or file structure), their content (what data exists), and their relationships (how the data relates to other information in its domain). The augmented export schema from each information source gives an intelligent agent, called the “mediator”, knowledge that it can use to infer some of the existing inter-system relationships. This knowledge can then be used to generate a partially integrated, global view of the data.
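
The abstract does not say how an augmented export schema is represented or how the mediator reasons over it, so the following is only a minimal sketch under stated assumptions. The class `AugmentedExportSchema`, its field names, the function `infer_candidate_links`, and the two sample sources are all hypothetical. Simple name matching stands in here for whatever inference the mediator actually performs.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AugmentedExportSchema:
    """Hypothetical stand-in for the metadata one site administrator exports."""
    source: str       # where the data exists
    structure: str    # underlying data model or file structure
    content: frozenset  # what data exists, as a set of concept names
    relationships: frozenset = frozenset()  # (concept, relation, concept) triples


def infer_candidate_links(schemas):
    """Naive mediator step (an assumption, not the paper's method):
    propose a link wherever two sources export a concept with the same name."""
    links = []
    for a, b in combinations(schemas, 2):
        for concept in sorted(a.content & b.content):
            links.append((a.source, b.source, concept))
    return links


# Two invented sources with different underlying structures.
library = AugmentedExportSchema(
    source="library_db",
    structure="relational",
    content=frozenset({"author", "title", "call_number"}),
    relationships=frozenset({("author", "writes", "title")}),
)
reports = AugmentedExportSchema(
    source="tech_reports",
    structure="free-form text",
    content=frozenset({"author", "title", "abstract"}),
)

# Candidate inter-system relationships the mediator could offer for review.
candidate_links = infer_candidate_links([library, reports])
```

The resulting links cover only the concepts both sources share, which mirrors the idea of a partially integrated global view: sources are connected where the exported knowledge supports it, and everything else stays local.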

Keywords

Archie Almaden 

Copyright information

© Springer Science+Business Media New York 1996

Authors and Affiliations

  • Shamkant B. Navathe (1)
  • Michael J. Donahoo (1)
  1. College of Computing, Georgia Institute of Technology, Atlanta, USA
