Matching object catalogues

  • Original Paper
Innovations in Systems and Software Engineering

Abstract

A catalogue holds information about a set of objects, typically classified using terms taken from a given thesaurus and described by a set of attributes. Matching a pair of catalogues means finding a relationship between the terms of their thesauri and a relationship between their attributes. This paper first introduces a matching approach, based on the notion of similarity, that applies to both thesaurus and attribute matching. It then describes matchings based on mutual information and introduces variations that explore certain heuristics. Finally, it discusses experimental results that evaluate the precision of the matchings and measure the influence of the heuristics.
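As a rough illustration of what a mutual-information-based thesaurus matching could look like, the sketch below scores pairs of terms by pointwise mutual information (PMI) over objects that appear in both catalogues. This is not the paper's algorithm: the abstract does not give the exact formulation, so the use of PMI, the function name `pmi_matches`, the input format (one `(term_a, term_b)` pair per shared object), and the sample terms are all assumptions made for the example.

```python
import math
from collections import Counter


def pmi_matches(pairs):
    """Score term pairs by pointwise mutual information (hypothetical helper).

    pairs: list of (term_a, term_b) tuples, one per object assumed to be
    present in both catalogues, giving its classification in each thesaurus.
    Returns (scores, best): PMI per observed term pair, and for each term of
    the first thesaurus the highest-scoring term of the second.
    """
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)

    scores = {}
    for (a, b), n_ab in joint.items():
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with counts as estimates
        scores[(a, b)] = math.log2(n_ab * n / (count_a[a] * count_b[b]))

    best = {}
    for (a, b), score in scores.items():
        if a not in best or score > best[a][1]:
            best[a] = (b, score)
    return scores, best


# Illustrative data only: eight objects shared by two made-up catalogues.
sample = (
    [("river", "stream")] * 3
    + [("river", "populated place")]
    + [("city", "populated place")] * 4
)
scores, best = pmi_matches(sample)
for term_a, (term_b, score) in sorted(best.items()):
    print(f"{term_a} -> {term_b} (PMI = {score:.2f})")
```

Plain PMI tends to overrate term pairs that co-occur only once or twice, so a practical matcher would likely add a minimum-count threshold or a weighting step; the heuristics actually evaluated in the paper are not described in this abstract.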

Author information

Corresponding author

Correspondence to Luiz André P. Leme.

About this article

Cite this article

Leme, L.A.P., Brauner, D.F., Breitman, K.K. et al. Matching object catalogues. Innovations Syst Softw Eng 4, 315–328 (2008). https://doi.org/10.1007/s11334-008-0070-3

  • DOI: https://doi.org/10.1007/s11334-008-0070-3
