Database Schema Matching Using Machine Learning with Feature Selection

  • Jacob Berlin
  • Amihai Motro
Conference paper

DOI: 10.1007/3-540-47961-9_32

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2348)
Cite this paper as:
Berlin J., Motro A. (2002) Database Schema Matching Using Machine Learning with Feature Selection. In: Pidduck A.B., Ozsu M.T., Mylopoulos J., Woo C.C. (eds) Advanced Information Systems Engineering. CAiSE 2002. Lecture Notes in Computer Science, vol 2348. Springer, Berlin, Heidelberg

Abstract

Schema matching, the problem of finding mappings between the attributes of two semantically related database schemas, is an important aspect of many database applications such as schema integration, data warehousing, and electronic commerce. Unfortunately, schema matching remains largely a manual, labor-intensive process. Furthermore, the effort required is typically linear in the number of schemas to be matched; the next pair of schemas to match is not any easier than the previous pair. In this paper we describe a system, called Automatch, that uses machine learning techniques to automate schema matching. Based primarily on Bayesian learning, the system acquires probabilistic knowledge from examples that have been provided by domain experts. This knowledge is stored in a knowledge base called the attribute dictionary. When presented with a pair of new schemas that need to be matched (and their corresponding database instances), Automatch uses the attribute dictionary to find an optimal matching. We also report initial results from the Automatch project.
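The pipeline the abstract describes (learn probabilistic knowledge from expert-labelled examples, store it in an attribute dictionary, then use it to match two new schemas) can be illustrated with a small sketch. This is not the Automatch implementation: the function names, the toy data, the Laplace-smoothed naive Bayes scoring over whole values, and the brute-force search for the best one-to-one matching are all assumptions chosen for brevity.

```python
import math
from collections import Counter
from itertools import permutations


def build_dictionary(examples):
    """Hypothetical attribute dictionary: per-attribute value counts
    taken from expert-provided example columns."""
    return {name: Counter(values) for name, values in examples.items()}


def avg_log_likelihood(values, counts, vocab_size, alpha=1.0):
    """Average log P(value | dictionary attribute) with Laplace smoothing."""
    denom = sum(counts.values()) + alpha * vocab_size
    return sum(math.log((counts[v] + alpha) / denom) for v in values) / len(values)


def pair_score(values_a, values_b, dictionary, vocab_size):
    """Score a candidate pair by the dictionary attribute that best explains both columns."""
    return max(
        avg_log_likelihood(values_a, counts, vocab_size)
        + avg_log_likelihood(values_b, counts, vocab_size)
        for counts in dictionary.values()
    )


def match_schemas(schema_a, schema_b, dictionary):
    """Exhaustively search one-to-one matchings (assumes len(schema_a) <= len(schema_b))."""
    # Distinct known values plus one slot for unseen values.
    vocab_size = len({v for counts in dictionary.values() for v in counts}) + 1
    names_a = list(schema_a)
    best_total, best_matching = -math.inf, {}
    for perm in permutations(schema_b, len(names_a)):
        total = sum(
            pair_score(schema_a[a], schema_b[b], dictionary, vocab_size)
            for a, b in zip(names_a, perm)
        )
        if total > best_total:
            best_total, best_matching = total, dict(zip(names_a, perm))
    return best_matching


# Toy data, invented for illustration only.
dictionary = build_dictionary({
    "city": ["Fairfax", "Boston", "Austin", "Fairfax"],
    "state": ["VA", "MA", "TX", "VA"],
})
schema_a = {"town": ["Boston", "Austin"], "st": ["TX", "MA"]}
schema_b = {"region": ["VA", "TX"], "municipality": ["Fairfax", "Boston"]}

matching = match_schemas(schema_a, schema_b, dictionary)
print(matching)  # {'town': 'municipality', 'st': 'region'}
```

The exhaustive search over permutations is only practical for a handful of attributes; a real system would need a polynomial-time assignment method, and would likely score features richer than exact value matches.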

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Jacob Berlin (1)
  • Amihai Motro (1)
  1. Information and Software Engineering Department, George Mason University, Fairfax, Virginia
