Abstract
The first component of data integration is schema alignment. As we showed in Section 1.2.3, there can be thousands to millions of data sources in the same domain, but they often describe the domain using different schemas. As an illustration, in the motivating example in Section 1.1, the four sources describe the flight domain using very different schemas: they contain different numbers of tables and different numbers of attributes; they may use different attribute names for the same attribute (e.g., Scheduled Arrival Date in Airline2.Flight vs. Scheduled in Airport3.Arrivals); they may apply different semantics for attributes with the same name (e.g., Arrival Time may mean landing time in one source and arrival-at-gate time in another source). To integrate data from different sources, the first step is to align the schemas and understand which attributes have the same semantics and which ones do not.
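The naming and semantics mismatches described above can be illustrated with a small sketch. It is not the chapter's method: it assumes a hand-written mapping from (source table, attribute name) pairs to mediated attribute names. The mediated names, the source tables `SourceA.Flights` and `SourceB.Flights`, and the sample values are hypothetical and exist only for illustration. Keying the mapping on the source as well as the attribute name lets two attributes that are both called Arrival Time map to different meanings.

```python
# A minimal sketch of attribute-level schema alignment, assuming the
# correspondences are already known. All mediated names below are hypothetical.
ATTRIBUTE_MAPPING = {
    # Different names, same semantics (example pair taken from the abstract).
    ("Airline2.Flight", "Scheduled Arrival Date"): "scheduled_arrival_date",
    ("Airport3.Arrivals", "Scheduled"): "scheduled_arrival_date",
    # Same name, different semantics (hypothetical source tables).
    ("SourceA.Flights", "Arrival Time"): "landing_time",
    ("SourceB.Flights", "Arrival Time"): "gate_arrival_time",
}


def align_record(source_table, record, mapping=ATTRIBUTE_MAPPING):
    """Rename a record's attributes to mediated names.

    Returns two dicts: the aligned attributes, and the attributes for which
    no correspondence is known (left under their original names).
    """
    aligned, unmapped = {}, {}
    for attribute, value in record.items():
        target = mapping.get((source_table, attribute))
        if target is None:
            unmapped[attribute] = value
        else:
            aligned[target] = value
    return aligned, unmapped


if __name__ == "__main__":
    # Illustrative values only; they do not come from the source text.
    row = {"Scheduled": "2013-12-21", "Gate": "B7"}
    print(align_record("Airport3.Arrivals", row))
```

In practice such a mapping would be produced by a schema matching step instead of being written by hand; the sketch only shows what the aligned result looks like once the correspondences exist.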
Copyright information
© 2015 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dong, X.L., Srivastava, D. (2015). Schema Alignment. In: Big Data Integration. Synthesis Lectures on Data Management. Springer, Cham. https://doi.org/10.1007/978-3-031-01853-4_2
DOI: https://doi.org/10.1007/978-3-031-01853-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-00725-5
Online ISBN: 978-3-031-01853-4
eBook Packages: Synthesis Collection of Technology (R0), eBColl Synthesis Collection 6