Mapping Conceptual Models to Database Schemas

Embley, David W.; Mok, Wai Yin

doi:10.1007/978-3-642-15865-0_5

David W. Embley^nAff1 &
Wai Yin Mok^nAff2

2390 Accesses
9 Citations

Abstract

From the beginning, a primary objective of conceptual modeling has been to generate good database schemas. This chapter surveys and explains the principles and practices of algorithmically mapping conceptual models to database schemas. An important unifying theme is that the underlying principles are independent of conceptual-modeling languages and notation. Although illustrated mainly in terms of the entity-relationship model, the chapter explains and illustrates the application of the mapping principles to extended entity-relationship models, the unified modeling language, and generic conceptual-model hypergraphs. Besides explaining conceptual-model-independent mapping rules, the chapter also addresses normalization issues, explaining both the map-then-normalize approach and the normalize-then-map approach to schema normalization. In addition to mapping conceptual models to flat relations for standard relational databases, the chapter also shows how to map conceptual models to nested relations applicable for object-based and XML storage structures.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 129.00; Price excludes VAT (USA)

Softcover Book: USD 169.99; Price excludes VAT (USA)

Hardcover Book: USD 169.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Here and throughout the chapter “composite key” always designates a minimal key, so that if any of the attributes of the key is removed, the remaining attribute(s) no longer provide the unique identification property of a key.
2.
Unless otherwise explicitly stated, the first key listed in a relational schema is the primary key – RoomNr in this example.
3.
We refer the interested reader to the additional readings in Sect. 5.7 for these esoteric mappings.
4.
Here we make use of the common set notation in the relational database literature that lets a sequence of attribute names designate a set. Thus, {GuestNr Description} is a key set with a single composite key consisting of two attributes, whereas {RoomNr, RoomName} is a key set with two keys.
5.
Note that although Name and Address functionally depend on RoomNr and also on RoomName, they do not directly functionally depend on either RoomNr or RoomName because they functionally depend on GuestNr, which is not a key attribute for the key set {RoomNr, RoomName} for which we are building a relational schema.
6.
Note that Duration functionally depends on Description, but since Description is a proper subset of GuestNr Description, we exclude Duration. Similarly, we exclude Name and Address since GuestNr is a proper subset of GuestNr Description.
7.
There are many ER variants, and some have conventions to designate composite keys for entity sets (e.g., a connecting line among underlined attributes of an entity set).
8.
Whether we should keep a relation for Room here is an interesting question. Observe that its data may be completely recorded in the relation HasReservationFor. If so, we can discard it. In our application, however, it is possible (even likely) that there is no current reservation for some of the rooms. Thus, to preserve all room numbers in our database, we keep it.
9.
Note, by the way, that the entity set Reservation is not weak, even though it is certainly a relationship set we view as an entity set. When we turned it into an entity set, we gave it a key, GuaranteeNr, so that it did not become a weak entity set.
10.
Location is meant to be a simple city or town or other designated place. Several bed-and-breakfast establishments can be in the same location (e.g., Boston), but each establishment in the same location must have a different name. Thus, Name Location \( \rightarrow \) BedAndBreakfast.
11.
Typically, as in our examples here, \( W \) has only one key. But, for example, if we also had RoomName for the weak entity set Room, we would have a second key for Room, namely, Name Location RoomName.
12.
In this context, we call subset entity sets “specialization entity sets ” or just “specializations” and superset entity sets “generalization entity sets” or just “generalizations.”
13.
Since all specializations in an ISA hierarchy are subsets of their generalizations, entities in the specializations inherit their identity from their generalization(s). In most common cases there is only one generalization. When a specialization has more than one generalization, it inherits its identity from all generalizations. Often, however, all generalizations have the same identifying attribute inherited from some root generalization. ReturningGuest in Fig. 5.5 a inherits its identity from both CurrentGuest and FutureGuest, but these entity sets, in turn, both inherit their identity from Guest. Thus, GuestNr becomes the one and only key attribute that identifies returning guests.
14.
Although rare, if there are multiple roots, we collapse the hierarchy to all roots. Any entity set that is the specialization of multiple roots collapses to all of them.
15.
If there are multiple roots, the leaves inherit from all roots.
16.
UML does not use underlines to denote keys for classes; rather it uses underlines to denote static attributes – attributes that belong to classes, not attributes applicable to instances of classes.
17.
We could have illustrated the derivation of SQL DDL for all earlier generated schemas as well as this one, but we only illustrate this derivation once.
18.
Many argue that if conceptual-model diagrams are not canonical, they are not good quality diagrams.
19.
Any ISA hierarchies remain intact without alteration.
20.
Some conceptual models (e.g., ORM [5.9] and OSM [5.8]) are directly based on hypergraphs and already include all connections among attributes. These conceptual models need no transformation to hypergraphs. For these hypergraph-based conceptual models, the canonicalization and mappings to relational schemas proceed as we describe here.
21.
Usually these three main steps are enough. Exceptions arise when (1) the hypergraph is cyclic after redundancies have been removed and (2) optional participation interferes with our ability to capture all element values. An example of the first would be an additional edge between Description and Duration where one means the average duration for an activity and the other means the maximum allowable duration. In this case, we need to qualify Duration to be, say, AveDuration and MaxDuration, and generate the relational schema for Activity with three attributes, Duration, AveDuration, and MaxDuration. An example of the second would be optional participation for Name where we might want to keep all names of guests even if they have no guest number. In this case, we need a separate table for Name alone since we cannot capture all names in the Guest relational schema, which demands a GuestNr for every Guest. We can resolve both of these issues by adding roles. See [5.4] for details.
22.
To keep the various nodes straight, we should add comments to the diagram. Unless these comments help us keep the nodes conceptually straight and may help us choose names for relational schemas, we can ignore these comments.
23.
The algorithm is a simplified version of the heuristic \( n \)-ary algorithm in [5.12], which generates scheme trees from a canonical hypergraph.

References

Batini C, Ceri S, Navathe SB (1992) Conceptual database design: an entity-relationship approach. Benjamin/Cummings, Redwood City
MATH Google Scholar
Chen PP (1976) The entity-relationship model – toward a unified view of data. Trans Database Syst 1(1):9–36
Article Google Scholar
Chung I, Nakamura F, Chen PP (1981) A decomposition of relations using the entity-relationship approach. In: Proceedings of the 2nd international conference on entity-relationship approach to information modeling and analysis (ER’81), Washington, October 1981, pp 149–171
Google Scholar
Embley DW (1998) Object database development: concepts and principles. Addison-Wesley, Reading
Google Scholar
Embley DW, Ling TW (1989) Synergistic database design with an extended entity-relationship model. In: Proceedings of the 8th international conference on entity-relationship approach modeling (ER’89), Toronto, October 1989, pp 118–135
Google Scholar
Embley DW, Mok WY (2001) Developing XML documents with guaranteed ‘good’ properties. In: Proceedings of the 20th international conference on conceptual modeling (ER2001), Yokohama, Japan, November 2001, pp 426–441
Google Scholar
Embley DW, Mok WY (2003) Producing XML documents with guaranteed ‘good’ properties. In: Proceedings of the 7th world multiconference on systemics, cybernetics and informatics (SCI 2003), vol IX, Orlando, FL, July 2003, pp 195–198
Google Scholar
Embley DW, Kurtz BD, Woodfield SN (1992) Object-oriented systems analysis: a model-driven approach. Prentice Hall, Englewood Cliffs
Google Scholar
Halpin T (1995) Conceptual schema and relational database design, 2nd edn. Prentice Hall, Sydney, Australia
Google Scholar
Ling TW (1985) A normal form for entity-relationship diagrams. In: Proceedings of the 4th international conference on entity-relationship approach (ER’85), Chicago, October 1985, pp 24–35
Google Scholar
Elmasri R, Navathe SB (2004) Fundamentals of database systems, 4th edn. Addison-Wesley, Boston
MATH Google Scholar
Mok WY, Embley DW (2006) Generating compact redundancy-free XML documents from concptual-model hypergraphs. Trans Knowl Data Eng 18(8):1082–1096
Article Google Scholar
Silberschatz A, Korth H, Sudarshan S (2002) Database system concepts, 5th edn. McGraw-Hill, Boston
Google Scholar
Teorey TJ, Yang D, Fry JP (1986) A logical design methodology for relational databases using the extended entity-relationship model. ACM Comput Surv 18(2):197–222
Article MATH Google Scholar
Ullman JD, Widom J (2002) A first course in database systems, 2nd edn. Prentice Hall, Upper Saddle River
Google Scholar

Download references

Author information

David W. Embley
Present address: Department of Computer Science, Brigham Young University, 84602, Provo, Utah, USA
Wai Yin Mok
Present address: University of Alabama in Huntsville, Huntsville, USA

Authors and Affiliations

Authors

David W. Embley
View author publications
You can also search for this author in PubMed Google Scholar
Wai Yin Mok
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to David W. Embley .

Editor information

Editors and Affiliations

, Department of Computer Science, Brigham Young University, Provo, 84602, USA
David W. Embley
Institut für Informatik, Christian-Albrechts-Universität zu Kiel, Olshausenstr. 40, Kiel, 24098, Germany
Bernhard Thalheim

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Embley, D., Mok, W. (2011). Mapping Conceptual Models to Database Schemas. In: Embley, D., Thalheim, B. (eds) Handbook of Conceptual Modeling. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15865-0_5

Download citation

DOI: https://doi.org/10.1007/978-3-642-15865-0_5
Published: 13 March 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15864-3
Online ISBN: 978-3-642-15865-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics