COP 3540 - Introduction to Database Structures
DBMS Design, ER Model
Database design
The database design process can be divided into six
steps.
1. Requirement Analysis
2. Conceptual Database Design: ER model
3. Logical Database Design: logical schema
4. Schema refinement: Normalization
5. Physical Database Design
6. Application and Security Design
Database design
Table / Relation / File
Overview of Database design
Conceptual design: (ER Model is used at this stage.)
• What are the entities and relationships in the
enterprise?
• What information about these entities and relationships
should we store in the database?
• What are the integrity constraints or business rules that
hold?
• A database ‘schema’ in the ER Model can be
represented pictorially (ER diagrams).
• An ER diagram can be mapped into a relational schema.
Data Model
A data model is a collection of conceptual tools for
describing data, data relationships, data semantics, and
consistency constraints.
• The entity-relationship (E-R) model is a high-level
data model. It is based on a perception of a real world
that consists of a collection of basic objects, called
entities, and of relationships among these objects.
• The relational model is a lower-level model. It uses a
collection of tables to represent both data and the
relationships among those data.
ER Model
The entity-relationship (E-R) data model perceives the
real world as consisting of basic objects, called entities,
and relationships among these objects.
The E-R data model is one of several semantic data
models; the semantic aspect of the model lies in its
representation of the meaning of the data.
ER Model
The E-R data model employs three basic notions:
1) Entity Sets
2) Attributes, and
3) Relationship Sets.
Entity
An entity is a “thing” or “object” in the real world that is
distinguishable from all other objects. For example, each
person in an enterprise is an entity.
An entity has a set of properties, and the values for
some set of properties may uniquely identify an entity.
For instance, a person may have a person-id property
whose value uniquely identifies that person.
An entity set is a set of entities of the same type that
share the same properties, or attributes.
Entity
An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each
member of an entity set. For example, possible
attributes of the customer entity set are customer-id,
customer-name, customer-street, and customer-city.
Each entity has a value for each of its attributes.
For each attribute, there is a set of permitted values,
called the domain, or value set, of that attribute.
Entity
Example of entity sets
Customer
Entity
Example of entity sets
Employees
Attribute
An attribute of an entity set is a function that maps from
the entity set into a domain.
Since an entity set may have several attributes, each
entity can be described by a set of (attribute, data value)
pairs, one pair for each attribute of the entity set.
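The definition above can be sketched in a few lines of Python. This is only an illustration: the entity identifiers `e1` and `e2`, the sample values, and the `CITY_DOMAIN` set are assumptions, not part of the slides. Each attribute is modeled as a mapping from entities to values in its domain, and an entity is then described by its (attribute, value) pairs.

```python
# Each attribute is a function (here a dict) from entities to domain values.
# The entity ids e1/e2 and all values below are made up for illustration.
customer_name = {"e1": "Ana Diaz", "e2": "Bo Chen"}
customer_city = {"e1": "Miami", "e2": "Orlando"}

# Assumed domain (value set) for customer-city.
CITY_DOMAIN = {"Miami", "Orlando", "Tampa"}

attributes = {
    "customer-name": customer_name,
    "customer-city": customer_city,
}


def describe(entity):
    """Return the (attribute, value) pairs describing one entity."""
    return {(attr, mapping[entity]) for attr, mapping in attributes.items()}


def city_in_domain(entity):
    """Check that the entity's customer-city value lies in the domain."""
    return customer_city[entity] in CITY_DOMAIN
```

Calling `describe("e1")` yields one pair per attribute of the entity set, which is exactly the description the slide gives.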
Attribute
An attribute can be characterized by the following attribute
types:
1. Simple or composite attributes
2. Single-valued or multivalued attributes
3. Derived attribute
An attribute takes a null value when an entity does not have a
value for it. The null value may indicate “not applicable”—that
is, that the value does not exist for the entity. For example, one
may have no middle name. Null can also designate that an
attribute value is unknown. An unknown value may be either
missing (the value does exist, but we do not have that
information) or not known (we do not know whether or not the
value actually exists).
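As a rough sketch of these attribute types, the Python classes below model a hypothetical `Person` entity. The class and field names are assumptions chosen for illustration. `Name` is a composite attribute, `phone_numbers` is multivalued, `age` is derived from `date_of_birth`, and `None` stands in for a null middle name.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Name:
    """Composite attribute: built from component attributes."""
    first: str
    last: str
    middle: Optional[str] = None  # null: not applicable, or unknown


@dataclass
class Person:
    person_id: str                 # simple, single-valued
    name: Name                     # composite
    date_of_birth: date            # simple, single-valued
    phone_numbers: List[str] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, not stored."""
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years if had_birthday else years - 1


person = Person(
    "P1", Name("Ana", "Diaz"), date(2000, 6, 15), ["555-0100", "555-0101"]
)
```

Note that a single `None` cannot distinguish "not applicable" from "unknown"; a design that needs that distinction has to record it separately.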
Example
Attribute
Example of composite attributes
Composite Attribute
Keys
A superkey is a set of one or more attributes that,
taken collectively, allow us to identify uniquely an
entity in the entity set.
A candidate key is a minimal superkey: a superkey
from which no attribute can be removed without
losing the ability to identify entities uniquely.
A primary key denotes a candidate key that is
chosen by the database designer as the principal
means of identifying entities within an entity set.
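The difference between a superkey and a candidate key can be checked mechanically on sample data. The sketch below is a minimal illustration with made-up rows; the function names are hypothetical. Keep in mind that a key is a constraint on every legal instance, so a test against one instance can only show that an attribute set is not a key, never prove that it is one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, all_attrs):
    """Return the minimal superkeys for this particular set of rows."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A superset of a key already found is not minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Made-up sample instance of Employees.
employees = [
    {"ssn": "111", "name": "Ana", "lot": 12},
    {"ssn": "222", "name": "Bo", "lot": 12},
    {"ssn": "333", "name": "Ana", "lot": 48},
]
keys = candidate_keys(employees, ["ssn", "name", "lot"])
```

On this sample, `(name, lot)` also comes out as minimal and unique, which shows why the designer, not the data, decides what the keys are: nothing prevents two employees named Ana from sharing a lot in the future.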
Examples
Relationship Sets
A relationship is an association among several entities.
Example of relationship
Relationship Sets
Relationships are described in terms of degree,
connectivity, and existence.
Degree of a Relationship
The degree of a relationship is the number of
entity types that participate in that relationship.
• Binary relationship: a relationship between
two types of entities.
• Unary relationship: a relationship between
occurrences of the same entity type.
• Ternary relationship: a relationship directly
involving three entity types.
Degree of a Relationship
Example of a ternary relationship
Degree of a Relationship
Example of a unary relationship
Constraints
1) Mapping Cardinalities
2) Participation Constraints
Mapping Cardinalities
Mapping Cardinalities: For a binary relationship set R between
entity sets A and B, the mapping cardinality must be one of the
following:
One to One
One to Many
Many to One
Many to Many
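To make the four cases concrete, the sketch below classifies one instance of a binary relationship given as (a, b) pairs. The function name and sample data are assumptions for illustration. It follows the usual convention that "one to many" from A to B means an A entity may relate to many B entities while each B entity relates to at most one A entity. A mapping cardinality is a constraint on all instances, so this only reports what a single instance exhibits.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_to_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_to_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_to_many and b_to_many:
        return "many-to-many"
    if a_to_many:
        return "one-to-many"   # one A entity relates to many B entities
    if b_to_many:
        return "many-to-one"   # many A entities relate to one B entity
    return "one-to-one"


# (employee, department) pairs; the ids are made up.
works_in = [("e1", "d1"), ("e1", "d2"), ("e2", "d1")]
manages = [("e1", "d1"), ("e1", "d2")]
```

Here `works_in` is many-to-many, while `manages` is one-to-many from employees to departments: an employee may manage several departments, but no department has two managers.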
Key Constraints
• Consider Works_In: An employee can work in
many departments; a dept can have many
employees.
Key Constraints
• In contrast, each dept has at most one manager
(shown by an arrow line), according to the key
constraint on Manages.
[Figure: ER diagram of the relationship set Manages (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
Mapping Cardinalities
Each department has at most one (i.e., arrow line) manager.
Mapping Cardinalities
Many to Many
Mapping Cardinalities
One to Many
Mapping Cardinalities
Many to One
Mapping Cardinalities
One to One
Mapping Cardinalities
Each employee works in at most one (i.e., arrow line)
department and at a single location.
Constraints
1) Mapping Cardinalities
2) Participation Constraints
Participation Constraints
Total participation: The participation of an entity set E in a
relationship set R is said to be total if every entity in E participates
in at least one relationship in R (drawn as a thick line).
Partial participation: The participation of entity set E in
relationship set R is said to be partial if only some entities in E
participate in relationships in R.
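A small check makes the distinction concrete. In this sketch the function name, the `position` parameter, and the sample data are assumptions for illustration; the relationship is a list of (employee, department) pairs.

```python
def participation(entity_set, relationship, position):
    """Return "total" if every entity appears in the relationship.

    position selects which component of each pair belongs to entity_set.
    """
    participating = {pair[position] for pair in relationship}
    return "total" if set(entity_set) <= participating else "partial"


# Made-up instance: every department has a manager,
# but not every employee manages a department.
employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}
manages = [("e1", "d1"), ("e2", "d2")]

dept_participation = participation(departments, manages, 1)
emp_participation = participation(employees, manages, 0)
```

With this data, Departments participates totally in Manages and Employees only partially, because `e3` manages nothing.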
Participation Constraints
A participation constraint specifies whether the existence of an
entity depends on its being related to another entity.
Existence Dependency
If the existence of an entity x depends on the existence of
another entity y, then x is said to be existence dependent on y.
Weak entity type: An entity type that does not have key attributes
of its own is called a weak entity type.
A weak entity type always has a total participation constraint
with respect to its identifying relationship because a weak entity
cannot be identified without an owner entity.
Partial Key
Identifying relationship
Existence Dependency
A weak entity can be identified uniquely only by considering the
primary key of another (owner) entity.
• Owner entity set and weak entity set must participate in a
one-to-many relationship set (one owner, many weak
entities).
• Weak entity set must have total participation in this
identifying relationship set.
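[Figure: ER diagram of the weak entity set Dependents (partial key pname, age) connected to its owner Employees (ssn, name, lot) by the identifying relationship Policy (cost)]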
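One common way to carry a weak entity set into tables is to combine it with its identifying relationship and use the owner's key as part of the primary key. The sketch below does this with Python's built-in `sqlite3` module. The table name `Dep_Policy`, the column types, and the sample rows are assumptions for illustration, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employees (
        ssn  TEXT PRIMARY KEY,
        name TEXT,
        lot  INTEGER
    );
    -- Weak entity set Dependents merged with its identifying
    -- relationship Policy.
    CREATE TABLE Dep_Policy (
        pname TEXT,            -- partial key
        age   INTEGER,
        cost  REAL,
        ssn   TEXT NOT NULL,   -- key of the owner entity
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE CASCADE
    );
""")

conn.execute("INSERT INTO Employees VALUES ('111', 'Ana', 12)")
conn.execute("INSERT INTO Employees VALUES ('222', 'Bo', 48)")
# Two dependents share a pname but have different owners.
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 7, 100.0, '111')")
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 9, 120.0, '222')")

# Deleting an owner removes its dependents (existence dependency).
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
```

The `NOT NULL` foreign key reflects total participation, the composite primary key reflects that `pname` identifies a dependent only together with the owner's `ssn`, and `ON DELETE CASCADE` reflects the existence dependency.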
ISA (‘is a’) Hierarchies
• As in C++ or other programming languages, attributes are inherited.
• If we declare A ISA B, every A entity is also considered to be a
B entity.
• Overlap constraints: Can Joe be an Hourly_Emps as well as
a Contract_Emps entity? (Allowed/disallowed)
• Covering constraints: Does every Employees entity also
have to be an Hourly_Emps or a Contract_Emps entity?
(Yes/no)
• Reasons for using ISA:
  - To add descriptive attributes specific to a subclass.
  - To identify entities that participate in a relationship.
Generalization / Specialization
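The analogy between ISA hierarchies and programming-language inheritance can be sketched with Python classes. The subclass attributes `hourly_wages` and `contractid` and all sample values are assumptions for illustration. The two helper functions test the overlap and covering constraints on a given set of entities, identified by ssn.

```python
class Employees:
    def __init__(self, ssn, name, lot):
        self.ssn, self.name, self.lot = ssn, name, lot


class Hourly_Emps(Employees):
    def __init__(self, ssn, name, lot, hourly_wages):
        super().__init__(ssn, name, lot)   # inherited attributes
        self.hourly_wages = hourly_wages   # subclass-specific (assumed)


class Contract_Emps(Employees):
    def __init__(self, ssn, name, lot, contractid):
        super().__init__(ssn, name, lot)   # inherited attributes
        self.contractid = contractid       # subclass-specific (assumed)


def overlap(sub_a, sub_b):
    """Entities (by ssn) that belong to both subclasses."""
    return {e.ssn for e in sub_a} & {e.ssn for e in sub_b}


def is_covering(superclass, *subclasses):
    """True if every superclass entity appears in some subclass."""
    covered = {e.ssn for sub in subclasses for e in sub}
    return {e.ssn for e in superclass} <= covered


joe_hourly = Hourly_Emps("111", "Joe", 12, 20.0)
joe_contract = Contract_Emps("111", "Joe", 12, "K-7")
ana = Employees("222", "Ana", 48)

employees = [joe_hourly, ana]
hourly = [joe_hourly]
contract = [joe_contract]
```

In this data Joe appears in both subclasses, so the sample is legal only if overlap is allowed, and Ana belongs to neither subclass, so the subclasses do not cover Employees.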
Aggregation
• Used when we have to model a relationship involving (entity
sets and) a relationship set.
• Aggregation allows us to treat a relationship set as an entity
set for purposes of participation in (other) relationships.
• Aggregation vs. ternary relationship:
  - Monitors is a distinct relationship, with a descriptive
    attribute.
  - Also, can say that each sponsorship is monitored by at most
    one employee.
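The constraint that each sponsorship is monitored by at most one employee is easy to state once the sponsorship is treated as a unit. The sketch below shows one possible table layout using Python's `sqlite3` module. The table Sponsors, the columns `did`, `pid`, `since`, and `until`, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The aggregated relationship set: a department sponsors a project.
    CREATE TABLE Sponsors (
        did   INTEGER,
        pid   INTEGER,
        since TEXT,
        PRIMARY KEY (did, pid)
    );
    -- Monitors relates an employee to a whole sponsorship.
    CREATE TABLE Monitors (
        ssn   TEXT NOT NULL,
        did   INTEGER,
        pid   INTEGER,
        until TEXT,               -- descriptive attribute of Monitors
        PRIMARY KEY (did, pid),   -- at most one monitor per sponsorship
        FOREIGN KEY (did, pid) REFERENCES Sponsors(did, pid)
    );
""")

conn.execute("INSERT INTO Sponsors VALUES (1, 10, '2024-01-01')")
conn.execute("INSERT INTO Monitors VALUES ('111', 1, 10, '2024-12-31')")
try:
    conn.execute("INSERT INTO Monitors VALUES ('222', 1, 10, '2025-06-30')")
    second_monitor_allowed = True
except sqlite3.IntegrityError:
    second_monitor_allowed = False
```

Because `(did, pid)` is the primary key of Monitors, a second monitor for the same sponsorship is rejected. A single ternary relationship over employees, departments, and projects would not express this as directly.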
Aggregation
Conceptual Design Using the ER Model
• Design choices:
  - Should a concept be modeled as an entity or an attribute?
  - Should a concept be modeled as an entity or a relationship?
  - Identifying relationships: Binary or ternary? Aggregation?
• Constraints in the ER Model:
  - A lot of data semantics can (and should) be captured.
  - But some constraints cannot be captured in ER diagrams.
Entity vs. Attribute
• Should address be an attribute of Employees or an entity
(connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
  - If we have several addresses per employee, address must
    be an entity (since attributes cannot be set-valued).
  - If the structure (city, street, etc.) is important, e.g., we want
    to retrieve employees in a given city, address must be
    modeled as an entity (since attribute values are atomic).
Example 1
• Works_In2 does not allow an employee to work in a
department for two or more periods.
Example 2
• Similar to the problem of wanting to record several
addresses for an employee: We want to record
several values of the descriptive attributes for each
instance of this relationship. Accomplished by
introducing a new entity set, Duration.
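The limitation of Works_In2 and the effect of adding Duration can be seen in a few lines of Python. The period values, the employee and department ids, and the name `works_in3` are assumptions for illustration. A relationship is identified by the entities it connects, so with only Employees and Departments participating there is room for just one period per pair.

```python
# Works_In2: identified by (employee, department); the period is a
# descriptive attribute, so there is one value per pair.
works_in2 = {}
works_in2[("111", 51)] = ("2020-01", "2021-06")
works_in2[("111", 51)] = ("2023-02", "2024-01")  # replaces the first period

# With Duration as a third participating entity set, the period is part
# of what identifies each relationship instance.
works_in3 = set()
works_in3.add(("111", 51, ("2020-01", "2021-06")))
works_in3.add(("111", 51, ("2023-02", "2024-01")))
```

After these statements `works_in2` holds only the later period for employee 111 in department 51, while `works_in3` keeps both.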
Example 3
• First ER diagram OK if a manager gets a separate
discretionary budget for each dept.
• What if a manager gets a discretionary budget that
covers all managed depts?
  - Redundancy: dbudget stored for each dept managed by
    manager.
  - Misleading: Suggests dbudget associated with
    department-mgr combination.
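The redundancy problem can be illustrated with plain Python data. All values and the names `managers` and `manages2` are assumptions for illustration. When dbudget is an attribute of the relationship, it is repeated once per managed department, and an update that misses one copy leaves the data inconsistent.

```python
# First design: dbudget is stored on every Manages instance.
manages = [
    {"ssn": "111", "did": 51, "since": "2021", "dbudget": 50000},
    {"ssn": "111", "did": 56, "since": "2022", "dbudget": 50000},
]

# Updating only one copy leaves two different budgets for one manager.
manages[0]["dbudget"] = 60000
budgets_for_111 = {row["dbudget"] for row in manages if row["ssn"] == "111"}

# Alternative: store dbudget once per manager, outside the relationship.
managers = {"111": {"dbudget": 60000}}
manages2 = [
    {"ssn": "111", "did": 51, "since": "2021"},
    {"ssn": "111", "did": 56, "since": "2022"},
]
```

In the second layout the budget has a single home, so it cannot disagree with itself, and it is clearly a property of the manager and not of a manager-department pair.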
Example 4
Example 5
[Figure: ER diagram in which Managers ISA Employees (ssn, name, lot) and has attribute dbudget; Manages2 (since) relates Managers to Departments (did, dname, budget)]
Example 6 (Binary vs. Ternary Relationships)
• If each policy is owned by just one employee, and each
dependent is tied to the covering policy, the first diagram
is inaccurate.
Example 7 (Binary vs. Ternary Relationships)
• What are the additional constraints in this diagram?
Example 8
Summary of Conceptual Design
• Conceptual design follows requirements analysis.
  - Yields a high-level description of data to be stored.
• ER model popular for conceptual design.
  - Constructs are expressive, close to the way
    people think about their applications.
• Basic constructs: entities, relationships, and attributes
(of entities and relationships).
• Some additional constructs: weak entities, ISA
hierarchies, and aggregation.
• Note: There are many variations on the ER model.
Summary of ER (Contd.)
• Several kinds of integrity constraints can be
expressed in the ER model: key constraints,
participation constraints, and overlap/covering
constraints for ISA hierarchies. Some foreign key
constraints are also implicit in the definition of a
relationship set.
• Some constraints (notably, functional
dependencies) cannot be expressed in the ER
model.
• Constraints play an important role in determining
the best database design for an enterprise.
Summary of ER
• ER design is subjective. There are often many ways
to model a given scenario! Analyzing alternatives can
be tricky, especially for a large enterprise. Common
choices include:
  - Entity vs. attribute, entity vs. relationship, binary or
    n-ary relationship, whether or not to use ISA
    hierarchies, and whether or not to use
    aggregation.
• Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are
especially useful.