Database Systems
236363
Entity Relationship Diagrams
Entity Relationship (ER) Model
• The Entity Relationship (ER) model lets us analyze the system’s
requirements and represent the types of entities we would like to
store data about, as well as the relationships between these entities
and various constraints on them
• Entity Relationship Diagram (ERD) provides a visualization through a
diagram of the system being represented in the model. The diagram
facilitates the process of defining the logical level of the data
model, e.g., defining the schemas of the relational database. As
such, it serves as a good middleman between the logical layer and
the reality as we grasp it
• Being theoretical and general, it has some limitations. For example,
it does not detail possible values for entities’ attributes
Notations
• There are several approaches to drawing the
diagrams, yet the principles are the same
• We will use the one covered in the textbook by
Ullman and Widom, A First Course in Database
Systems
Attribute
• An attribute is the most basic unit of information
representing a data item of a given type
• When representing the set of all entities of a given
type as a table, each attribute corresponds to a
column
• Notation: an ellipse with the attribute name inside it
[Figure: two attribute ellipses, “Name” and “ID num”]
Entity
• An object of a given type that we would like to store
in our system
• In a table or a file, an entity is usually represented as
a record (or a row)
• For example, in a movies database, there will
probably be an entity for the movie “Die Hard”
• Notice: ERDs do not include entities. Why?
Entity Set
• A set of entities of a given type
– For example, movie actors
• Each entity set is associated with attributes that
define and describe specific entities of this type
• In a table representation, the entity set would be the
table’s name. The attributes associated with these
entities would be the columns’ names
– An entity set cannot have two attributes with the same
name
Entity Set - Notation
• Drawn as a rectangle
– The corresponding attributes are drawn as ellipses
connected to it by a solid line
– Each attribute in the diagram should be connected either
to a single entity set or to a single relationship set (defined
shortly)
[Figure: entity set “Movie actor” (rectangle) with attributes Name, Birth date, Photo (ellipses)]
• The subset of underlined attributes makes up the
primary key
• In an ER diagram, each rectangle represents a single
entity set – each should have a unique name
Compound Attributes
• A compound attribute consists of several
more basic attributes
• In the diagram, sub-attributes are connected
to the compound one
[Figure: compound attribute “Date” with sub-attributes Day, Month, Year]
Multiple Value Attributes
• A multiple value attribute may store multiple
values for the same attribute
• Drawn as a double ellipse
– E.g., we can store multiple photos for each actor
[Figure: entity set “Movie actor” with attributes Name, Birth date, and the multiple value attribute Photo (double ellipse)]
Relationship
• Represents a relationship between entities
– For example, the relationship between actress Natalie Portman
and the movie Léon represents the fact that she played in that
movie
• A relationship can be
– Binary – between two entities
– Ternary – between three entities
– …
– n-ary – between n entities
• Typically, a relationship is represented in a table storing the
primary keys of the corresponding entities
• Do we also include relationships in the diagram?
Relationship Set
• Represents the relationship between two or more entity sets
• Drawn as a rhombus connected to the participating entity sets; the
name of the relationship set is written inside the rhombus
[Figure: relationship set “Played in” connecting “Movie actor” (Name, Birth date, Photo) and “Movie” (Movie Name, Year, Genre)]
• In this example, one or more actors can play in a given movie, and
the same actor can play in multiple movies. Obviously, not every
actor plays in every movie. Some movies (e.g., cartoons) may have
no actors at all, and some actors might not play in any movie
Relationship Set’s Attributes
• A relationship set may have attributes
– In this case, each relationship of this set must include the
corresponding attribute
• In the following example, we will know the role played by
each actor in each movie
[Figure: “Movie actor” (Name, Birth date, Photo) and “Movie” (Movie Name, Year, Genre) connected by “Played in”, which has the attribute Role]
• In the database schema, such a relationship set is represented as a
table consisting of the primary keys of the corresponding entity
sets plus the attributes of the relationship set
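As a rough sketch of this translation, the snippet below creates the three tables with Python’s built-in sqlite3 module. The table and column names, the column types, and the choice of Name and (Movie Name, Year) as primary keys are assumptions made for the example; the diagram does not say which attributes are underlined. The Photo attribute is left out for brevity.

```python
import sqlite3

# In-memory database; all names and key choices below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie_actor (
    name       TEXT PRIMARY KEY,
    birth_date TEXT
);
CREATE TABLE movie (
    movie_name TEXT,
    year       INTEGER,
    genre      TEXT,
    PRIMARY KEY (movie_name, year)
);
-- The relationship set: keys of both entity sets plus its own attribute.
CREATE TABLE played_in (
    actor_name TEXT,
    movie_name TEXT,
    year       INTEGER,
    role       TEXT,
    PRIMARY KEY (actor_name, movie_name, year),
    FOREIGN KEY (actor_name) REFERENCES movie_actor (name),
    FOREIGN KEY (movie_name, year) REFERENCES movie (movie_name, year)
);
""")

# One entity per entity set and one relationship between them.
conn.execute("INSERT INTO movie_actor VALUES ('Natalie Portman', NULL)")
conn.execute("INSERT INTO movie VALUES ('Léon', 1994, NULL)")
conn.execute(
    "INSERT INTO played_in VALUES ('Natalie Portman', 'Léon', 1994, 'Mathilda')"
)

rows = conn.execute(
    "SELECT actor_name, movie_name, role FROM played_in"
).fetchall()
print(rows)
```

Because role is not part of the primary key in this sketch, the table allows at most one row per actor and movie.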
Primary Attributes of Relationship Sets
• In some situations, two (or more) entities might be related by
multiple relationships of the same set, each identified by
different attribute values
– For example, the same actor might have multiple roles in
the same movie
• Here, some of the attributes are defined as primary, and must
be unique for each specific relationship
[Figure: “Movie actor” (Name, Birth date, Photo) and “Movie” (Movie Name, Year, Genre) connected by “Played in”, whose attribute Role is marked as primary]
Full Participation
• Full participation requires each entity of a given entity
set to have at least one relationship with at least
one entity of another entity set
– This is represented by a double connecting line
[Figure: “Movie actor” (Name, Birth date, Photo) connected to “Played in” (Role) by a double line; “Played in” is also connected to “Movie” (Movie Name, Year, Genre)]
• In this example, each stored actor must play in at
least one movie
– No more wannabes…
Relationship Multiplicity
• For a given relationship set, we can limit the number
of entities of a given entity set to (at most) one
– This is represented in the diagram as an arrow
[Figure: relationship set “Star” (Role) between “Movie actor” (Name, Birth date, Photo) and “Movie” (Movie Name, Year, Genre), with an arrow on the “Movie actor” side]
• In this example, there can be at most one star for
each movie
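One way to see what the arrow means for the resulting table: if each movie has at most one star, the movie’s key alone identifies a row of the relationship table. The sketch below uses Python’s built-in sqlite3 module; the table name, column names, and the assumption that (Movie Name, Year) is the key of “Movie” are illustrative choices, not taken from the diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: the key of the relationship table is only the movie's key,
# so a second star for the same movie violates the primary key.
conn.execute("""
CREATE TABLE star (
    movie_name TEXT,
    year       INTEGER,
    actor_name TEXT NOT NULL,
    role       TEXT,
    PRIMARY KEY (movie_name, year)
)
""")

conn.execute(
    "INSERT INTO star VALUES ('Die Hard', 1988, 'Bruce Willis', 'John McClane')"
)
try:
    conn.execute(
        "INSERT INTO star VALUES ('Die Hard', 1988, 'Alan Rickman', 'Hans Gruber')"
    )
    second_star_accepted = True
except sqlite3.IntegrityError:
    # The multiplicity constraint rejects a second star for the same movie.
    second_star_accepted = False

star_count = conn.execute("SELECT COUNT(*) FROM star").fetchone()[0]
print(second_star_accepted, star_count)
```

The same actor may still star in several movies, since actor_name is not part of the key.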
Restricting Relationship Multiplicity
• Many to many: relationship set R between Actor and Movie
• One to many: relationship set R between Employee and Boss
• One to one: relationship set R between President and Country
An Example: A Relationship Between Three Entities
[Figure: relationship set “Awarded” connecting the entity sets “Award” (Name, Prize), “Date” (Date), and “Movie” (Movie Name, Year, Genre)]
• Enables storing only awards that were awarded at least
once to a certain movie on a certain date
• An award cannot be given more than once to the same
movie on two different dates, or on the same date to two
different movies
Find the Differences…
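[Figure: two alternative diagrams over the entity sets “Course”, “Student”, and “Lecturer”, with the attribute labels ID no, Number, and Name. In the first, a single relationship set “Participated” connects all three entity sets; in the second, several separate “Participated” relationship sets connect them]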
Roles
• Sometimes, a relationship set is used to connect
the same entity set to itself
[Figure: entity set “Movie” (Movie Name, Year, Genre) connected twice to the relationship set “Sequel”, with the edges labeled Sequel and Origin]
• We distinguish between the roles by
writing their names on the connecting edges
• In this example, each movie can have only a
single origin, but multiple sequels
Aggregation
• Enables treating a relationship set as an entity set
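[Figure: “Movie actor” (Name, Birth date, Photo) and “Movie” (Movie Name, Year, Genre) connected by “Played in” (Role); the aggregation of “Played in” is connected to “Award” (Award Name) by the relationship set “Awarded”]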
• The attributes of an aggregation are the combination of all
attributes of the corresponding entity sets and relationship sets
• An aggregation is not represented in a table – it is derived from the
corresponding entity tables
Find the Differences…
[Figure: two variants of a diagram with “Movie actor” (Name, Birth date, Photo), “Movie” (Movie Name, Year, Genre), “Award” (Award Name), and the relationship sets “Played in” (Role) and “Awarded”]
A Weak Entity Set
• A weak entity set depends on another entity set(s)
– Its key is composed of its own key combined with the key(s) of the
entity set(s) it depends on
– Both the weak entity set and its corresponding relationship sets are
drawn in the diagram using double lines
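[Figure: “Chain” (Chain’s name), relationship set “C-S”, weak entity set “Store” (Store name), relationship set “S-D”, weak entity set “Department” (Department name)]
• What are the multiplicity constraints?
• What about full participation?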
ISA Relationship
• Enables representing entity sets that are subsets (sub-types) of
other entity sets through inclusion
• Included entity sets (subclasses) inherit all attributes and
relationships of the including set (superclass)
• Subclass entity sets inherit the key of the superclass entity set
• A subclass entity set may have additional attributes
and relationships of its own
• For every entity of a subclass entity set there is a corresponding
entity of the superclass entity set obtained by restricting its
attributes to those of the superclass entity set
ISA Relationship in the ER Diagram
• Marked as a triangle pointing to the superclass entity
set (in some books it is the opposite)
• Example
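[Figure: ISA example with the entity sets “Movie” (Movie Name, Year, Genre) and “Movie actor” (Actor Name, Birth date, Photo), the subclasses “Cartoon” (Animator) and “Nature Film” (Location), each attached through its own ISA triangle, and the relationship set “Voices” (Role)]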
Unique Inclusion
• If we would like each entity of the superclass entity
set to appear in at most one subclass entity set, we
use a single triangle
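[Figure: the entity sets “Movie” (Movie Name, Year, Genre) and “Movie actor” (Actor Name, Birth date, Photo), the subclasses “Cartoon” (Animator) and “Nature Film” (Location) attached through a single shared ISA triangle, and the relationship set “Voices” (Role)]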
Complete Inclusion
• If we would like each entity of the superclass entity set to
appear in at least one subclass entity set, we use
thick lines
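[Figure: the entity sets “Movie” (Movie Name, Year, Genre) and “Movie actor” (Actor Name, Birth date, Photo), the subclasses “Cartoon” (Animator) and “Nature Film” (Location) attached through an ISA triangle drawn with thick lines, and the relationship set “Voices” (Role)]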
Representing ISA in Tables
• Option I: Having a table for the superclass entity set.
In the table corresponding to each subclass entity
set, the only attribute(s) of the superclass entity set
is (are) the primary key
• Option II: Having a table for each subclass entity set.
Joins of tables from different subclass entity sets will
only consist of attributes of the superclass entity set
• When should we prefer one over the other?
Adding Constraints - Example
[Figure: entity set “Person” (ID no., Name) connected twice to the relationship set “Parenthood”, with the edges labeled Child and Parent]
• Can we refine this to better capture reality?
Adding Constraints - Example
[Figure: entity set “Person” (ID no., Name) connected to the relationship sets “Fatherhood” and “Motherhood”, each with edges labeled Child and Parent]
• Can we refine this to better capture reality?
Example
[Figure: “Father” and “Mother” as subclasses of “Person” (ID no., Name) via ISA triangles; the relationship sets “Fatherhood” and “Motherhood” each have edges labeled Child and Parent]
• Can we further refine this to better capture reality?
Summarizing Example
• Consider this diagram of train operations
[Figure: entity sets “Station”, “Line”, “Train”, and “Service”; relationship sets “Serves”, “Arrives”, and “Gives”; attribute labels S_Name, S_Type, Height, L_Num, Direction, Length, Km, T_Num, Days, Platform, A_Time, D_Time, T_Category, Food, Class]
What Tables do we Extract?
• What columns should exist for the relationship set “Serves”?
– The key S_Name (of the “Station” entity set)
– The key attributes L_Num and Direction (of “Line”)
– This triplet serves as the key for “Serves”
– In addition, a column for the relationship attribute Km
• What columns should exist for the relationship set “Arrives”?
– The key T_Num of the entity set “Train”
– The key attributes of the aggregated relationship set “Serves”,
i.e., S_Name, L_Num, and Direction
– The three attributes of the relationship set “Arrives” itself:
Platform, A_Time, D_Time
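The two answers above can be written down as table definitions. This is only a sketch using Python’s built-in sqlite3 module: the column types are guesses, and taking (T_Num, S_Name, L_Num, Direction) as the key of “Arrives” assumes the diagram places no multiplicity arrows on that relationship set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Serves: key of Station + key of Line, plus the relationship attribute Km.
CREATE TABLE serves (
    s_name    TEXT,
    l_num     INTEGER,
    direction TEXT,
    km        REAL,
    PRIMARY KEY (s_name, l_num, direction)
);
-- Arrives: key of Train + key of the aggregated Serves, plus its own attributes.
CREATE TABLE arrives (
    t_num     INTEGER,
    s_name    TEXT,
    l_num     INTEGER,
    direction TEXT,
    platform  INTEGER,
    a_time    TEXT,
    d_time    TEXT,
    PRIMARY KEY (t_num, s_name, l_num, direction),
    FOREIGN KEY (s_name, l_num, direction)
        REFERENCES serves (s_name, l_num, direction)
);
""")

# PRAGMA table_info returns one row per column; index 1 holds the column name.
serves_columns = [row[1] for row in conn.execute("PRAGMA table_info(serves)")]
arrives_columns = [row[1] for row in conn.execute("PRAGMA table_info(arrives)")]
print(serves_columns)
print(arrives_columns)
```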
Extracting Tables from ERD
A Table Example
      k1    k2    a1    a2
t1    foo   bar   baz   {x,y}
t2    quz   bar   foo   {y,z}
• k̅ = (k1,k2) – vector of all key attributes
• a̅ = (a1,a2) – vector of all non-key attributes
• t1 = (foo,bar,baz,{x,y})
• t2 = (quz,bar,foo,{y,z})
• t1[a1] = (baz)
• t2[a2] = ({y,z})
• t1[a̅] = (baz,{x,y})
• t2[k̅] = (quz,bar)
Entities
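[Figure: entity set E with key attributes k1,…,kn and non-key attributes a1,…,am]
• A short representation:
k̅ = (k1,…,kn)
a̅ = (a1,…,am)
[Figure: entity set E with the attribute vectors k̅ and a̅]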
Entities
[Figure: entity set E with the attribute vectors k̅ and a̅]
• Translation to a table:
– Columns: k1, …, kn (k̅), a1, …, am (a̅)
• Constraints:
t1[k̅] = t2[k̅] ⇒ t1[a̅] = t2[a̅]
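The constraint says that two rows agreeing on the key vector must agree on everything else. A minimal Python sketch of such a check follows; the function name and the representation of rows as dictionaries are assumptions made for illustration, and the row t3 is a hypothetical addition to the earlier table example.

```python
def satisfies_key_constraint(rows, key_cols, attr_cols):
    """Return True if rows that agree on key_cols also agree on attr_cols."""
    seen = {}  # key vector -> non-key vector first seen with that key
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        attrs = tuple(row[c] for c in attr_cols)
        # setdefault stores attrs on first sight, otherwise returns the stored one.
        if seen.setdefault(key, attrs) != attrs:
            return False
    return True


# Rows from the table example; the multi-valued a2 is stored as a frozenset.
t1 = {"k1": "foo", "k2": "bar", "a1": "baz", "a2": frozenset({"x", "y"})}
t2 = {"k1": "quz", "k2": "bar", "a1": "foo", "a2": frozenset({"y", "z"})}
# Hypothetical row: same key as t1 but different non-key values.
t3 = {"k1": "foo", "k2": "bar", "a1": "other", "a2": frozenset({"x"})}

ok = satisfies_key_constraint([t1, t2], ["k1", "k2"], ["a1", "a2"])
clash = satisfies_key_constraint([t1, t2, t3], ["k1", "k2"], ["a1", "a2"])
print(ok, clash)
```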
Entities
[Figure: entity set E with the attribute vectors k̅ and a̅]
• k̅, a̅ may be empty
• “No Key” – empty a̅ !
– Every attribute is a part of the key (underlined)
• What is the meaning of empty k̅ ?
Entities
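[Figure: entity set E with attributes k1,…,kn and a1,…,am, one of them drawn as a multi-valued attribute]
• Any ai (or ki) may be multi-valued
• In domain D, for a table row t:
– for an attribute ai: t[ai] ∈ D
– for a multi-valued attribute aj: t[aj] ∈ P(D),
where P(D) is the powerset of D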
Another Representation of a Multi-Valued Attribute
[Figure: entity set E with attributes k1,…,kn and a1,…,am, where a1 is multi-valued]
• A table without multi-valued attributes:
– Columns: k1, …, kn (k̅), a2, …, am
• Tables - one for each multi-valued attribute:
– Columns: k1, …, kn (k̅), a1
• Multi-valued attributes cannot be a part of the key
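A small Python sketch of this second representation: it moves one multi-valued attribute out of the main table into its own table keyed by the entity’s key. The function name, the dictionary-based rows, and the sample actors are all hypothetical.

```python
def split_multivalued(rows, key_cols, multi_col):
    """Split rows into a main table without multi_col and a (key, value) table."""
    main_table, value_table = [], []
    for row in rows:
        # Main table: every column except the multi-valued one.
        main_table.append({c: v for c, v in row.items() if c != multi_col})
        # Value table: one row per stored value, carrying the entity's key.
        for value in sorted(row[multi_col]):
            value_row = {c: row[c] for c in key_cols}
            value_row[multi_col] = value
            value_table.append(value_row)
    return main_table, value_table


# Hypothetical data: "photo" is the multi-valued attribute.
actors = [
    {"name": "Actor A", "birth_date": "1970-01-01", "photo": {"a1.jpg", "a2.jpg"}},
    {"name": "Actor B", "birth_date": "1980-02-02", "photo": set()},
]
main_table, photo_table = split_multivalued(actors, ["name"], "photo")
print(main_table)
print(photo_table)
```

An entity with no values, such as the second actor here, simply contributes no rows to the value table.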
Relationships
[Figure: relationship set R (k̅R, a̅R) connecting the entity sets E1 (k̅1, a̅1) and E2 (k̅2, a̅2)]
• Each k̅, a̅ may be empty
• Translation to a table:
– Columns: k̅1, k̅2, k̅R, a̅R
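Relationships
[Figure: relationship set R (k̅R, a̅R) connecting E1 (k̅1, a̅1) and E2 (k̅2, a̅2); the arrow to E1 is labeled 1 and the arrow to E2 is labeled 2]
• Constraints, for any two rows t1, t2 of R:
K1 ≔ t1[k̅1] = t2[k̅1]
K2 ≔ t1[k̅2] = t2[k̅2]
KR ≔ t1[k̅R] = t2[k̅R]
AR ≔ t1[a̅R] = t2[a̅R]
– K2∧KR→K1∧AR (1)
– K1∧KR→K2∧AR (2)
– (K2∧KR→K1∧AR)∧(K1∧KR→K2∧AR) (1+2)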
Relationships
[Figure: relationship set R (k̅R, a̅R) connecting E1 (k̅1, a̅1) and E2 (k̅2, a̅2)]
• Constraints:
– πk2(E2) ⊆ πk2(R) (every entity of E2 participates in R)
– πk2(R) ⊆ πk2(E2) (true for any relationship!)
Ternary Relationships
[Figure: relationship set R (k̅R, a̅R) connecting E1 (k̅1, a̅1), E2 (k̅2, a̅2), and E3; the arrow to E1 is labeled 1 and the arrow to E2 is labeled 2]
• Constraints, for any two rows t1, t2 of R:
K1 ≔ t1[k̅1] = t2[k̅1]
K2 ≔ t1[k̅2] = t2[k̅2]
K3 ≔ t1[k̅3] = t2[k̅3]
KR ≔ t1[k̅R] = t2[k̅R]
AR ≔ t1[a̅R] = t2[a̅R]
– K2∧K3∧KR→K1∧AR (1)
– K1∧K3∧KR→K2∧AR (2)
– (K2∧K3∧KR→K1∧AR)∧(K1∧K3∧KR→K2∧AR) (1+2)
• Each arrow is a layer and several arrows are a conjunction
(∧) of layers
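Roles
[Figure: entity set E1 (k̅, a̅) connected twice to the relationship set R (k̅R, a̅R), with the edges labeled role1 and role2]
• Translation to a table:
– Columns: k̅role1, k̅role2, k̅R, a̅R
• Constraints are the same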
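Aggregations
[Figure: relationship set R (k̅R, a̅R) connecting E1 (k̅1, a̅1) and E2 (k̅2, a̅2), aggregated into ER; relationship set S (k̅S, a̅S) connects ER to E3 (k̅3, a̅3)]
• Turns the relationship into an entity with
attributes of the relationship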
Aggregations
E1: k̅1, a̅1
E2: k̅2, a̅2
E3: k̅3, a̅3
R: k̅1, k̅2, k̅R, a̅R
S: k̅1, k̅2, k̅3, k̅R, k̅S, a̅S
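Weak Entities
[Figure: entity set E1 (k̅1, a̅1), weak entity set E2 (k̅2, a̅2) depending on E1, and weak entity set E3 (k̅3, a̅3) depending on E2]
Translation to tables:
E1: k̅1, a̅1
E2: k̅1, k̅2, a̅2
E3: k̅1, k̅2, k̅3, a̅3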
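Weak Entities
[Figure: entity set E1 (k̅1, a̅1), weak entity set E2 (k̅2, a̅2) depending on E1, and weak entity set E3 (k̅3, a̅3) depending on E2]
Constraints:
πk1(E2) ⊆ πk1(E1)
πk1,k2(E3) ⊆ πk1,k2(E2)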
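In SQL terms, these inclusion constraints correspond to foreign keys, and the growing keys correspond to composite primary keys. The sketch below applies this to the Chain / Store / Department example using Python’s built-in sqlite3 module; the table names, column names, and sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE chain (
    chain_name TEXT PRIMARY KEY
);
-- Weak entity set: its key is its own name combined with the chain's key.
CREATE TABLE store (
    chain_name TEXT REFERENCES chain (chain_name),
    store_name TEXT,
    PRIMARY KEY (chain_name, store_name)
);
-- Weak entity set depending on store: the key grows by one more component.
CREATE TABLE department (
    chain_name      TEXT,
    store_name      TEXT,
    department_name TEXT,
    PRIMARY KEY (chain_name, store_name, department_name),
    FOREIGN KEY (chain_name, store_name)
        REFERENCES store (chain_name, store_name)
);
""")

conn.execute("INSERT INTO chain VALUES ('ChainA')")
conn.execute("INSERT INTO store VALUES ('ChainA', 'Downtown')")
conn.execute("INSERT INTO department VALUES ('ChainA', 'Downtown', 'Toys')")

try:
    # No store ('ChainA', 'Uptown') exists, so the inclusion constraint fails.
    conn.execute("INSERT INTO department VALUES ('ChainA', 'Uptown', 'Toys')")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

department_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
print(orphan_accepted, department_count)
```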
ISA
• ISA – a branching weak entity without key
components in the subclass
[Figure: superclass E1 (k̅1, a̅1) with subclasses E2 (a̅2) and E3 (a̅3), connected through ISA triangles]
ISA – Translations and Constraints
[Figure: superclass E1 (k̅1, a̅1) with subclasses E2 (a̅2) and E3 (a̅3), each connected through its own ISA triangle]
E1: k̅1, a̅1
E2: k̅1, a̅2
E3: k̅1, a̅3
Constraints:
πk1(E2) ⊆ πk1(E1)
πk1(E3) ⊆ πk1(E1)
Exclusive ISA
[Figure: superclass E1 (k̅1, a̅1) with subclasses E2 (a̅2) and E3 (a̅3), connected through a single ISA triangle]
E1: k̅1, a̅1
E2: k̅1, a̅2
E3: k̅1, a̅3
Constraints:
πk1(E2) ⊆ πk1(E1)
πk1(E3) ⊆ πk1(E1)
πk1(E2) ∩ πk1(E3) = ∅
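These constraints can be checked directly with set operations on key projections. The following Python sketch is one possible way to do it; the function names, the dictionary-based rows, and the sample movies are hypothetical.

```python
def project(rows, cols):
    """Relational projection: the set of distinct value tuples over cols."""
    return {tuple(row[c] for c in cols) for row in rows}


def check_exclusive_isa(superclass, subclasses, key_cols):
    """Check inclusion in the superclass and pairwise disjointness of subclasses."""
    super_keys = project(superclass, key_cols)
    seen = set()  # keys already claimed by an earlier subclass
    for rows in subclasses:
        keys = project(rows, key_cols)
        if not keys <= super_keys:  # inclusion: every subclass key is a superclass key
            return False
        if keys & seen:             # exclusivity: no key appears in two subclasses
            return False
        seen |= keys
    return True


# Hypothetical data for Movie with the subclasses Cartoon and Nature Film.
key = ["movie_name", "year"]
movies = [
    {"movie_name": "M1", "year": 2001, "genre": "G1"},
    {"movie_name": "M2", "year": 2002, "genre": "G2"},
    {"movie_name": "M3", "year": 2003, "genre": "G3"},
]
cartoons = [{"movie_name": "M1", "year": 2001, "animator": "X"}]
nature_films = [{"movie_name": "M2", "year": 2002, "location": "Y"}]
overlapping = nature_films + [{"movie_name": "M1", "year": 2001, "location": "Z"}]

valid = check_exclusive_isa(movies, [cartoons, nature_films], key)
invalid = check_exclusive_isa(movies, [cartoons, overlapping], key)
print(valid, invalid)
```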
Covering All ISA
[Figure: superclass E1 (k̅1, a̅1) with subclasses E2 (a̅2) and E3 (a̅3), connected through an ISA triangle drawn with thick lines]
E1: NO Table!
E2: k̅1, a̅1, a̅2
E3: k̅1, a̅1, a̅3
Constraints:
πk1(E2) ∩ πk1(E3) = ∅