Structuring system requirements: Conceptual data modelling

Structuring system requirements: Conceptual data modelling - ERDs
Objectives
Define the key data modelling terms:
o Conceptual data model
o Entity Relationship Diagram
o Entity type
o Entity instance
o Attribute
o Candidate key
o Multivalued attribute
o Relationship
o Degree
o Cardinality
o Associative entity
Ask appropriate questions to determine data requirements for an information system.
Draw an ERD.
Explain the role of the conceptual data model.
Distinguish between and give examples of unary, binary and ternary relationships.
Distinguish between relationships and associative entities and use an associative entity in an ERD.
Relate data modelling to process and logic modelling.
Conceptual data modelling
A conceptual data model is a detailed model that shows the overall structure of organisational
data while being independent of any database management system or other implementation
considerations. Its purpose is to show as many rules about the meaning and interrelationships
among data as possible.
Process
The first step is to develop a data model for the current system.
Next, build a new data model that includes all the data requirements for the new system.
In the design stage, the conceptual model is translated into a physical design.
Using the project repository, all data modelling and design steps can be traced.
Deliverables and outcomes
The primary deliverable for the conceptual data-modelling step is the Entity Relationship
Diagram (ERD).
As many as four ERDs can be produced and analysed during conceptual data modelling.
These are:
DB ERD
1/18
1. An ERD that covers just the data needed in the project’s application
2. An ERD for the application system being replaced
3. An ERD for the whole database from which the new application’s data are extracted
4. An ERD for the whole database from which data for the application system being replaced is drawn
The other deliverable is a set of entries about data objects to be stored in the project dictionary
or repository. The repository is a mechanism to link data, process and logic models of an
information system.
Gathering information
During requirements determination, investigations have to be undertaken and questions asked that
focus on the data rather than on the process and logic. Two perspectives can be used:
Top-down approach – the data model is derived from an intimate understanding of the nature of
the business.
Bottom-up approach – the information for data modelling is gathered by reviewing specific
business documents.
Introduction to Entity-relationship diagrams (ERDs)
An Entity-Relationship Diagram (ERD) is a detailed, logical and graphical representation of the
entities, associations and data elements for an organisation or business area.
The basic modelling notation contains three main constructs:
Data entities
Relationships
Attributes
The following symbols are used to construct ERDs
Entities
An entity is a person, place, object, event or concept in the user environment about which the
organisation wishes to maintain data.
An entity type is a collection of entities that share common properties or characteristics. For
example, Employee and Student are both entity types that describe people.
An entity instance is a single occurrence of an entity type. For example, each individual
employee is an instance of the entity type Employee.
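The difference between an entity type and an entity instance can be sketched in Python. This is only an illustration: the class stands in for the entity type, each object stands in for one instance, and the attribute names and sample values are assumptions made for the example.

```python
from dataclasses import dataclass


# The class plays the role of the entity type EMPLOYEE.
@dataclass(frozen=True)
class Employee:
    employee_id: int    # hypothetical attribute
    employee_name: str  # hypothetical attribute


# Each object is one entity instance of that type (sample data is invented).
employees = [
    Employee(1, "A. Nguyen"),
    Employee(2, "B. Smith"),
]

instance_count = len(employees)
print(f"Entity type Employee has {instance_count} instances")
```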
Attributes
An attribute is a named property or characteristic of an entity that is of interest to the
organisation.
Examples of attributes for the entity STUDENT are Student_ID and Student_Name.
Candidate keys and identifiers
A candidate key is an attribute (or combination of attributes) that uniquely identifies each
instance of an entity type. A candidate key for STUDENT could be Student_ID.
The identifier is a candidate key that has been selected as the unique, identifying characteristic
for an entity type.
The following rules should be applied when selecting an identifier:
Choose a candidate key that will not change its value over time.
Choose a candidate key that will always have a value and never be null.
Avoid using intelligent keys. These are keys whose structure encodes other information, such as an abbreviation of a location.
Consider substituting single-value surrogate keys for large composite keys.
For each entity, the name of the identifier is underlined on the ERD.
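The uniqueness and never-null rules can be checked against sample data. The sketch below is a minimal illustration, assuming entity instances are held as Python dictionaries; the function name `is_candidate_key` and the sample rows are hypothetical. Passing the check on sample data does not prove that an attribute is a candidate key, but failing it rules the attribute out.

```python
def is_candidate_key(rows, attributes):
    """Return True if the attribute combination is never null and unique across rows."""
    seen = set()
    for row in rows:
        value = tuple(row.get(a) for a in attributes)
        if any(v is None for v in value):
            return False  # an identifier must always have a value
        if value in seen:
            return False  # two instances share the same value
        seen.add(value)
    return True


# Invented sample instances of STUDENT.
students = [
    {"Student_ID": 1001, "Student_Name": "Lee"},
    {"Student_ID": 1002, "Student_Name": "Lee"},
]

print(is_candidate_key(students, ["Student_ID"]))    # unique in this sample
print(is_candidate_key(students, ["Student_Name"]))  # duplicated in this sample
```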
Multivalued attributes
A multivalued attribute is an attribute that may take on more than one value for each entity
instance. An example would be Dept_Name as an attribute of the entity EMPLOYEE when an
EMPLOYEE can work for more than one department.
It can be represented on the ERD in two ways:
A double-lined ellipse
A weak entity
A repeating group is a set of two or more multivalued attributes that are logically related.
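The weak-entity option can be illustrated by moving the multivalued attribute into its own set of rows, one row per value. This is a sketch under assumptions: the function name `split_multivalued`, the Employee_ID attribute and the sample data are invented for the example.

```python
def split_multivalued(rows, key, attribute):
    """Split a multivalued attribute out of each row into separate child rows."""
    # Parent rows keep every attribute except the multivalued one.
    parents = [{k: v for k, v in row.items() if k != attribute} for row in rows]
    # One child row per value, tied back to the parent by its key.
    children = [
        {key: row[key], attribute: value}
        for row in rows
        for value in row[attribute]
    ]
    return parents, children


# Invented sample: employee 1 works for two departments.
employees = [
    {"Employee_ID": 1, "Dept_Name": ["Sales", "Marketing"]},
    {"Employee_ID": 2, "Dept_Name": ["Finance"]},
]

parents, children = split_multivalued(employees, "Employee_ID", "Dept_Name")
print(parents)
print(children)
```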
Relationships
A relationship in an ERD is an association between the instances of one or more entity types
that is of interest to the organisation.
This usually means that an event has occurred or that some natural linkage exists between the
entity instances.
Relationships are always labelled with verb phrases.
Conceptual data modelling and the ER model
The goal of conceptual data modelling is to capture as much of the meaning of data as
possible. The more detail that can be modelled, the better the system that can be designed and
built.
Degree of a relationship
Degree is the number of entity types that participate in a relationship.
A unary (or recursive) relationship is a relationship between the instances of one entity type.
A binary relationship is a relationship between instances of two entity types.
A ternary relationship is a simultaneous relationship among instances of three entity types.
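The definition of degree can be sketched as a count of the distinct entity types that take part in a relationship. The relationship names and entity types below are hypothetical examples, not taken from these notes.

```python
DEGREE_NAMES = {1: "unary", 2: "binary", 3: "ternary"}


def degree(participants):
    """Degree = number of distinct entity types taking part in the relationship."""
    return len(set(participants))


# Hypothetical relationships, each listed with its participating entity types.
relationships = {
    "manages": ["EMPLOYEE", "EMPLOYEE"],          # an employee manages employees
    "enrolled in": ["STUDENT", "COURSE"],
    "supplies": ["VENDOR", "PART", "WAREHOUSE"],
}

named = {name: DEGREE_NAMES[degree(p)] for name, p in relationships.items()}
for name, kind in named.items():
    print(f"{name}: {kind}")
```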
Cardinalities in relationships
Cardinality is the number of instances of one entity that can (or must) be associated with each
instance of another entity.
Minimum cardinality is the minimum number of instances of one entity that may be associated
with each instance of another entity.
Maximum cardinality is the maximum number of instances of one entity that may be associated
with each instance of another entity.
(n is a number for an upper limit, if one exists)
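Minimum and maximum cardinality can be read as bounds on a count. The sketch below, with a hypothetical function name and invented data, reports the instances whose number of related instances falls outside the bounds. It assumes the rule "each student is enrolled in exactly one program" (minimum 1, maximum 1).

```python
def check_cardinality(pairs, left_instances, minimum, maximum=None):
    """Return the left instances whose count of related right instances is
    outside [minimum, maximum]; maximum=None means there is no upper limit."""
    counts = {left: 0 for left in left_instances}
    for left, _right in pairs:
        counts[left] = counts.get(left, 0) + 1
    return [
        left
        for left, n in counts.items()
        if n < minimum or (maximum is not None and n > maximum)
    ]


# Invented sample data: (student, program) pairs.
students = ["S1", "S2", "S3"]
enrolments = [("S1", "P1"), ("S2", "P1"), ("S2", "P2")]

violations = check_cardinality(enrolments, students, minimum=1, maximum=1)
print(violations)  # S2 has two programs, S3 has none
```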
Associative entities
An associative entity is an entity type that associates the instances of one or more entity types and
contains attributes that are peculiar to the relationship between those entity instances.
It is a relationship that can be modelled as an entity type.
Step-by-step construction of ERDs
There are several ways of constructing ERDs. This is just one.
1. Identify entities
Identify the roles, events, locations, tangible things or concepts about which the end users want to store data.
2. Find relationships
Find the natural associations between pairs of entities using a relationship matrix.
3. Draw rough ERD
Construct an ERD using the entities and relationships already identified.
4. Fill in cardinality
Determine the number of occurrences of one entity for a single occurrence of the related entity.
5. Define primary keys
Identify the data attribute(s) that uniquely identify one and only one occurrence of each entity.
6. Draw key-based ERD
Eliminate many-to-many relationships and include primary and foreign keys in each entity.
7. Identify attributes
Name the information details (fields) that are essential to the system under development.
8. Map attributes
For each attribute, match it with exactly one entity that it describes.
9. Draw fully attributed ERD
Adjust the ERD from step 6 to account for entities or relationships discovered in step 8.
10. Check results
Does the final ERD accurately depict the system data?
Step by Step ERD Example
The scenario
A University contains many Faculties. The Faculties in turn are divided into several Schools.
Each School offers numerous programs and each program contains many courses. Lecturers
can teach many different courses and even the same course numerous times. Courses can
also be taught by many lecturers. A student is enrolled in only one program but a program can
contain many students. Students can be enrolled in many courses at the same time and the
courses have many students enrolled.
Step 1 - Identify Entities
The entities in this scenario are:
University
Faculty
School
Program
Course
Lecturer
Student
Step 2 - Find relationships
Entity        Relationship    Related entity
University    contains        Faculty
Faculty       divided into    School
School        offers          Program
School        employs         Lecturer
Program       contains        Course
Course        taken by        Student
Lecturer      taught          Course
Student       enrolled        Program
Student       enrolled        Course
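A relationship matrix like the one above can be built from a list of (row entity, verb phrase, column entity) triples. This is a sketch only: the placement of each verb phrase in a particular row and column is an assumption based on the scenario text, and `build_matrix` is a hypothetical helper.

```python
ENTITIES = ["University", "Faculty", "School", "Program",
            "Course", "Lecturer", "Student"]

# Assumed placement of each relationship: (row entity, verb phrase, column entity).
RELATIONSHIPS = [
    ("University", "contains", "Faculty"),
    ("Faculty", "divided into", "School"),
    ("School", "offers", "Program"),
    ("School", "employs", "Lecturer"),
    ("Program", "contains", "Course"),
    ("Course", "taken by", "Student"),
    ("Lecturer", "taught", "Course"),
    ("Student", "enrolled", "Program"),
    ("Student", "enrolled", "Course"),
]


def build_matrix(entities, relationships):
    """Return a nested dict: matrix[row][column] holds the verb phrase or ''."""
    matrix = {row: {col: "" for col in entities} for row in entities}
    for row, verb, col in relationships:
        matrix[row][col] = verb
    return matrix


matrix = build_matrix(ENTITIES, RELATIONSHIPS)
for row in ENTITIES:
    cells = " | ".join(matrix[row][col].ljust(12) for col in ENTITIES)
    print(row.ljust(12), "|", cells)
```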
Step 3 - Draw rough ERD
Step 4 - Fill in cardinality
The university contains many faculties
Each faculty is divided into several schools
Each school offers numerous programs
Each program contains many courses
Each school employs many lecturers
Lecturers can teach many courses
Lecturers can teach the same course many times
Courses can be taught by more than one lecturer
A student is enrolled in only one program
Students can be enrolled in many courses at the same time
Courses have many students enrolled
Step 5 - Define primary keys
The primary keys could be
University – University name
Faculty – Faculty name
School – School name
Program – Program code
Course – Course number
Lecturer – Employee number
Student – Student number
Step 6 - Draw key-based ERD
In this step, any many-to-many relationships have to be eliminated. In the ERD so far, two
relationships fall into this category: Lecturer – Course and Course – Student.
Associative entities have been included to resolve them.
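The elimination of many-to-many relationships can be sketched as a small transformation: each M:N relationship is replaced by an associative entity with a one-to-many relationship from each original entity. The function name, the "1:N" and "M:N" labels and the naming of the associative entities are assumptions made for this illustration.

```python
def resolve_many_to_many(relationships):
    """Replace each M:N relationship with an associative entity and two 1:N links."""
    resolved, associative = [], []
    for left, right, kind in relationships:
        if kind == "M:N":
            name = f"{left}_{right}"  # hypothetical naming convention
            associative.append(name)
            resolved.append((left, name, "1:N"))
            resolved.append((right, name, "1:N"))
        else:
            resolved.append((left, right, kind))
    return resolved, associative


# A subset of the scenario's relationships.
rough = [
    ("School", "Program", "1:N"),
    ("Lecturer", "Course", "M:N"),
    ("Course", "Student", "M:N"),
]

resolved, associative = resolve_many_to_many(rough)
print(associative)
for relationship in resolved:
    print(relationship)
```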
Step 7 - Identify attributes
In the scenario there are no attributes indicated, so it is up to the analyst to ascertain what data
needs to be kept about each particular entity.
For example, other attributes for Lecturer could be:
Employee Name
Employee Address
Speciality
Step 8 - Map attributes
An example of mapping the attributes would be:
Attribute          Entity      Attribute         Entity
Employee_name      Lecturer    Faculty_name      Faculty
Employee_number    Lecturer    Student_number    Student
Course_number      Course      Student_name      Student
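The rule that each attribute is matched with exactly one entity can be checked mechanically. The following sketch uses the attribute names from the example; the function name `group_by_entity` is hypothetical.

```python
from collections import defaultdict

# (attribute, entity) pairs from the mapping example.
MAPPING = [
    ("Employee_name", "Lecturer"),
    ("Faculty_name", "Faculty"),
    ("Employee_number", "Lecturer"),
    ("Student_number", "Student"),
    ("Course_number", "Course"),
    ("Student_name", "Student"),
]


def group_by_entity(pairs):
    """Group attributes under their entity, rejecting an attribute mapped twice."""
    owners = {}
    for attribute, entity in pairs:
        if attribute in owners and owners[attribute] != entity:
            raise ValueError(f"{attribute} is mapped to more than one entity")
        owners[attribute] = entity
    grouped = defaultdict(list)
    for attribute, entity in owners.items():
        grouped[entity].append(attribute)
    return dict(grouped)


grouped = group_by_entity(MAPPING)
for entity, attributes in grouped.items():
    print(entity, attributes)
```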
Step 9 - Draw fully attributed ERD
This is an example of what all the entities should look like when they have been fully attributed.
Step 10 - Check results
Does the final model depict the system well? If there are any discrepancies, the ERD will have
to be adjusted.
(Developed by D. Carpenter BBus (IS) Grad Dip (VET), Central Queensland University, 2003)
ERD - Example 1
There may be more than one way to draw an ERD. The solution given is just one
alternative.
Prepare an E-R diagram for a real estate firm that lists property for sale. The following
describes this organisation:
The firm has a number of sales offices in several states. Attributes of sales office include
Office_Number (identifier/key) and Location.
Each sales office is assigned one or more employees. Attributes of employee include
Employee_ID (identifier/key) and Employee_Name. An employee must be assigned to only one
sales office.
For each sales office, there is always one employee assigned to manage that office. An employee
may manage only the sales office to which he/she is assigned.
The firm lists property for sale. Attributes of property include Property_ID (identifier) and
Location. Components of Location include Address, City, State, and Zip_Code.
Each unit of property must be listed with one (and only one) of the sales offices. A sales office
may have any number of properties listed, or may have no properties listed.
Each unit of property has one or more owners. Attributes of owners are Owner_ID (identifier) and
Owner_Name. An owner may own one or more units of property. An attribute of the association
between property and owner is Percent_Owned.
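Percent_Owned belongs to the association between property and owner, not to either entity alone, which is why an associative entity is a natural fit. The sketch below illustrates this with a hypothetical `Ownership` class and invented sample data.

```python
from dataclasses import dataclass


# Associative entity linking one property to one owner.
@dataclass(frozen=True)
class Ownership:
    property_id: str      # identifier of the property
    owner_id: str         # identifier of the owner
    percent_owned: float  # attribute of the association itself


# Invented sample data: P1 has two owners, O1 owns two properties.
ownerships = [
    Ownership("P1", "O1", 60.0),
    Ownership("P1", "O2", 40.0),
    Ownership("P2", "O1", 100.0),
]


def owners_of(property_id):
    """Owner IDs for one property, in listing order."""
    return [o.owner_id for o in ownerships if o.property_id == property_id]


def properties_of(owner_id):
    """Property IDs for one owner, in listing order."""
    return [o.property_id for o in ownerships if o.owner_id == owner_id]


print(owners_of("P1"))
print(properties_of("O1"))
```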
Rough ERD
ERD with Cardinality
Key-based ERD
Fully attributed ERD with keys
Entity Relationship Diagram Example 2
Draw a rough draft ERD with cardinality for the following scenario. (Remember to rationalise
any many-to-many relationships.)
A swim meet director needs to assign swimmers to events and keep track of swimmers of different
age groups belonging to swim clubs. A swimmer can belong to only one club at a time and is
always attached to a club. A swimmer must be entered in at least one event. Each swimmer is
in exactly one age group. An event must have at least two swimmers, but there is no upper limit
to the number of swimmers who can enter the event.
Additional Questions
Question 1:
Draw a data model, in the form of an entity-relationship diagram, for the following system. The
system is required to store information about movies for a movie hire business. The users wish
to keep track of the following data:
Members have a username, password and address. No two members are allowed to have the same
username.
Every movie has a title, year released and category.
A movie has only one director, but a director may have directed more than one movie.
The director’s first and last names as well as address are stored.
The date the movie is hired is recorded, as well as the date the movie is due to be returned.
Note: Be sure to include all attributes, relationships and minimum and maximum cardinalities.
Question 2:
For the ERD you have just drawn:
Which are entities? Are there any associative entities?
What is the primary key of MEMBER?
What is the primary key of MOVIE HIRE?
What is the cardinality of each of the relationships?
What is the purpose of Primary Keys?
What is the purpose of Foreign Keys?
Question 3:
Transform the following statement about order processing in an organisation into
an ERD.
The staff are responsible for orders, which are identified by an ORDER-NO and have an ORDER-DATE, DESCRIPTION, and QUOTED-PRICE. Each order is from one customer. Only one Staff
member is responsible for a given order, but a Staff member may be responsible for many orders.
The organisation manufactures the order in a series of jobs. The Staff member responsible for an
order makes formal requests to sections to carry out these jobs. The requests are identified by a
REQUEST-NO. They nominate a START-DATE and an END-DATE for each request.
A number of jobs can be created by a section in response to a request. Each job is identified by a
JOB-NO and has a COST. All jobs for one request go to the same section, which is identified by
SECTION-ID and has one MANAGER.
Each job uses a QTY-USED of one or more materials. Materials are identified by MAT-ID and
have a MAT-DESCRIPTION.
Staff in the organisation are identified by a STAFF-ID and have a SURNAME, FIRST-NAME
and DATE-OF-BIRTH.
Additional Question Answers
Question 1
Question 2
ENTITIES: Member; Movie; Director
ASSOCIATIVE ENTITIES: Movie Hire
PRIMARY KEY OF MEMBER: Username
PRIMARY KEY OF MOVIE HIRE: Concatenated key of Username and Movie ID
PURPOSE OF PRIMARY KEYS: A primary key is an attribute (or combination of attributes) that
serves as a unique identifier across all occurrences of a relation.
PURPOSE OF FOREIGN KEYS: A foreign key links one relation to another and must satisfy
referential integrity. This specifies that the value of an attribute in one relation depends on the
value of the same attribute in another relation.
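The purpose of primary and foreign keys can be demonstrated with Python's built-in sqlite3 module. This is a sketch, not the model answer: the table and column names, the data types and the sample rows are assumptions, and Director is left out to keep the example short. A duplicate primary key and a foreign key with no matching member are both rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE member (
    username TEXT PRIMARY KEY,
    password TEXT,
    address  TEXT
);
CREATE TABLE movie (
    movie_id      INTEGER PRIMARY KEY,
    title         TEXT,
    year_released INTEGER,
    category      TEXT
);
CREATE TABLE movie_hire (
    username   TEXT    NOT NULL REFERENCES member(username),
    movie_id   INTEGER NOT NULL REFERENCES movie(movie_id),
    date_hired TEXT,
    date_due   TEXT,
    PRIMARY KEY (username, movie_id)  -- concatenated key
);
""")

# Invented sample rows.
conn.execute("INSERT INTO member VALUES ('ann', 'secret', '1 High St')")
conn.execute("INSERT INTO movie VALUES (1, 'Example Movie', 2001, 'Drama')")
conn.execute("INSERT INTO movie_hire VALUES ('ann', 1, '2003-05-01', '2003-05-08')")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Primary key: a second member with the same username is not allowed.
duplicate_member = try_insert("INSERT INTO member VALUES ('ann', 'other', '2 Low St')")
# Foreign key: a hire must refer to an existing member.
orphan_hire = try_insert("INSERT INTO movie_hire VALUES ('bob', 1, '2003-05-02', '2003-05-09')")

print(duplicate_member, orphan_hire)
```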
Question 3
Summary
Before you go on to the review questions, spend a few moments thinking about each of the
following key terms. If you are not really sure about the meanings, revisit the various areas in the
textbook to refresh your memory.
Associative entity       Degree                        Multivalued attribute
Attribute                Entity                        Relationship
Binary relationship      Entity instance (instance)    Repeating group
Candidate key            ERD                           Ternary relationship
Cardinality              Entity type                   Unary relationship (recursive relationship)
Conceptual data model    Identifier