COP 3540 - Introduction to Database Structures
Relational Model
Relational Data Model
The relational data model was first introduced in 1970 by
E. F. Codd, then of IBM (Codd, 1970).
Most widely used model.
• Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
“Legacy systems” use older models.
• E.g., IBM’s IMS
Recent competitor: object-oriented model
• ObjectStore, Versant, Ontos
• A synthesis emerging: object-relational model
  - Informix Universal Server, UniSQL, O2, Oracle, DB2
Relational Data Model
• The relational data model represents data in the form of
relations (or tables).
• The relational data model consists of the following three
components:
1) Data structure: Data are organized in the form of
tables, with rows and columns.
2) Data manipulation: Powerful operations (typically
implemented using the SQL language) are used to
manipulate data stored in the relations.
3) Data integrity: The model includes mechanisms to
specify business rules that maintain the integrity of
data.
Relational Database: Definitions
Relational database: a set of relations
Relation: made up of 2 parts:
• Instance: a table, with rows and columns.
#Rows = cardinality, #fields = degree (or arity).
• Schema: specifies name of relation, plus name
and type of each column.
  - E.g., Students(sid: string, name: string, login:
string, age: integer, gpa: real).
Can think of a relation as a set of rows or tuples (i.e.,
all rows are distinct).
Relational Data Structure
A relation is a named, two-dimensional table of data. Each
relation (or table) consists of a set of named columns and
an arbitrary number of unnamed rows.
Example Instance of Students Relation
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Cardinality = 3, degree = 5, all rows distinct.
Do all columns in a relation instance have to be
distinct?
Terminology
• Relation: A relation is a table with columns and
rows.
• Attribute: An attribute is a named column of a
relation.
• Domain: A domain is the set of allowable values
for one or more attributes.
• Tuple: A tuple is a row of a relation.
• Degree: The degree of a relation is the number of
attributes it contains.
• Cardinality: The cardinality of a relation is the
number of tuples it contains.
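As a minimal sketch of these terms, the Students instance from the earlier example can be modeled in Python as a set of named tuples. The class name `Student` and the variable names are illustrative assumptions, not part of the course material.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """One tuple (row) of the Students relation; each field is an attribute."""
    sid: str
    name: str
    login: str
    age: int
    gpa: float


# A relation instance is a set of tuples, so duplicates cannot occur.
students = {
    Student("53666", "Jones", "jones@cs", 18, 3.4),
    Student("53688", "Smith", "smith@eecs", 18, 3.2),
    Student("53650", "Smith", "smith@math", 19, 3.8),
}

degree = len(Student._fields)   # number of attributes
cardinality = len(students)     # number of tuples

# Adding a tuple that is already present leaves the set unchanged.
with_duplicate = students | {Student("53666", "Jones", "jones@cs", 18, 3.4)}
```

Here `degree` is 5 and `cardinality` is 3, matching the figures quoted for the example instance.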
Terminology
• Relation schema: A named relation defined by a set
of attribute and domain name pairs.
Because every attribute has an associated domain,
there are constraints (called domain constraints) that
restrict the set of values allowed for the
attributes of relations.
• Relational database schema: A set of relation
schemas, each with a distinct name.
Quiz
The SQL Query Language
• Developed by IBM (System R) in the 1970s
• Need for a standard since it is used by many vendors
• Standards:
  - SQL-86
  - SQL-89 (minor revision)
  - SQL-92 (major revision)
  - SQL-99 (major extensions)
  - …
  - SQL-2011
Structured Query Language (SQL)
A database language should allow a user to:
• create the database and relation structures;
• perform basic data management tasks, such as the
insertion, modification, and deletion of data from
the relations;
• perform both simple and complex queries.
SQL has two major components:
• Data Definition Language (DDL) for defining the
database structure and controlling access to the data.
• Data Manipulation Language (DML) for retrieving
and updating data.
Data Definition Language (DDL)
Creating Relations in SQL
Creates the Students relation. Observe that the type (domain) of
each field is specified, and enforced by the DBMS whenever
tuples are added or modified.
CREATE TABLE Students (
sid CHAR(20),
name CHAR(20),
login CHAR(10),
age INTEGER,
gpa REAL)
As another example, the Enrolled table holds information about
courses that students take.
CREATE TABLE Enrolled (
sid CHAR(20),
cid CHAR(20),
grade CHAR(2))
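The two statements above can be tried out with Python's built-in `sqlite3` module. This is only a sketch under the assumption that an in-memory SQLite database is an acceptable stand-in for the course DBMS. One caveat: SQLite records the declared types but is more permissive about enforcing them than the slide's statement suggests, so other systems may behave more strictly.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

conn.execute("""
    CREATE TABLE Students (
        sid   CHAR(20),
        name  CHAR(20),
        login CHAR(10),
        age   INTEGER,
        gpa   REAL)
""")
conn.execute("""
    CREATE TABLE Enrolled (
        sid   CHAR(20),
        cid   CHAR(20),
        grade CHAR(2))
""")

# PRAGMA table_info yields one row per column:
# (position, name, declared type, notnull flag, default, pk flag)
students_schema = [(row[1], row[2])
                   for row in conn.execute("PRAGMA table_info(Students)")]
enrolled_columns = [row[1]
                    for row in conn.execute("PRAGMA table_info(Enrolled)")]
```

Reading the schema back this way shows that the DBMS stores the relation schema (names and types) separately from any tuples.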
Create Table
Destroying and Altering Relations
DROP TABLE Students
Destroys the relation Students. The schema information
and the tuples are deleted.
ALTER TABLE Students
ADD COLUMN firstYear INTEGER
The schema of Students is altered by adding a new
field; every tuple in the current instance is extended with
a null value in the new field.
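The effect of ALTER TABLE and DROP TABLE can be observed with a small `sqlite3` sketch. The single sample row is taken from the Students example; the use of SQLite and the variable names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
""")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)")

# Add the new field; the existing tuple is extended with a null value.
conn.execute("ALTER TABLE Students ADD COLUMN firstYear INTEGER")
rows_after_alter = conn.execute("SELECT sid, firstYear FROM Students").fetchall()

# Destroy the relation: both the schema information and the tuples go away.
conn.execute("DROP TABLE Students")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'Students'"
).fetchall()
```

Python reports the SQL null as `None`, so the existing row comes back as `('53666', None)`.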
Constraints
The relational data model includes several types of
constraints, or rules limiting acceptable values and
actions, whose purpose is to facilitate maintaining the
accuracy and integrity of data in the database. The
major types of constraints are:
1) Domain Constraints
2) Entity Integrity
3) Referential Integrity
Domain Constraints
All of the values that appear in a column of a relation
must be from the same domain. A domain is the set
of values that may be assigned to an attribute.
A domain definition usually consists of the following
components:
(1) domain name,
(2) meaning,
(3) data type, size (or length), and
(4) allowable values or allowable range (if applicable).
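One common way to state an allowable range in SQL is a CHECK constraint. The sketch below assumes SQLite and uses hypothetical ranges (age between 0 and 150, gpa between 0.0 and 4.0) that are not given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid  CHAR(20),
        name CHAR(20),
        age  INTEGER CHECK (age BETWEEN 0 AND 150),   -- assumed range
        gpa  REAL    CHECK (gpa BETWEEN 0.0 AND 4.0)) -- assumed range
""")

conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 18, 3.4)")

# A gpa outside the allowable range violates the domain constraint.
try:
    conn.execute("INSERT INTO Students VALUES ('53688', 'Smith', 18, 7.5)")
    out_of_range_rejected = False
except sqlite3.IntegrityError:
    out_of_range_rejected = True

stored = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
```

Only the first tuple is stored; the second is rejected before it reaches the relation.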
Entity Integrity
The entity integrity rule is designed to ensure that
every relation has a primary key and that the data
values for that primary key are all valid. In particular,
it guarantees that every primary key attribute is
non-null.
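A minimal sketch of entity integrity using `sqlite3` follows. The column is declared NOT NULL explicitly because SQLite, unlike most systems, does not by itself reject nulls in a non-integer primary key column; that detail is specific to SQLite and is not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid  CHAR(20) NOT NULL,
        name CHAR(20),
        PRIMARY KEY (sid))
""")


def try_insert(sid, name):
    """Return True if the tuple is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?)", (sid, name))
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = try_insert("53666", "Jones")      # accepted
duplicate_ok = try_insert("53666", "Smith")  # same primary key value
null_ok = try_insert(None, "Smith")          # null primary key value
second_ok = try_insert("53688", "Smith")     # accepted
```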
Referential Integrity
A referential integrity constraint is a rule that
maintains consistency among the rows of two
relations. In the relational data model, associations
between tables are defined through the use of
foreign keys.
Relational Keys
• Superkey: An attribute, or set of attributes, that uniquely
identifies a tuple within a relation.
• Candidate key: A superkey such that no proper subset is a
superkey within the relation.
  - A candidate key K for a relation R has two properties:
    - Uniqueness. In each tuple of R, the values of K uniquely
    identify that tuple.
    - Irreducibility. No proper subset of K has the uniqueness
    property.
• Primary key: The candidate key that is selected to identify
tuples uniquely within the relation.
• Foreign key: An attribute, or set of attributes, within one
relation that matches the candidate key of some (possibly the
same) relation.
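Primary Key Constraints
• A set of fields is a key for a relation if:
  1. No two distinct tuples can have the same values in all key
  fields, and
  2. No proper subset of the key has this property.
  (If only the first condition holds, the set is a superkey.)
• Example
Students (sid, name, gpa)
  - sid is a key. (What about name?)
  - {sid, gpa} is a superkey.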
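The uniqueness and irreducibility tests can be sketched in plain Python against a single instance. The function names are hypothetical, and the checks only say whether this particular instance violates a proposed key; they cannot establish that a key holds for all possible instances.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey here and no non-empty proper subset is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Students(sid, name, gpa) with the values from the example instance.
students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]

sid_is_key = is_candidate_key(students, ("sid",))
name_is_superkey = is_superkey(students, ("name",))
sid_gpa_is_superkey = is_superkey(students, ("sid", "gpa"))
sid_gpa_is_key = is_candidate_key(students, ("sid", "gpa"))
gpa_unique_here = is_superkey(students, ("gpa",))
```

Note that `gpa_unique_here` is true for these three rows even though gpa is clearly not a key in general, which is why a key must be declared from the semantics of the data rather than read off an instance.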
Primary and Candidate Keys in SQL
• Possibly many candidate keys (specified using UNIQUE),
one of which is chosen as the primary key.
CREATE TABLE Enrolled (
sid CHAR(20),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid, cid) )
CREATE TABLE Enrolled (
sid CHAR(20),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid),
UNIQUE (cid, grade) )
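To see what the second definition actually permits, it can be loaded into SQLite and probed with a few inserts. This is a sketch; the helper name `try_insert` and the choice of test tuples are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enrolled (
        sid   CHAR(20),
        cid   CHAR(20),
        grade CHAR(2),
        PRIMARY KEY (sid),
        UNIQUE (cid, grade))
""")


def try_insert(sid, cid, grade):
    """Return True if the tuple is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", (sid, cid, grade))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("53666", "History105", "B")           # accepted
second_course = try_insert("53666", "Topology112", "A")  # sid already used
same_grade = try_insert("53650", "History105", "B")      # (cid, grade) already used
other_grade = try_insert("53650", "History105", "A")     # accepted
```

With these keys a student can be enrolled in only one course, and no two students in a course can receive the same grade, so key declarations need to match the intended semantics.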
Key Constraints in SQL
Foreign Keys, Referential Integrity
• Foreign key: Set of columns in one relation that is used
to ‘refer’ to a tuple in another relation. (Must correspond
to primary key of the second relation.) Like a ‘logical
pointer’.
CREATE TABLE Enrolled (
sid CHAR(20),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid, cid),
FOREIGN KEY (sid) REFERENCES Students (sid) )
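A short `sqlite3` sketch shows the foreign key acting as a checked "logical pointer". SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is a property of SQLite rather than of SQL in general; the sid value 99999 is a made-up example of a dangling reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
conn.execute("""
    CREATE TABLE Enrolled (
        sid   CHAR(20),
        cid   CHAR(20),
        grade CHAR(2),
        PRIMARY KEY (sid, cid),
        FOREIGN KEY (sid) REFERENCES Students (sid))
""")

conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")

# No student has sid 99999, so this Enrolled tuple would point at nothing.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'History105', 'A')")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
```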
Foreign Key Constraints
Data Manipulation Language (DML)
Insert Tuple
Delete Tuple
Update Tuple
Adding and Deleting Tuples
Can insert a single tuple using:
INSERT INTO Students (sid, name, login, age, gpa)
VALUES (53688, 'Smith', 'smith@ee', 18, 3.2);
Can delete all tuples satisfying some condition (e.g.,
name = 'Smith'):
DELETE FROM Students S
WHERE S.name = 'Smith'
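The three DML operations (insert, update, delete) can be run end to end with `sqlite3`. This is a sketch: the UPDATE statement and its new gpa value are assumptions added for illustration, and the table alias is omitted from DELETE to keep the statement portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
""")

# Insert tuples, one parameter set per row.
conn.executemany(
    "INSERT INTO Students (sid, name, login, age, gpa) VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

# Update one tuple (hypothetical new value).
conn.execute("UPDATE Students SET gpa = 3.6 WHERE sid = '53688'")
updated_gpa = conn.execute(
    "SELECT gpa FROM Students WHERE sid = '53688'").fetchone()[0]

# Delete every tuple that satisfies the condition.
deleted = conn.execute("DELETE FROM Students WHERE name = 'Smith'").rowcount
remaining = conn.execute("SELECT sid, name FROM Students").fetchall()
```

Both Smith tuples match the condition, so a single DELETE removes two rows.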
Enforcing Integrity Constraints
Assume that sid is primary key
Enforcing Referential Constraints
Enforcing Referential Integrity
Referential Integrity in SQL
• SQL/92 and SQL:1999 support all 4 options on deletes and
updates.
  - Default is NO ACTION (delete/update is rejected)
  - CASCADE (also delete all tuples that refer to deleted
  tuple)
  - SET NULL
  - SET DEFAULT
CREATE TABLE Employee (
EmpId CHAR(20),
EmpName CHAR(20),
DeptId CHAR(20),
PRIMARY KEY (EmpName),
FOREIGN KEY (DeptId)
REFERENCES Department (DeptId)
ON DELETE CASCADE
ON UPDATE SET DEFAULT )
Review
Two relations and a referential integrity constraint
REFERENTIAL INTEGRITY CONSTRAINT
• DELETE RESTRICT and UPDATE RESTRICT
CREATE TABLE employee(
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department);
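With no ON DELETE or ON UPDATE clause the default action applies, which rejects changes to a referenced department. The sketch below assumes SQLite and a hypothetical `department(deptid, deptname)` table, since the slides do not show its definition; all row values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite

# Hypothetical parent table; its columns are an assumption.
conn.execute("CREATE TABLE department (deptid CHAR(2) PRIMARY KEY, deptname CHAR(20))")
conn.execute("""
    CREATE TABLE employee (
        empid   CHAR(4),
        empname CHAR(20),
        deptid  CHAR(2),
        PRIMARY KEY (empid),
        FOREIGN KEY (deptid) REFERENCES department)
""")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("D1", "Sales"), ("D2", "Research")])
conn.execute("INSERT INTO employee VALUES ('E1', 'Ann', 'D1')")


def rejected(sql):
    """Run one statement; return True if an integrity constraint rejects it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


delete_blocked = rejected("DELETE FROM department WHERE deptid = 'D1'")
update_blocked = rejected("UPDATE department SET deptid = 'D7' WHERE deptid = 'D1'")
unreferenced_blocked = rejected("DELETE FROM department WHERE deptid = 'D2'")
```

Department D1 is protected because an employee refers to it, while D2, which no employee references, can still be deleted.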
DELETE RESTRICT example
REFERENTIAL INTEGRITY CONSTRAINT
• DELETE CASCADE
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department
ON DELETE CASCADE);
DELETE CASCADE example
REFERENTIAL INTEGRITY CONSTRAINT
• DELETE SET-TO-NULL
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department
ON DELETE SET NULL);
DELETE SET-TO-NULL example
DELETE SET-TO-DEFAULT example
REFERENTIAL INTEGRITY CONSTRAINT
• DELETE RESTRICT and UPDATE RESTRICT
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department);
UPDATE RESTRICT example
REFERENTIAL INTEGRITY CONSTRAINT
• UPDATE CASCADE
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department
ON UPDATE CASCADE);
UPDATE CASCADE example
REFERENTIAL INTEGRITY CONSTRAINT
• UPDATE SET-TO-NULL
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department
ON UPDATE SET NULL);
UPDATE SET-TO-NULL example
UPDATE SET-TO-DEFAULT example
REFERENTIAL INTEGRITY CONSTRAINT
• DELETE CASCADE and UPDATE SET-TO-NULL
CREATE TABLE employee (
empid CHAR(4),
empname CHAR(20),
deptid CHAR(2),
PRIMARY KEY (empid),
FOREIGN KEY (deptid) REFERENCES department
ON DELETE CASCADE
ON UPDATE SET NULL);
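The combined definition can be exercised in SQLite to watch both actions fire. As before, the `department` table layout and all row values are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical parent table; the slides do not define it.
conn.execute("CREATE TABLE department (deptid CHAR(2) PRIMARY KEY, deptname CHAR(20))")
conn.execute("""
    CREATE TABLE employee (
        empid   CHAR(4),
        empname CHAR(20),
        deptid  CHAR(2),
        PRIMARY KEY (empid),
        FOREIGN KEY (deptid) REFERENCES department
            ON DELETE CASCADE
            ON UPDATE SET NULL)
""")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("D1", "Sales"), ("D2", "Research")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("E1", "Ann", "D1"), ("E2", "Bob", "D1"), ("E3", "Cid", "D2")])

# Changing a referenced key sets the referring deptid values to null.
conn.execute("UPDATE department SET deptid = 'D9' WHERE deptid = 'D2'")
after_update = conn.execute(
    "SELECT empid, deptid FROM employee ORDER BY empid").fetchall()

# Deleting a department also deletes the employees that refer to it.
conn.execute("DELETE FROM department WHERE deptid = 'D1'")
after_delete = conn.execute(
    "SELECT empid, deptid FROM employee ORDER BY empid").fetchall()
```

After the update, employee E3 keeps its row but loses its department; after the delete, E1 and E2 are gone entirely.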
Where do ICs Come From?
• ICs are based upon the semantics of the real-world
enterprise that is being described in the database
relations.
• We can check a database instance to see if an IC is
violated, but we can NEVER infer that an IC is true by
looking at an instance.
  - An IC is a statement about all possible instances!
  - From the example, we know name is not a key, but the
  assertion that sid is a key is given to us.
• Key and foreign key ICs are the most common; more
general ICs are supported too.
Logical DB Design: Entity to Relation (or Table)
Use the E-R model to get a high-level graphical view of the essential
components of the enterprise and how they are related.
Then convert the E-R diagram to SQL DDL, or to whatever database
model you are using.
Logical DB Design: Entity to Relation (or Table)
• Entity sets to tables:
CREATE TABLE Employees (
ssn CHAR(11),
name CHAR(20),
lot INTEGER,
PRIMARY KEY (ssn))
Logical DB Design: Entity to Relation (or Table)
CREATE TABLE Departments (
)
Logical DB Design: Relationship to Table
• In translating a relationship to a relation, attributes of the
relation must include:
  - Keys for each participating entity set (as foreign keys).
  - All descriptive attributes.
Logical DB Design: Relationship to Table
CREATE TABLE Works_In (
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (did) REFERENCES Departments)
Logical DB Design: Relationship to Table
CREATE TABLE Manages (
)
Logical DB Design: Relationship to Table
Logical DB Design: Relationship Sets to Tables
CREATE TABLE Works_In2 (
ssn CHAR(11),
did INTEGER,
address CHAR(20),
since DATE,
PRIMARY KEY (ssn, did, address),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (did) REFERENCES Departments,
FOREIGN KEY (address) REFERENCES Locations)
Logical DB Design: Relationship to Table
Logical DB Design: Relationship Sets to Tables
Review: Constraints
Each department has at most one manager,
according to the key constraint on Manages.
Translation to relational model?
[Figure labels: 1-to-1, 1-to-Many, Many-to-1, Many-to-Many]
Translating Relationship Sets with Key Constraints
Translating Relationship with Key Constraints
Key constraint: at most one manager per department.
CREATE TABLE Manages (
ssn CHAR(11),
did INTEGER,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees (ssn),
FOREIGN KEY (did) REFERENCES Departments (did))
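Making did the primary key of Manages is what enforces "at most one manager per department". The sketch below checks this in SQLite. The Departments columns are borrowed from the Dept_Mgr definition and are an assumption, and all ssn, did, and date values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER)")
# Assumed layout for Departments.
conn.execute("CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL)")
conn.execute("""
    CREATE TABLE Manages (
        ssn   CHAR(11),
        did   INTEGER,
        since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees (ssn),
        FOREIGN KEY (did) REFERENCES Departments (did))
""")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)",
                 [("111-11-1111", "Ann", 10), ("222-22-2222", "Bob", 20)])
conn.executemany("INSERT INTO Departments VALUES (?, ?, ?)",
                 [(51, "Sales", 1000.0), (56, "Research", 2000.0)])


def try_manage(ssn, did):
    """Return True if the Manages tuple is accepted."""
    try:
        conn.execute("INSERT INTO Manages VALUES (?, ?, '2020-01-01')", (ssn, did))
        return True
    except sqlite3.IntegrityError:
        return False


ann_manages_51 = try_manage("111-11-1111", 51)  # accepted
bob_manages_51 = try_manage("222-22-2222", 51)  # second manager for dept 51
ann_manages_56 = try_manage("111-11-1111", 56)  # one employee, two departments
```

The key constraint limits each department to one manager but places no limit on how many departments one employee manages.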
Translating Relationship with Key Constraints
Key constraint: at most one manager per department.
CREATE TABLE Dept_Mgr (
did INTEGER,
dname CHAR(20),
budget REAL,
mgr_ssn CHAR(11),
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (mgr_ssn) REFERENCES Employees (ssn))
Review: Total Participation Constraint
• Every department should have a manager (i.e., exactly one
manager).
  - At most one: key constraint
  - At least one: total participation
Review: Participation Constraints
Key constraint: at most one manager. Total participation: at least one manager.
CREATE TABLE Dept_Mgr (
did INTEGER,
dname CHAR(20),
budget REAL,
mgr_ssn CHAR(11) NOT NULL,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (mgr_ssn) REFERENCES Employees (ssn)
ON DELETE NO ACTION )
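The two halves of "exactly one manager" can be checked separately in SQLite: NOT NULL rejects a department without a manager, and NO ACTION rejects deleting an employee who still manages a department. This is a sketch with invented row values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(20), lot INTEGER)")
conn.execute("""
    CREATE TABLE Dept_Mgr (
        did     INTEGER,
        dname   CHAR(20),
        budget  REAL,
        mgr_ssn CHAR(11) NOT NULL,
        since   DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (mgr_ssn) REFERENCES Employees (ssn)
            ON DELETE NO ACTION)
""")
conn.execute("INSERT INTO Employees VALUES ('111-11-1111', 'Ann', 10)")
conn.execute("INSERT INTO Dept_Mgr VALUES (51, 'Sales', 1000.0, '111-11-1111', '2020-01-01')")


def rejected(sql):
    """Run one statement; return True if an integrity constraint rejects it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Total participation: a department row must name a manager.
no_manager_rejected = rejected(
    "INSERT INTO Dept_Mgr VALUES (56, 'Research', 2000.0, NULL, '2020-01-01')")

# NO ACTION: the manager cannot be deleted while the department refers to them.
delete_manager_rejected = rejected(
    "DELETE FROM Employees WHERE ssn = '111-11-1111'")
```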
Review: Weak Entities
• A weak entity can be identified uniquely only by considering the
primary key of another (owner) entity.
Translating Weak Entity Sets
CREATE TABLE Dep_Policy (
pname CHAR(20),
age INTEGER,
cost REAL,
employee_ssn CHAR(11) NOT NULL,
PRIMARY KEY (pname, employee_ssn),
FOREIGN KEY (employee_ssn) REFERENCES Employees (ssn)
ON DELETE CASCADE)
Review: Binary vs. Ternary Relationships
[Figure: two ER diagrams over Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age). Bad design: one ternary relationship, Covers. Better design: two binary relationships, Purchaser and Beneficiary.]
• What are the additional constraints in the 2nd diagram?
Review: Binary vs. Ternary Relationships
[Figure: the better design, with Employees, Policies, and Dependents linked by Purchaser and Beneficiary.]
CREATE TABLE Policies (
policyid INTEGER,
cost REAL,
ssn CHAR(11) NOT NULL,
PRIMARY KEY (policyid),
FOREIGN KEY (ssn) REFERENCES Employees
ON DELETE CASCADE)
Binary vs. Ternary Relationships (Contd.)
[Figure: the better design, with Employees, Policies, and Dependents linked by Purchaser and Beneficiary.]
CREATE TABLE Dependents (
pname CHAR(20),
age INTEGER,
policyid INTEGER,
PRIMARY KEY (pname, policyid),
FOREIGN KEY (policyid) REFERENCES Policies
ON DELETE CASCADE)
Data Manipulation Language (DML)
Data Retrieval
The SQL Query Language
To find all 18-year-old students, we can write:
SELECT *
FROM Students S
WHERE S.age=18
sid    name   login     age  gpa
53666  Jones  jones@cs  18   3.4
53688  Smith  smith@ee  18   3.2
To find just names and logins, replace the first line:
SELECT S.name, S.login
FROM Students S
WHERE S.age=18
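Both queries can be run against the example Students instance with `sqlite3`. This is a sketch; the ORDER BY clause is an addition made here so that the row order is predictable, since SQL does not otherwise guarantee one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
""")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

# All columns of the 18-year-old students.
all_columns = conn.execute(
    "SELECT * FROM Students S WHERE S.age = 18 ORDER BY S.sid").fetchall()

# Only names and logins: the WHERE clause picks rows, the SELECT list picks columns.
names_and_logins = conn.execute(
    "SELECT S.name, S.login FROM Students S WHERE S.age = 18 ORDER BY S.sid"
).fetchall()
```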
Querying Multiple Relations (Join)
What does the following query compute?
SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid = E.sid AND E.grade = 'A'
Given the following instances of Enrolled and Students:
Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Enrolled
sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B
We get:
S.name  E.cid
Smith   Topology112
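The join can be reproduced with `sqlite3` using the two instances given above. In this sketch Enrolled is created without a foreign key, because the example instance contains sid 53831, which has no matching Students tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
""")
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")

conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    ("53831", "Carnatic101", "C"),
    ("53831", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
])

# Pair each student with their enrollments, then keep only grade 'A' pairs.
result = conn.execute("""
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'A'
""").fetchall()
```

The query returns the names of students who received an A together with the course in which they received it.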
Review
View
• A view is just a relation, but we store a definition, rather than a
set of tuples.
CREATE VIEW YoungActiveStudents (name, grade)
AS SELECT S.name, E.grade
FROM Students S, Enrolled E
WHERE S.sid = E.sid and S.age < 21
• Views can be dropped using the DROP VIEW command.
• How to handle DROP TABLE if there’s a view on the table?
  - DROP TABLE command has options to let the user
  specify this.
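Because a view stores a definition rather than tuples, its contents follow the base tables. The `sqlite3` sketch below illustrates this with the example instances; the extra Enrolled tuple inserted midway is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
""")
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
])

conn.execute("""
    CREATE VIEW YoungActiveStudents (name, grade)
    AS SELECT S.name, E.grade
       FROM Students S, Enrolled E
       WHERE S.sid = E.sid AND S.age < 21
""")
query = "SELECT name, grade FROM YoungActiveStudents ORDER BY name, grade"
before = conn.execute(query).fetchall()

# A new base-table tuple shows up in the view without redefining it.
conn.execute("INSERT INTO Enrolled VALUES ('53688', 'Reggae203', 'B')")
after = conn.execute(query).fetchall()

conn.execute("DROP VIEW YoungActiveStudents")
views_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'").fetchall()
```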
Relational Model: Summary
• A tabular representation of data.
• Simple and intuitive, currently the most widely used.
• Integrity constraints can be specified by the DBA, based on
application semantics. DBMS checks for violations.
  - Two important ICs: primary and foreign keys
  - In addition, we always have domain constraints.
• Powerful and natural query languages exist.
• Rules to translate ER to relational model