Temple University – CIS Dept.
CIS331– Principles of Database
Systems
V. Megalooikonomou
Relational Model I
(based on notes by Silberchatz,Korth, and Sudarshan and notes by C.
Faloutsos at CMU)
Overview
history
concepts
Formal query languages
relational algebra
rel. tuple calculus
rel. domain calculus
History
before: records, pointers, sets etc
introduced by E.F. Codd (1923-2003)
in 1970
revolutionary!!!
first systems: 1977-8 (System R; Ingres)
Turing award in 1981
Concepts
Database: a set of relations (= tables)
rows: tuples
columns: attributes (or keys)
superkey, candidate key, primary key
Example
Database:
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
SSN c-id
grade
123 15-413 A
234 15-413 B
Example: cont’d
Database:
STUDENT
Ssn
Name
123 smith
234 jones
k-th attribute
(Dk domain)
Address
main str
forbes ave
rel. schema (attr+domains)
tuple
Example: cont’d
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
rel. schema (attr+domains)
instance
Example: cont’d
•Di: the domain of the I-th attribute (eg., char(10)
•Formally: an instance is a subset of (D1 x D2 x …x Dn)
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
rel. schema (attr+domains)
instance
Example: cont’d
superkey (eg., ‘ssn , name’):
determines record
cand. key (eg., ‘ssn’, or ‘st#’):
minimal superkey (no subset of it is
a superkey)
primary key: one of the cand. keys
Another example
Example: if
Customer-name = {Jones, Smith, Curry, Lindsay}
Customer-street = {Main, North, Park}
Customer-city = {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield)}
is a relation over
Customer-name x Customer-street x Customer-city
Relations, tuples
Relation Schema:
A1, A2, …, An are attributes
R = (A1, A2, …, An ) is a relation schema
E.g. Customer-schema =
(customer-name, customer-street, customer-city)
r(R) is a relation on the relation schema R
E.g. customer (Customer-schema)
Relations are unordered
Order of tuples is irrelevant (tuples may be stored in an
arbitrary order)
Database
A database consists of multiple relations
Information about an enterprise is broken up into parts, with
each relation storing one part of the information
E.g.: account : stores information about accounts
depositor : stores information about which customer
owns which account
customer : stores information about customers
Storing all information as a single relation such as
bank(account-number, balance, customer-name, ..)
results in
repetition of information (e.g. two customers own an account)
the need for null values (e.g. represent a customer without an account)
Normalization theory (discuss later) deals with how to design
relational schemas
Overview
history
concepts
Formal query languages
relational algebra
rel. tuple calculus
rel. domain calculus
Formal query languages
How do we collect information?
Eg., find ssn’s of people in cis331
(recall: everything is a set!)
One solution: Relational algebra, i.e.,
set operators (procedural language)
Q1: Which operators??
Q2: What is a minimal set of operators?
Relational operators
.
.
.
set union
U
set difference ‘-’
Example:
Q: find all students (part or full time)
A: PT-STUDENT union FT-STUDENT
FT-STUDENT
Ssn
Name
129 peters
239 lee
main str
5th ave
PT-STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
Observations:
two tables are ‘union compatible’ if they
have the same attributes (i.e., same
arity: number of attributes and same
‘domains’)
Q: how about intersection
?
U
Observations:
A: redundant:
STUDENT intersection STAFF =
STUDENT - (STUDENT - STAFF)
STUDENT
STAFF
Relational operators
.
.
.
set union U
set difference ‘-’
Other operators?
E.g., find all students on ‘Main street’
A: ‘selection’
address 'mainstr'
STUDENT
Ssn
Name
123 smith
234 jones
( STUDENT)
Address
main str
forbes ave
Other operators?
Notice: selection (and rest of operators)
expect tables, and produce tables
--> can be cascaded!!
For selection, in general:
condition (RELATION )
Selection - examples
Find all ‘Smiths’ on ‘Forbes Ave’
name 'Smith' address 'Forbes ave' (STUDENT)
‘condition’ can be any boolean combination of ‘=‘,
‘>’, ‘>=‘, ...
Relational operators
selection
.
.
set union
set difference
condition (R )
RUS
R-S
Relational operators
selection picks rows - how about
columns?
A: ‘projection’ - eg.: ssn (STUDENT )
finds all the ‘ssn’ - removing duplicates
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
Relational operators
Cascading: ‘find ssn of students on
‘forbes ave’
ssn ( address ' forbes ave' (STUDENT))
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
Relational operators
selection
projection
.
set union
set difference
condition (R )
attlist (R)
RUS
R-S
Relational operators
Are we done yet?
Q: Give a query we can not answer yet!
Relational operators
A: any query across two or more tables,
eg., ‘find names of students in cis351’
Q: what extra operator do we need??
A: surprisingly, the cartesian product is
enough!
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
TAKES
SSN
c-id
123 cis331
234 cis331
grade
A
B
Cartesian product
E.g., dog-breeding: MALE x FEMALE
gives all possible couples
MALE
name
spike
spot
FEMALE
x name
lassie
shiba
=
M.name
spike
spike
spot
spot
F.name
lassie
shiba
lassie
shiba
so what?
Eg., how do we find names of students
taking cis351?
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
TAKES
SSN
c-id
123 cis331
234 cis331
grade
A
B
Cartesian product
A:
......... STUDENT .ssnTAKES . ssn ( STUDENT x TAKES )
Ssn
123
234
123
234
Name
smith
jones
smith
jones
Address
ssn
main str
123
forbes ave
123
main str
234
forbes ave
234
cid
cis331
cis331
cis331
cis331
grade
A
A
B
B
Cartesian product
..
( ( STUDENT.ssnTAKES .ssn (STUDENT x TAKES)))
cid cis351
Ssn
123
234
123
234
Name
smith
jones
smith
jones
Address
ssn
main str
123
forbes ave
123
main str
234
forbes ave
234
cid
cis331
cis331
cis331
cis331
grade
A
A
B
B
name (
cid cis351( ( STUDENT .ssnTAKES .ssn ( STUDENT x TAKES )))
)
Ssn
123
234
123
234
Name
smith
jones
smith
jones
Address
ssn
main str
123
forbes ave
123
main str
234
forbes ave
234
cid
cis331
cis331
cis331
cis331
grade
A
A
B
B
FUNDAMENTAL
Relational operators
condition (R )
selection
attlist (R)
projection
cartesian product MALE x FEMALE
RUS
set union
set difference
R- S
Relational ops
Surprisingly, they are enough, to help
us answer almost any query we want!!
derived operators, for convenience
set intersection
join (theta join, equi-join, natural join)
‘rename’ operator R ' ( R)
division R S
Joins
Equijoin: R
R . a S .b
S
R . a S .b ( R S )
Cartesian product
A:
......... STUDENT .ssnTAKES . ssn ( STUDENT x TAKES )
Ssn
123
234
123
234
Name
smith
jones
smith
jones
Address
ssn
main str
123
forbes ave
123
main str
234
forbes ave
234
cid
cis331
cis331
cis331
cis331
grade
A
A
B
B
Joins
Equijoin: R
R . a S .b
S R . a S .b ( R S )
theta-joins: R S
generalization of equi-join - any
condition
Joins
very popular: natural join: R
S
like equi-join, but it drops duplicate
columns:
STUDENT(ssn, name, address)
TAKES(ssn, cid, grade)
Joins
nat. join has 5 attributes STUDENT TAKES
Ssn
123
234
123
234
Name
smith
jones
smith
jones
equi-join: 6
Address
ssn
main str
123
forbes ave
123
main str
234
forbes ave
234
cid
cis331
cis331
cis331
cis331
grade
A
A
B
B
STUDENT STUDENT . ssnTAKES . ssn TAKES
Natural Joins - nit-picking
if no attributes in common between R, S:
nat. join -> cartesian product:
Overview - rel. algebra
fundamental operators
derived operators
joins etc
rename
division
examples
rename op.
Q: why? AFTER (BEFORE )
A:
Shorthand (BEFORE can be a relational
algebra expression)
self-joins; …
for example, find the grand-parents of
‘Tom’, given PC(parent-id, child-id)
rename op.
PC(parent-id, child-id)
PC
p-id
Mary
Peter
John
c-id
Tom
Mary
Tom
PC
p-id
Mary
Peter
John
PC PC
c-id
Tom
Mary
Tom
rename op.
first, WRONG attempt:
PC PC
(why? how many columns?)
Second WRONG attempt:
PC PC.cid PC. p id PC
rename op.
we clearly need two different names for
the same table - hence, the ‘rename’
op.
PC1 ( PC) PC1.cid PC. pid PC
Overview - rel. algebra
fundamental operators
derived operators
joins etc
rename
division
examples
Division
Rarely used, but powerful.
Suited for queries that include the
phrase “for all”
Example: find suspicious suppliers, i.e.,
suppliers that supplied all the parts in
A_BOMB
Division
SHIPMENT
s#
s1
s2
s1
s3
s5
p#
p1
p1
p2
p1
p3
ABOMB
p#
p1
p2
BAD_S
s#
s1
Division
Observations: ~reverse of cartesian
product
It can be derived from the 5
fundamental operators (!!)
How?
Division
Answer:
r s ( R S ) (r ) ( R S ) [( ( R S ) (r ) s) r ]
( R S ) [( ( R S ) (r ) s ) r ]
gives those tuples t in ( R S ) ( r )
such that for some tuple u in S, tu not in R.
Overview - rel. algebra
fundamental operators
derived operators
joins etc
rename
division
examples
Sample schema
find names of students that take cis351
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
TAKES
SSN
c-id
123 cis331
234 cis331
CLASS
c-id
c-name units
cis331 d.b.
2
cis321 o.s.
2
grade
A
B
Examples
find names of students that take cis351
name[ cid cis351(STUDENT TAKES)]
Sample schema
find course names of ‘smith’
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
TAKES
SSN
c-id
123 cis331
234 cis331
CLASS
c-id
c-name units
cis331 d.b.
2
cis321 o.s.
2
grade
A
B
Examples
find course names of ‘smith’
c name[ name 'smith' (
STUDENT TAKES CLASS
)]
Examples
find ssn of ‘overworked’ students,
ie., that take cis331, cis342, cis350
Examples
find ssn of ‘overworked’ students, ie., that
take cis331, cis342, cis350:
almost correct answer:
cname331 (TAKES ) I
cname342 (TAKES ) I
cname350 (TAKES )
Examples
find ssn of ‘overworked’ students, ie.,
that take cis331, cis342, cis350
- Correct answer:
ssn [ cname331 (TAKES )] I
ssn [ cname342 (TAKES )] I
ssn [ cname350 (TAKES )]
Examples
find ssn of students that work at least
as hard as ssn=123
(ie., they take all the courses of
ssn=123, and maybe more)
Sample schema
STUDENT
Ssn
Name
123 smith
234 jones
Address
main str
forbes ave
TAKES
SSN
c-id
123 cis331
234 cis331
CLASS
c-id
c-name units
cis331 d.b.
2
cis321 o.s.
2
grade
A
B
Examples
find ssn of students that work at least
as hard as ssn=123 (ie., they take all
the courses of ssn=123, and maybe
more
[ ssn,c id (TAKES )] c id [ ssn123 (TAKES )]
Conclusions
Relational model: only tables
(‘relations’)
relational algebra:
powerful, minimal:
5 operators can handle almost any query!
most non-trivial op.: join
The bank database
© Copyright 2026 Paperzz