R ⋈ * S = * * ( R × S) sid

Relational Algebra
R & G, Chapters
• Relational Algebra: 4.1 - 4.2
Architecture of a
DBMS
Last lecture:
• Finished tree indexes:
ISAM and B+-Trees
Today:
Relational Operators
SQL Client
Query Parsing
& Optimization
Relational Operators
Files and Tree-Index
Management
Buffer Management
Disk Space
Management
Database
Architecture of a
DBMS
Execute the dataflow by
operating on records and files
SQL Client
Query Parsing
& Optimization
Relational Operators
GroupBy (Age)
Files and Tree-Index
Management
Indexed Join
Buffer Management
Heap Scan
Reserves
Indexed Scan
Sailors
Disk Space
Management
Database
Architecture of a
DBMS
Previously (2 Weeks)
How do we store
and access data?
SQL Client
Query Parsing
& Optimization
Relational Operators
Files and Tree-Index
Management
Today
How de we
represent and
execute
computation?
Buffer Management
Disk Space
Management
Database
Big Picture Overview
SQL Query
Relational Algebra
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
𝜋S.name(𝜎bid=100⋀rating>5(
Query Parser
Reserves
⋈R.sid=S.sid Sailors))
Optimized (Physical) Query Plan:
(Logical) Query Plan:
𝜋S.name
𝜎R.bid=100 ⋀ S.rating > 5
⋈R.sid=S.sid
Reserves
Sailors
Query Optimizer
Operator Code
B+-Tree
Indexed Scan
Iterator
𝜋S.name
On-the-fly
Project Iterator
𝜎S.rating>5
On-the-fly
Select Iterator
⋈R.sid=S.sid
𝜎R.bid=100
Reserves
Sailors
Indexed Nested
Loop Join Iterator
Heap Scan
Iterator
Big Picture Overview
Relational Algebra
SQL Query
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
𝜋
(𝜎bid=100⋀rating>5(
Query ParserS.name
Reserves
Optimized (Physical) Query Plan:
Why?
(Logical) Query Plan:
𝜎R.bid=100 ⋀ S.rating > 5
A declarative
expression of the
⋈R.sid=S.sid
query
result
Reserves
Sailors
On-the-fly
Project Iterator
𝜋S.name
𝜋S.name
SQL
⋈R.sid=S.sid Sailors))
Query Optimizer
Relational Algebra
⋈ description of
OperationalR.sid=S.sid
𝜎S.rating>5
On-the-fly
Select Iterator
Indexed Nested
Loop Join Iterator
Operator Code
B+-Tree
Indexed Scan
Iterator
a computation.
𝜎R.bid=100
Sailors
Heap Scan
Iterator
SystemsReserves
optimize and execute
relational algebra query plan.
Big Picture Overview
Relational Algebra
SQL Query
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
𝜋
(𝜎bid=100⋀rating>5(
Query ParserS.name
Reserves
Optimized (Physical) Query Plan:
Why?
(Logical) Query Plan:
𝜎R.bid=100 ⋀ S.rating > 5
A declarative
expression of the
⋈R.sid=S.sid
query
result
Reserves
Sailors
On-the-fly
Project Iterator
𝜋S.name
𝜋S.name
SQL
⋈R.sid=S.sid Sailors))
Query Optimizer
Relational Algebra
⋈ description of
OperationalR.sid=S.sid
𝜎S.rating>5
On-the-fly
Select Iterator
Indexed Nested
Loop Join Iterator
Operator Code
B+-Tree
Indexed Scan
Iterator
a computation.
𝜎R.bid=100
Sailors
Heap Scan
Iterator
SystemsReserves
optimize and execute
relational algebra query plan.
SQL (Structured Query Language)
• One of most widely
used languages
• Introduced in 1970s
– IBM System R
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
• Key System Features: Why do we like SQL
– Declarative:
• Say what you want, not how to get it
• Enables system to optimize the how
– Limited DSL  formal reasoning and optimization
• Foundation in formal Query Languages
– Relational Calculus
Formal Relational QL’s
• Relational Calculus: (Basis for SQL)
– Describe the result of computation
– Based on first order logic
– Tuple Relational Calculus (TRC)
• {S | S ∈ Sailors ∃R ∈ Reserves
(R.sid = S.sid∧R.bid = 103)}
Are these equivalent?
• Relational Algebra:
Can we go from one to
the other?
– Algebra on sets
– Operational description of transformations
Codd’s Theorem
• Established equivalence in
expressivity between :
– Relational Calculus*
– Relational Algebra
• Why an import result?
Edgar F. “Ted” Codd
(1923 - 2003)
Turing Award 1981
– Connects declarative representation of
queries with operational description
– Constructive: we can compile SQL into
relational algebra
*Domain Independent Relational Calculus
Imaged obtained from https://en.wikipedia.org/wiki/Edgar_F._Codd
Big Picture Overview
Relational Algebra
SQL Query
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
𝜋
(𝜎bid=100⋀rating>5(
Query ParserS.name
Reserves
Optimized (Physical) Query Plan:
Why?
(Logical) Query Plan:
𝜎R.bid=100 ⋀ S.rating > 5
AAdeclarative
declarative
expression
expressionof
ofthe
the
⋈
R.sid=S.sid
query
query
result
result
Reserves
Sailors
On-the-fly
Project Iterator
𝜋S.name
𝜋S.name
SQL
SQL
⋈R.sid=S.sid Sailors))
Query Optimizer
Relational Algebra
⋈ description of
OperationalR.sid=S.sid
𝜎S.rating>5
On-the-fly
Select Iterator
Indexed Nested
Loop Join Iterator
Operator Code
B+-Tree
Indexed Scan
Iterator
a computation.
𝜎R.bid=100
Sailors
Heap Scan
Iterator
SystemsReserves
optimize and execute
relational algebra query plan.
Relational Algebra, Iterators, &
Operator Implementations
R & G, Chapters
• Relational Algebra: 4.1 - 4.2
• Relational  Query Plan: 12
• Select, Project, Join operators: 14
•
Day 2 (2/18/2016)
Relational Algebra Preliminaries
• Algebra of operators on relational instances
𝜋S.name(𝜎R.bid=100 ⋀ S.rating>5(R ⋈R.sid=S.sid S))
– Closed: result is also a relational instance
• Enables rich composition!
– Typed: input schema determines output
– Why is this important?
• Pure relational algebra has set semantics
– No duplicate tuples in a relation instance
– vs. SQL, which has multiset (bag) semantics
– We will relax this in the system discussion
Relational Algebra Operators
Unary Operators: operate on single relation instance
• Projection ( p ): Retains only desired columns (vertical)
• Selection ( s ): Selects a subset of rows (horizontal)
• Renaming ( 𝜌 ): Rename attributes and relations.
Binary Operators: operate on pairs of relation instances
• Union (  ): Tuples in r1 or in r2.
• Intersection ( ∩ ): Tuples in r1 and in r2.
• Set-difference ( — ): Tuples in r1, but not in r2.
• Cross-product (  ): Allows us to combine two relations.
• Joins ( ⋈𝜃 , ⋈ ): Combine relations that satisfy predicates
Projection (p)
Selects a subset of columns (vertical)
psname,rating(S2)
List of Attributes
Relational Instance S2
sid sname
rating
age
sname
age
28
yuppy
9
35.0
yuppy
35.0
31
lubber
8
55.5
lubber
55.5
44
guppy
5
35.0
guppy
35.0
58
rusty
10
35.0
rusty
35.0
• Corresponds to the SELECT list
• Schema determined by schema of attribute list
– Names and types correspond to input attributes
Projection (p)
Selects a subset of columns (vertical)
page(S2)
Relational Instance S2
sid sname
rating
age
Multiset
age
28
yuppy
9
35.0
35.0
31
lubber
8
55.5
55.5
44
guppy
5
35.0
35.0
58
rusty
10
35.0
35.0
Set
age
35.0
55.5
• Set semantics  results in fewer rows
– Real systems don’t automatically remove duplicates
– why?
Selection(𝜎)
Selects a subset of rows (horizontal)
𝜎rating>8(S2)
Selection Condition (Boolean Expression)
Relational Instance S2
sid sname
rating
age
28
yuppy
9
35.0
sid sname
rating
age
31
lubber
8
55.5
28
yuppy
9
35.0
44
guppy
5
35.0
58
rusty
10
35.0
58
rusty
10
35.0
• Corresponds to the WHERE clause
• Output schema same as input
• Duplicate Elimination?
Composing Select and Project
• Names of sailors with rating > 8
pname(𝜎rating>8( S2 ))
sid sname
rating
age
28
yuppy
9
35.0
sid sname
rating age
sname
31
lubber
8
55.5
28
yuppy
9
35.0
yuppy
44
guppy
5
35.0
58
rusty
10
35.0
rusty
58
rusty
10
35.0
𝜎rating>8
• What about:
𝜎rating>8(pname(S2))
pname
Invalid types. Input
to 𝜎rating>8 does not
contain rating.
Union (∪)
S1 ∪ S2
S1
S2
S1 ∪ S2
Two input relations, must be compatible:
• Same number of fields.
• “Corresponding” fields have same type
• SQL Expression: UNION
Union (∪)
S1 ∪ S2
Relational Instance S1
sid sname
rating
age
sid sname
rating
age
22
dustin
7
45.0
22
dustin
7
45
31
lubber
8
55.5
28
yuppy
9
35.0
58
rusty
10
35.0
31
lubber
8
55.5
44
guppy
5
35.0
58
rusty
10
35.0
Relational Instance S2
sid sname
rating
age
28
yuppy
9
35.0
31
lubber
8
55.5
44
guppy
5
35.0
58
rusty
10
35.0
Duplicate elimination in practice?
• UNION vs UNION ALL?
Set Difference ( −
)
S1 − S2
S1
S2
S1 − S2
Same as with union, both input relations must be
compatible.
SQL Expression: EXCEPT
Set Difference ( −
)
S1 − S2
Relational Instance S1
sid sname
rating
age
sid sname
rating
age
22
dustin
7
45.0
22
7
45
31
lubber
8
55.5
58
rusty
10
35.0
dustin
Symmetric?
S2 − S1
Relational Instance S2
sid sname
rating
age
sid sname
rating
age
28
yuppy
9
35.0
28
yuppy
9
35.0
31
lubber
8
55.5
44
guppy
5
35.0
44
guppy
5
35.0
58
rusty
10
35.0
Duplicate elimination?
• Not required
• EXCEPT vs EXCEPT ALL
Monotonicity and Set Difference
• A relational op. Q is monotone in R if:
R1 ⊆ R2 ⇒ Q(R1, S, T, …) ⊆ Q(R2, S, T, …)
• For example UNION (∪) is monotone:
R1 ⊆ R2 ⇒ S ∪ R1 ⊆ S ∪ R2
• Set difference?
Monotonicity and Set Difference
• A relational op. Q is monotone in R if:
R1 ⊆ R2 ⇒ Q(R1, S, T, …) ⊆ Q(R2, S, T, …)
• Set difference (−) where S = {a,b,c}:
?
R1= {a} ⊆ R2 = {a,b} ⇒ S − {a} ⊆ S − {a,b}
{b,c} ⊆ {c}
• Implication: set difference is blocking
– For S – R, need to have full contents of R before
emitting any results
Intersection (∩)
S1 ∩ S2
S1
S2
S1 ∩ S2
Same as with union, both input relations must be
compatible.
SQL Expression: INTERSECT
Intersection (∩)
S1 ∩ S2
Relational Instance S1
sid sname
rating
age
sid sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
31
lubber
8
55.5
58
rusty
10
35.0
58
rusty
10
35.0
Relational Instance S2
sid sname
rating
age
28
yuppy
9
35.0
31
lubber
8
55.5
44
guppy
5
35.0
58
rusty
10
35.0
Is intersection essential?
• Implement it with earlier ops. ?
Intersection (∩)
S1 ∩ S2 = ?
S1
∩
S2
Intersection (∩)
S1
S1 ∩ S2 = S1 – ?
= S1
–
?
∩
S2
Intersection (∩)
S1
∩
S2
S1 ∩ S2 = S1 – (S1 – S2)
= S1
–
(
S1
–
S2
)
Is intersection monotonic?
R1 ⊆ R2
⇒
S ∩ R1 ⊆ S ∩ R2
Yes!
Cross-Product (×)
R1 × S1: Each row of R1 paired with each row of S1
R1:
R1 × S1
sid
bid
day
22
101
10/10/96
58
103
11/12/96
×
S1:
sid
sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
=
4,a 4,b 4,c 4,d
3
3,a 3,b 3,c 3,d
2
2,a 2,b 2,c 2,d
1
1,a 1,b 1,c 1,d
b
c
bid
day
sid
sname
rating
age
22
101
10/10/96
22
dustin
7
45.0
22
101
10/10/96
31
lubber
8
55.5
22
101
10/10/96
58
rusty
10
35.0
58
103
11/12/96
22
dustin
7
45.0
58
103
11/12/96
31
lubber
8
55.5
58
103
11/12/96
58
rusty
10
35.0
How many rows in the result?
Sometimes also called
Cartesian Product: 4
a
sid
d
Schema compatibility?
|R1| * |R2|
No requirements.
One field per field in original schemas.
What about duplicate names?
Renaming
operator
Renaming ( 𝜌 = “rho” )
Renames relations and their attributes:
𝜌( Temp1(1  sid1, 4  sid2), R1 × S1)
Output Relation
Name
Renaming List
position  New Name
Input
Relation
R1 × S1
Temp1
sid
bid
day
sid
sname
rating
age
sid1
bid
day
sid2
sname
rating
age
22
101
10/10/96
22
dustin
7
45.0
22
101
10/10/96
22
dustin
7
45.0
22
101
10/10/96
31
lubber
8
55.5
22
101
10/10/96
31
lubber
8
55.5
22
101
10/10/96
58
rusty
10
35.0
22
101
10/10/96
58
rusty
10
35.0
58
103
11/12/96
22
dustin
7
45.0
58
103
11/12/96
22
dustin
7
45.0
58
103
11/12/96
31
lubber
8
55.5
58
103
11/12/96
31
lubber
8
55.5
58
103
11/12/96
58
rusty
10
35.0
58
103
11/12/96
58
rusty
10
35.0
• Note that relational algebra doesn’t require names.
• We could just use positional arguments. pf5(𝜎f6>f8(S2))
• Difficult to read …
Compound Operator: Join
• Joins are compound operators (like intersection):
– Built using cross product & selection and sometimes
projection (for natural join)
• Hierarchy of common kinds:
– Theta Join ( ⋈𝜃 ): join on logical expression 𝜃
• Equi-Join: theta join with conjunction equalities
– Natural Join ( ⋈ ): equi-join on all matching column names
• Note: we should use a good join algorithm, not a
cross-product if we can avoid it!!
Theta Join (⋈𝜃)
R ⋈𝜃 S = 𝜎𝜃( R × S)
Example: More senior sailors for each sailor.
S1 ⋈ age < age S1
S1:
sid
sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
This predicate doesn’t
make any sense!
Could switch to
positional arguments
Theta Join (⋈𝜃)
R ⋈𝜃 S = 𝜎𝜃( R × S)
Example: More senior sailors for each sailor.
S1 ⋈ f4 < f8 S1
S1:
S1 × S1
S1
S1
f1
f2
f3
f4
f1
f2
f3
f4
f5
f6
f7
f8
22
dustin
7
45.0
22
dustin
7
45.0
22
dustin
7
45.0
31
lubber
8
55.5
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
22
dustin
7
45.0
58
rusty
10
35.0
31
lubber
8
55.5
22
dustin
7
45.0
31
lubber
8
55.5
31
lubber
8
55.5
31
lubber
8
55.5
58
rusty
10
35.0
58
rusty
10
35.0
22
dustin
7
45.0
58
rusty
10
35.0
31
lubber
8
55.5
58
rusty
10
35.0
58
rusty
10
35.0
Difficult to read
 use 𝜌 operator
Theta Join (⋈𝜃)
R ⋈𝜃 S = 𝜎𝜃( R × S)
Example: More senior sailors for each sailor.
S1 ⋈ age < age2 𝜌((8  age2), S1)
S1 × S1
S1:
S1
S1
sid
sname
rating
age
sid
sname
rating
age
sid
sname
rating
age2
22
dustin
7
45.0
22
dustin
7
45.0
22
dustin
7
45.0
31
lubber
8
55.5
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
22
dustin
7
45.0
58
rusty
10
35.0
31
lubber
8
55.5
22
dustin
7
45.0
31
lubber
8
55.5
31
lubber
8
55.5
31
lubber
8
55.5
58
rusty
10
35.0
58
rusty
10
35.0
22
dustin
7
45.0
58
rusty
10
35.0
31
lubber
8
55.5
58
rusty
10
35.0
58
rusty
10
35.0
Theta Join (⋈𝜃)
R ⋈𝜃 S = 𝜎𝜃( R × S)
Example: More senior sailors for each sailor.
S1 ⋈ age < age2 𝜌((8  age2), S1)
S1:
S1
S1
sid
sname
rating
age
sid
sname
rating
age
sid
sname
rating
age2
22
dustin
7
45.0
22
dustin
7
45.0
31
lubber
8
55.5
31
lubber
8
55.5
58
rusty
10
35.0
22
dustin
7
45.0
58
rusty
10
35.0
58
rusty
10
35.0
31
lubber
8
55.5
• Result schema same as that of cross-product.
• Special Case:
• Equi-Join: theta join with conjunction equalities
• Special special case Natural Join …
Natural Join (⋈)
Special case of equi-join in which equalities
are specified for all matching fields and
duplicate fields are projected away
R ⋈ S = 𝜋unique fld. 𝜎eq. matching fld.( R × S )
• Compute R × S
• Select rows where fields appearing in
both relations have equal values
• Project onto the set of all unique fields.
Natural Join (⋈)
R ⋈ S = 𝜋unique fld. 𝜎eq. matching fld.( R × S )
Example:
R1 ⋈ S1
R1:
sid
bid
day
sid
bid
day
sid
sname
rating
age
22
101
10/10/96
22
101
10/10/96
22
dustin
7
45.0
58
103
11/12/96
22
101
10/10/96
31
lubber
8
55.5
22
101
10/10/96
58
rusty
10
35.0
58
103
11/12/96
22
dustin
7
45.0
58
103
11/12/96
31
lubber
8
55.5
58
103
11/12/96
58
rusty
10
35.0
S1:
sid
sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
Natural Join (⋈)
R ⋈ S = 𝜋unique fld. 𝜎eq. matching fld.( R × S )
Example:
R1 ⋈ S1
R1:
sid
bid
day
sid
bid
day
sname
rating
age
22
101
10/10/96
22
101
10/10/96
dustin
7
45.0
58
103
11/12/96
58
103
11/12/96
rusty
10
35.0
S1:
sid
sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
Commonly used for foreign
key joins (as above).
Exercise:
Boats(bid,bname,color)
Sailors(sid, sname, rating, age)
Reserves(sid, bid, day)
Find names of sailors who’ve reserved boat #103
• Solution 1:
𝜋sname( 𝜎bid=103( Sailors ⋈ Reserves) )
• Solution 2:
𝜋sname( Sailors ⋈𝜎bid=103( Reserves ) )
Exercise:
Boats(bid,bname,color)
Sailors(sid, sname, rating, age)
Res(sid, bid, day)
Find names of sailors who’ve reserved a red boat
• Solution 1:
𝜋sname(𝜎color=‘red’( Boats ⋈ Res⋈ Sailors)
)
• More “efficient” Solution 2:
𝜋sname( 𝜋sid( 𝜋bid(𝜎color=‘red’( Boats ) )⋈ Res⋈
) Sailors)
In general many possible equivalent expressions: algebra…
Relational Algebra Rules
• Operator Precedence:
– Unary operators before binary operators
• Selections:
 sc1…cn(R)  sc1(…(scn(R))…) (cascade)
 sc1(sc2(R))  sc2(sc1(R))
(commute)
• Projections:
• pa1(R)  pa1(…(pa1, …, an-1(R))…) (cascade)
• Cartesian Product
– R  (S  T)  (R  S)  T
– RSSR
(associative)
(commutative)
– Applies for joins as well but be careful with join
predicates …
Caution with
Natural Join
Boats(bid,bname,color)
Sailors(sid, sname, rating, age)
Reserves(sid, bid, day)
• Consider the following:
Boats
⋈ Reserves ⋈ Sailors
bid
sid
• Commute and Associate:
Boats
⋈ Sailors ⋈ Reserves
bid
sid
– Incompatible join predicate:
Boats
⋈ Sailors ⋈ Reserves
bid
sid
Boats(bid,bname,color)
Sailors(sid, sname, rating, age)
Reserves(sid, bid, day)
Caution with
Natural Join
• Consider the following:
Boats
⋈ Reserves ⋈ Sailors
bid
sid
• Commute and Associate:
Boats
⋈ Sailors ⋈ Reserves
bid
sid
– Incompatible join predicate:
Boats
×
Sailors
⋈
Reserves
sid,bid
Relational Algebra Rules
Commuting of selection operators
 sc(R  S)  sc(R)  S (c only has fields in R)
 sc(R ⋈ S)  sc(R) ⋈ S (c only has fields in
R)
Commuting of projection operators
• 𝜋a(R  S)  𝜋a1(R)  𝜋a2(S)
– a1 is subset of a that mentions R and a2
is subset of a that mentions S
– Similar result holds for joins
Division (/) Compound Operator
“Find the names of sailors that reserved
all boats.”
• Consider Relations: A(x,y) and B(y)
– A / B = { x | ∀y ∈B ( (x,y)∈A ) }
– all entries x in A such that for all y in B
there is an (x,y) in A
• Pictorially:
How do we implement
division in relational
algebra?
A
x
x1
y
y1
x1
x2
y2
y1
x3
y2
/
B
A/B
y
y1
x
x1
y2
=
Division (/) Compound Operator
• If all the x values in A are in the result
what would A look like?
𝜋x(A) × B =
A
B
x
x1
y
y1
y
y1
x1
x2
x3
y2
y1
y2
y2
x
x1
y
y1
x2
x3
y2
x
x1
x1
x2
y
y1
y2
y1
x2
x3
y2
y1
x3
y2
Division (/) Compound Operator
• If all the x values in A were in the result
what would A look like?
𝜋x(A) × B
• Which of these tuples are we missing
from A:
A
B
x
x1
y
y1
y
y1
x1
x2
x3
y2
y1
y2
y2
Missing
( 𝜋x(A) × B) – A
(x,y) values
x
y
x
y
x1
y1
x1
y2
x1
y1
x2
y1
x1
y2
x2
y2
x2
y1
x3
y1
x3
y2
x3
y2
–
=
x
x2
x3
y
y2
y1
Division (/) Compound Operator
• If all the x values in A were in the result
what would A look like?
𝜋x(A) × B
• Which of these tuples are we missing
from A and what are their x values:
A
B
x
x1
y
y1
y
y1
x1
x2
x3
y2
y1
y2
y2
𝜋x(( 𝜋x(A) × B) – A )
x
x2
x3
y
y2
y1
=
x
x2
x3
Disqualified
x values
Division (/) Compound Operator
• If all the x values in A were in the result
what would A look like?
𝜋x(A) × B
• Which of these tuples are we missing
from A and what are their x values:
A
B
x
x1
y
y1
y
y1
x1
x2
x3
y2
y1
y2
y2
𝜋x(( 𝜋x(A) × B) – A )
• Remove disqualified x values:
A / B = 𝜋x(A) – 𝜋x((𝜋x(A) × B) – A)
Operators in Extended Algebra
• Group By / Aggregation Operator (𝛾 ):
𝛾age, AVG(rating)(Sailors)
– With selection (HAVING clause):
𝛾age, AVG(rating), COUNT(*)>2(Sailors)
• Textbook uses two operators:
GROUP BY age, AVG(rating) (Sailors)
HAVINGCOUNT(*)>2 (GROUP BY age, AVG(rating)(Sailors))
Big Picture Overview
Relational Algebra
SQL Query
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
𝜋
(𝜎bid=100⋀rating>5(
Query ParserS.name
Reserves
Optimized (Physical) Query Plan:
Why?
(Logical) Query Plan:
𝜎R.bid=100 ⋀ S.rating > 5
A declarative
expression of the
⋈R.sid=S.sid
query
result
Reserves
Sailors
On-the-fly
Project Iterator
𝜋S.name
𝜋S.name
SQL
⋈R.sid=S.sid Sailors))
Query Optimizer
Relational Algebra
⋈ description of
OperationalR.sid=S.sid
𝜎S.rating>5
On-the-fly
Select Iterator
Indexed Nested
Loop Join Iterator
Operator Code
B+-Tree
Indexed Scan
Iterator
a computation.
𝜎R.bid=100
Sailors
Heap Scan
Iterator
SystemsReserves
optimize and execute
relational algebra query plan.
Big Picture Overview
Relational Algebra
SQL Query
SELECT S.name
FROM Reserves R, Sailors S
WHERE R.sid = S.sid
AND R.bid = 100
AND S.rating > 5
(𝜎bid=100⋀rating>5(
Reserves
What?
How?
(Logical) Query Plan:
𝜋S.name
SQL
𝜎R.bid=100 ⋀ S.rating > 5
A declarative
expression of the
⋈R.sid=S.sid
query
result
Reserves
𝜋
Query ParserS.name
Sailors
Query Optimizer
⋈R.sid=S.sid Sailors))
Optimized (Physical) Query Plan:
On-the-fly
Project Iterator
𝜋S.name
Relational Algebra
⋈ description of
OperationalR.sid=S.sid
𝜎S.rating>5
On-the-fly
Select Iterator
Indexed Nested
Loop Join Iterator
Operator Code
B+-Tree
Indexed Scan
Iterator
a computation.
𝜎R.bid=100
Sailors
Reserves
Heap Scan
Iterator
Translating SQL to Relational
Algebra
Restrict attention to single (𝜎𝜋×- query)
SELECT S.sid, MIN (R.day)
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = “red”
GROUP BY S.sid
HAVING COUNT (*) >= 2
Select-Project-Join Query
For each sailor with at least two reservations for red boats,
find the sailor id and the earliest date
on which the sailor has a reservation for a red boat.
Translating SQL to Relational
Algebra
Restrict attention to single (𝜎𝜋×- query)
SELECT S.sid, MIN (R.day)
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = “red”
GROUP BY S.sid
HAVING COUNT (*) >= 2
p
Select-Project-Join Query
S.sid, MIN(R.day)
(HAVING
COUNT(*)>2 (
GROUP BY S.Sid (
sB.color = “red” (
Sailors Reserves
Boats))))