Slides-10

Retrievals & Projections
Objectives of the Lecture :
• To consider retrieval actions from a DB;
• To consider using relational algebra for defining relations;
• To consider the Project operator and its use in SQL retrievals.
Using a Relational DB
There are two parts to using a DB :
1. defining the relation(s) to be used;
2. specifying the action to be taken with it/them.
[Figure, shown once for each part : an example relation]
A   B   C
1   5   9
2   6   10
3   7   11
4   8   12
Actions on a DB

Actions are expressed in SQL statements.

The keyword or key phrase that starts an SQL statement
determines what the statement’s action is.
Examples : Create (Table), Alter (Table), Insert (Into),
Delete (From), Update, Commit and Rollback.


Data retrieval is another action. It is the raison d’être of any DB.
There would be little point in a DB whose data could not be
retrieved.
The SQL keyword that starts a retrieval statement is Select.
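As a minimal sketch of how these actions fit together, the following Python script runs one statement per keyword against an in-memory DB. It assumes Python's standard sqlite3 module rather than the DBMS used on this course, and the table R and its values are purely illustrative.

```python
import sqlite3

# In-memory DB, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Create (Table) : define a relation.
conn.execute("Create Table R (A Integer, B Integer, C Integer)")

# Insert (Into) : add tuples to it.
conn.executemany(
    "Insert Into R (A, B, C) Values (?, ?, ?)",
    [(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)],
)

# Commit : make the inserted tuples permanent.
conn.commit()

# Delete (From) followed by Rollback : the deletion is undone.
conn.execute("Delete From R")
conn.rollback()

# Select : retrieve the data (sorted in Python to fix the display order).
rows = sorted(conn.execute("Select A, B, C From R").fetchall())
print(rows)
conn.close()
```

Because the Delete is rolled back, the final Select still retrieves all four tuples.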
Retrieval Actions

The SQL Select statement must comprise the following two
phrases :
Select …….
From ……. ;
Optionally it may contain other phrases.
Retrievals are sometimes known as Queries, because the data
retrieved can be thought of as the answer to a question.
Example : “Which employees are married ?”
Data from the DB provides the answer.


On this course, the DBMS is set up so that it always retrieves
data to our computer screen, where we can see it.
(However, SQL Select does permit retrievals to other locations.)
Defining Relations for Actions

Relational algebra or relational calculus languages are used to
define a relation that is to be acted on.

A defined relation may be :
• a part of a DB relation,
• a merger of 2 or more DB relations,
• part of a ‘merged’ relation.

Relational algebra / calculus language can provide the great
power and flexibility, together with conceptual simplicity,
needed to define ‘relations-to-be-acted-on’.

SQL is a mixture of relational algebra and calculus plus other ‘ad
hocery’ ⇒ needlessly complex.
Relational Algebra (1)

The SQL language is based on a mixture of relational algebra,
relational calculus, and ad hoc peculiarities of its own.

Relational algebra is simpler than SQL.

Therefore for simplicity, this course follows the tradition of
using relational algebra to define the relation to be retrieved. It
is then expressed in SQL for execution.

Relational algebra consists of :
• a number of relational operators; a monadic operator operates
on one operand, i.e. one relation, to produce a single relation
as a result; a dyadic operator operates on two operands to
produce a single result relation.
• a way of combining relational operators together to form a
relational expression.
Relational Algebra (2)
Relational algebra is based on the same concepts as arithmetic
algebra.

Intuitively useful operators :
Arithmetic examples : plus, minus.
Relational examples : Project – pick out attributes;
Restrict – pick out tuples.
“Closure Under the Algebra”
In 3 - ((5 × 6) + 2),
(5 × 6) results in a number, 30; so the next calculation is
(30 + 2), giving 32; the last calculation is (3 - 32), giving -29.
Each operator generates another number, i.e. closure.
Similarly each relational operator generates another relation.
In both cases, arbitrarily complex formulae can be built up.
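The parallel between the two kinds of closure can be sketched in Python. The modelling of a relation as a (heading, body) pair and the helper name project are assumptions made for illustration only; they are not part of SQL or of any library.

```python
def project(relation, wanted):
    """Keep only the `wanted` attributes of a (heading, body) relation."""
    heading, body = relation
    positions = [heading.index(name) for name in wanted]
    # The body is a set, so the result is again a set of tuples.
    return tuple(wanted), {tuple(row[i] for i in positions) for row in body}

# Arithmetic closure : every intermediate result is a number.
inner = 5 * 6
middle = inner + 2
outer = 3 - middle

# Relational closure : every intermediate result is a relation,
# so one operator's result can be the next operator's operand.
r = (("A", "B", "C"), {(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)})
nested = project(project(r, ["A", "B"]), ["A"])

print(outer)
print(nested)
```

The nested call works only because project returns something of the same kind as its operand, which is exactly what closure means.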
Designing Retrievals

Designing a retrieval means defining the relation to be retrieved,
since it is only necessary to prefix the SQL definition with Select
for it to be retrieved.

As each relational algebra operator encapsulates an intuitively
useful concept, with a universally accepted name for referencing
it, the following method is used to learn how to write queries :
• Learn a number of individual algebra operators and how to
express them in SQL.
• Learn how to combine operators to create powerful
expressions, and how to write the expressions in SQL.

SQL uses a standard fixed format for writing expressions. This
is straightforward for simple and reasonable queries, but can be
constraining for complex queries.
Example of Projection (1)
R
A   B   C   D
4   5   1   4
6   3   2   7
4   8   1   5
3   9   7   6
4   5   2   8
6   1   2   2

R Project[ B, D ]
B   D
5   4
3   7
8   5
9   6
5   8
1   2
Example of Projection (2)
R
A   B   C   D
4   5   1   4
6   3   2   7
4   8   1   5
3   9   7   6
4   5   2   8
6   1   2   2

R Project[ A, C ]
A   C
4   1
6   2
3   7
4   2
(The would-be duplicates of tuples (4, 1) and (6, 2) have been removed.)
Definition of ‘Project’
1. Creates a new relation containing only the specified attributes.
2. Any would-be duplicate tuples are removed, to ensure the result
is a set of tuples.
3. The result’s attribute names are those of the operand’s attribute
names specified in the parameter to the Project operation.
Note : if the specified attribute(s) include a candidate key, then
there can be no duplicate tuples to remove.
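The three points of the definition can be sketched in Python, using relation R from the two projection examples. The (heading, body) modelling and the helper name project are illustrative assumptions, not an implementation taken from any DBMS.

```python
def project(relation, wanted):
    """Project a (heading, body) relation onto the `wanted` attributes."""
    heading, body = relation
    positions = [heading.index(name) for name in wanted]
    # 1. Only the specified attributes are kept.
    # 2. The body is built as a set, so would-be duplicates collapse.
    new_body = {tuple(row[i] for i in positions) for row in body}
    # 3. The result's attribute names are those given in the parameter.
    return tuple(wanted), new_body

R = (
    ("A", "B", "C", "D"),
    {(4, 5, 1, 4), (6, 3, 2, 7), (4, 8, 1, 5),
     (3, 9, 7, 6), (4, 5, 2, 8), (6, 1, 2, 2)},
)

bd = project(R, ["B", "D"])   # no would-be duplicates arise
ac = project(R, ["A", "C"])   # (4, 1) and (6, 2) would each occur twice

print(len(bd[1]), len(ac[1]))
```

R has six tuples; the projection on B and D keeps six, while the projection on A and C keeps only four.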
Characteristics of Projection


There is no need to know about any constraints on the operand
relation in order to be able to do a projection.
Boundary Cases :
• If all the attributes are specified, then
result ≡ operand
• If no attributes are specified, then
result ≡ nullary relation
i.e. result has no attributes.
In principle, the unwanted attributes can be specified instead,
i.e. those to be removed.
Example : in relation R( A, B, C, D )
R Project[ ~ B, D ] ≡ R Project[ A, C ]
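The boundary cases and the ‘unwanted attributes’ form can be checked with a small Python sketch. The (heading, body) modelling and the helper names project and project_away are assumptions made for illustration only.

```python
def project(relation, wanted):
    """Project a (heading, body) relation onto the `wanted` attributes."""
    heading, body = relation
    positions = [heading.index(name) for name in wanted]
    return tuple(wanted), {tuple(row[i] for i in positions) for row in body}

def project_away(relation, unwanted):
    """Project by naming the attributes to be removed instead."""
    heading, _ = relation
    wanted = [name for name in heading if name not in unwanted]
    return project(relation, wanted)

R = (
    ("A", "B", "C", "D"),
    {(4, 5, 1, 4), (6, 3, 2, 7), (4, 8, 1, 5),
     (3, 9, 7, 6), (4, 5, 2, 8), (6, 1, 2, 2)},
)

same_as_operand = project(R, ["A", "B", "C", "D"])   # all attributes
nullary = project(R, [])                             # no attributes
complement = project_away(R, ["B", "D"])             # R Project[ ~ B, D ]

print(same_as_operand == R)
print(nullary)
print(complement == project(R, ["A", "C"]))
```

In this modelling the nullary result has an empty heading, and because R is not empty its body holds the single empty tuple.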
SQL : Projection
Principles :

Put the operand relation’s name in the From phrase.

Put the attributes to be projected out in the Select phrase.

This SQL statement also retrieves the newly created relation
from the DB.
Examples :
SQL equivalent of the 2 example projections on relation ‘R’ :
Select B, D
From R ;

Select A, C
From R ;
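To try the two statements, the following Python sketch loads relation R into an in-memory DB and runs them. It assumes Python's standard sqlite3 module rather than the course DBMS, and the helper name retrieve is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table R (A Integer, B Integer, C Integer, D Integer)")
conn.executemany(
    "Insert Into R Values (?, ?, ?, ?)",
    [(4, 5, 1, 4), (6, 3, 2, 7), (4, 8, 1, 5),
     (3, 9, 7, 6), (4, 5, 2, 8), (6, 1, 2, 2)],
)

def retrieve(statement):
    """Run a Select statement; return its column names and its rows."""
    cursor = conn.execute(statement)
    names = [column[0] for column in cursor.description]
    return names, cursor.fetchall()

# The operand relation goes in the From phrase,
# the attributes to be projected out go in the Select phrase.
bd_names, bd_rows = retrieve("Select B, D From R")
ac_names, ac_rows = retrieve("Select A, C From R")

print(bd_names, bd_rows)
print(ac_names, ac_rows)
```

Each statement returns one row for every tuple of R, so both results contain six rows.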
SQL : Duplicate Rows

SQL does not remove duplicate rows automatically.
Problem !

SQL must be instructed to remove duplicates, by inserting
the Distinct keyword between the Select keyword and
the list of attribute names.
Examples :
Select Distinct B, D
From R ;
As there were no duplicate tuples, Distinct is strictly unnecessary;
but it is still a correct statement.

Select Distinct A, C
From R ;
This does a real projection; the earlier version did not.
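The effect of Distinct can be seen by running the A, C retrieval both ways. This is a sketch using Python's standard sqlite3 module (an assumption; the course DBMS may differ), with relation R from the projection examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table R (A Integer, B Integer, C Integer, D Integer)")
conn.executemany(
    "Insert Into R Values (?, ?, ?, ?)",
    [(4, 5, 1, 4), (6, 3, 2, 7), (4, 8, 1, 5),
     (3, 9, 7, 6), (4, 5, 2, 8), (6, 1, 2, 2)],
)

# Without Distinct : duplicate rows are retained.
plain_rows = conn.execute("Select A, C From R").fetchall()

# With Distinct : duplicate rows are removed, giving a true projection.
distinct_rows = conn.execute("Select Distinct A, C From R").fetchall()

print(len(plain_rows), sorted(plain_rows))
print(len(distinct_rows), sorted(distinct_rows))
```

The plain version returns six rows, with (4, 1) and (6, 2) each appearing twice; the Distinct version returns the four tuples of R Project[ A, C ].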
SQL : Removing Duplicate Rows

SQL was not designed to remove duplicate tuples automatically
because, originally, this was thought to need too much
computing power.

This issue is now contentious :
1. Computers are now much more powerful.
2. Much more serious performance optimisation problems
can be created by retaining duplicates than by removing
them.
Advice :
• Always initially insert Distinct, unless a candidate key in the
result ensures no problems of duplicates.
• IF performance cannot then be made fast enough
THEN decide whether performance or lack of duplicates is
more important, and choose the preferred option.
SQL : Projection Characteristics

To project out all the attributes in a relation,
use * in the Select phrase.
Example :
Select *
From R ;
The result is the same as relation ‘R’.
(Thus a retrieval of a whole table is in fact a retrieval of a
projection of all the attributes).

Nullary relations.
These are not permitted in standard SQL. If nothing is put between
the Select and From keywords, a syntax error will normally be issued.

SQL does not allow the specification of unwanted attributes
in the Select phrase.
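These three characteristics can be tried out in a short Python sketch. It assumes Python's standard sqlite3 module rather than the course DBMS, and the workaround at the end (computing the wanted attribute list in the host language) is an illustration, not a feature of SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table R (A Integer, B Integer, C Integer, D Integer)")
rows_of_r = [(4, 5, 1, 4), (6, 3, 2, 7), (4, 8, 1, 5),
             (3, 9, 7, 6), (4, 5, 2, 8), (6, 1, 2, 2)]
conn.executemany("Insert Into R Values (?, ?, ?, ?)", rows_of_r)

# Select * projects out every attribute, so the result matches R itself.
star_cursor = conn.execute("Select * From R")
all_names = [column[0] for column in star_cursor.description]
star_rows = star_cursor.fetchall()

# An empty Select phrase is rejected by SQLite as a syntax error.
try:
    conn.execute("Select From R")
    empty_select_error = None
except sqlite3.OperationalError as error:
    empty_select_error = str(error)

# Unwanted attributes cannot be named in the Select phrase, so the
# wanted list is worked out here and then written into the statement.
unwanted = {"B", "D"}
wanted = [name for name in all_names if name not in unwanted]
statement = "Select Distinct " + ", ".join(wanted) + " From R"
complement_rows = conn.execute(statement).fetchall()

print(sorted(star_rows) == sorted(rows_of_r))
print(empty_select_error)
print(statement, sorted(complement_rows))
```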