Introduction to SQL
Elements and process: relations → database (with XAMPP)
Select-From-Where Statements
Grouping and Aggregation
Slides by Jeff Ullman
(infolab.stanford.edu/~ullman/dscb/pslides/sql1.ppt),
with example modified and some additions
Bettina Berendt, ISI 2015
Last updated 2015-10-21
1
Why SQL?
- SQL is a very-high-level language.
  - Say “what to do” rather than “how to do it.”
  - Avoid a lot of data-manipulation details needed in procedural languages like C++ or Java.
- The database management system figures out the “best” way to execute a query.
  - Called “query optimization.”
2
Select-From-Where Statements
SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of the tables
3
A note about case-sensitivity
- SQL is case-sensitive inside strings.
- MySQL is case-sensitive for database and table names on some operating systems (e.g. Linux).
- It isn’t on Windows.
4
Our Running Example
All our SQL queries will be based on the following database schema:
Session (Month, Year)
Session_Day (Day, Month, Year)
Agenda_Item (Agenda_item_ID, Title, Day, Month, Year, Number)
Speech (Speech_ID, Spoken_text, Language, Video_URI, Agenda_item_ID, Number, MEP_ID)
Parliament_Member (MEP_ID, Date_of_birth, Given_name, Family_name)
Country (Acronym, Name, EU_member_since)
Role (Name)
Political_Institution (Acronym, Institution_label)
Represents (MEP_ID, CountryAcronym)
In_Political_Function (MEP_ID, RoleName, Inst_Acronym, Start_date, End_date)
EU_Party (Inst_Acronym)
National_Party (Inst_Acronym)
EU_Committee (Inst_Acronym)
5
So first, we need to turn these relations into actual tables!
- We will use XAMPP (https://www.apachefriends.org), which installs a web server, a database management system, and a browser-based interface for your databases on your computer. The first is Apache, the second MariaDB (was: MySQL*), and the third phpMyAdmin.
- A short tutorial is here: https://www.siteground.com/tutorials/phpmyadmin/phpmyadmin_create_database.htm
- In the lecture, we will build up the database step by step.
- To load the already-implemented and -populated database:
  1. Download the file eup.sql from Toledo.
  2. Create a database named (e.g.) eup.
  3. Go to Import, choose the file eup.sql, click OK.
* For our purposes, the same. See http://programmers.stackexchange.com/questions/120178/whats-the-difference-between-mariadb-and-mysql
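As a rough sketch of what "turning relations into tables" can look like, the block below creates three of the relations from the running example. It uses Python's built-in sqlite3 module as a stand-in for MariaDB so that it runs without XAMPP. The column types and the primary and foreign keys are assumptions, since the schema above lists attribute names only.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MariaDB.
# Assumption: the column types and keys below are guesses, not part of the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Session (
    Month INTEGER,
    Year  INTEGER,
    PRIMARY KEY (Month, Year)
);
CREATE TABLE Session_Day (
    Day   INTEGER,
    Month INTEGER,
    Year  INTEGER,
    PRIMARY KEY (Day, Month, Year),
    FOREIGN KEY (Month, Year) REFERENCES Session (Month, Year)
);
CREATE TABLE Agenda_Item (
    Agenda_item_ID INTEGER PRIMARY KEY,
    Title  TEXT,
    Day    INTEGER,
    Month  INTEGER,
    Year   INTEGER,
    Number INTEGER,
    FOREIGN KEY (Day, Month, Year) REFERENCES Session_Day (Day, Month, Year)
);
""")

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

In phpMyAdmin the same CREATE TABLE statements can be typed into the SQL tab, with MariaDB types such as INT and VARCHAR in place of SQLite's INTEGER and TEXT.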
6
Example
- Using Agenda_Item (Agenda_item_ID, Title, Day, Month, Year, Number), what titles were discussed in 2012?
SELECT Title
FROM Agenda_Item
WHERE Year=2012;
7
Result of Query
The answer is a relation with a single attribute,
Title, and tuples with the title of each agenda item
discussed in 2012.
8
Meaning of Single-Relation Query (“Formal Semantics”)
- Begin with the relation in the FROM clause.
- Apply the selection indicated by the WHERE clause.
- Apply the extended projection indicated by the SELECT clause.
9
Operational Semantics
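[Figure: a tuple variable tv ranges over the tuples of the relation (columns Title, Year, …). For the tuple (Composition <etc.>, 2012): check if Year is 2012; if so, include tv.Title in the result.]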
10
Operational Semantics
- To implement this algorithm, think of a tuple variable ranging over each tuple of the relation mentioned in FROM.
- Check if the “current” tuple satisfies the WHERE clause.
- If so, compute the attributes or expressions of the SELECT clause using the components of this tuple.
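The three steps above can be sketched as a plain loop. This is only an illustration of the semantics, not how a database actually executes the query; the sample tuples are hypothetical and reduced to (Title, Year).

```python
# Hypothetical sample tuples of Agenda_Item, reduced to (Title, Year).
agenda_item = [
    ("Composition of committees", 2012),
    ("Order of business", 2011),
    ("Composition of Parliament", 2012),
]

result = []
for tv in agenda_item:        # tuple variable ranging over the FROM relation
    title, year = tv
    if year == 2012:          # WHERE Year = 2012
        result.append(title)  # SELECT Title

print(result)
```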
11
* In SELECT clauses
- When there is one relation in the FROM clause, * in the SELECT clause stands for “all attributes of this relation.”
- Example using Agenda_Item:
SELECT *
FROM Agenda_Item
WHERE Year=2012;
12
Result of Query:
Now, the result has each of the attributes of Agenda_Item.
13
Complex Conditions in WHERE Clause
- From Agenda_Item, find the titles of agenda items in February 2012:
SELECT Title
FROM Agenda_Item
WHERE Year=2012
AND Month=02;
14
Patterns
- WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches.
- General form:
  <Attribute> LIKE <pattern> or
  <Attribute> NOT LIKE <pattern>
- Pattern is a quoted string with % = “any string”; _ = “any single character.”
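A minimal sketch of the two wildcards, again using Python's sqlite3 module as a stand-in for MariaDB. The titles are hypothetical, and the helper `titles` is introduced only for this illustration. Note one assumption about dialects: SQLite's LIKE ignores case for ASCII letters by default, whereas in MySQL/MariaDB this depends on the column's collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Agenda_Item (Title TEXT)")
conn.executemany("INSERT INTO Agenda_Item VALUES (?)", [
    ("Composition of Parliament",),   # hypothetical titles
    ("Composition of committees",),
    ("Order of business",),
])

def titles(pattern, negate=False):
    """Return the titles that match (or do not match) the LIKE pattern."""
    op = "NOT LIKE" if negate else "LIKE"
    rows = conn.execute(
        f"SELECT Title FROM Agenda_Item WHERE Title {op} ? ORDER BY Title",
        (pattern,))
    return [row[0] for row in rows]

contains = titles("%Composition%")                   # % = any string
second_letter_r = titles("_r%")                      # _ = exactly one character
not_matching = titles("%Composition%", negate=True)  # NOT LIKE

print(contains, second_letter_r, not_matching)
```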
15
Example
- From Agenda_Item, find the agenda items whose title includes “Composition”:
SELECT Title, Month, Year
FROM Agenda_Item
WHERE Title LIKE "%Composition%";
Note that SQL is case-sensitive inside strings, and MySQL is case-sensitive on some operating systems.
16
NULL Values
- Tuples in SQL relations can have NULL as a value for one or more components.
- Meaning depends on context. Two common cases:
  - Missing value: e.g., we know Louise Weiss has some birth date, but we don’t know what it is.
  - Inapplicable: e.g., the value of attribute spouse for an unmarried person.
17
Comparing NULL’s to Values
- The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN.
- When any value is compared with NULL, the truth value is UNKNOWN.
- But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).
18
Three-Valued Logic
- To understand how AND, OR, and NOT work in 3-valued logic, think of TRUE = 1, FALSE = 0, and UNKNOWN = ½.
- AND = MIN; OR = MAX; NOT(x) = 1-x.
- Example:
  TRUE AND (FALSE OR NOT(UNKNOWN))
  = MIN(1, MAX(0, (1 - ½)))
  = MIN(1, MAX(0, ½)) = MIN(1, ½) = ½.
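The numeric encoding above can be checked directly. The sketch below is an illustration in Python; the function names `and3`, `or3` and `not3` are made up for this example.

```python
from fractions import Fraction

# Encoding from the slide: TRUE = 1, FALSE = 0, UNKNOWN = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)

def and3(x, y):
    return min(x, y)   # AND = MIN

def or3(x, y):
    return max(x, y)   # OR = MAX

def not3(x):
    return 1 - x       # NOT(x) = 1 - x

# TRUE AND (FALSE OR NOT(UNKNOWN))
value = and3(TRUE, or3(FALSE, not3(UNKNOWN)))
print(value == UNKNOWN)
```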
19
Surprising Example
- From the following Sells relation:

  bar        beer  price
  Joe’s Bar  Bud   NULL

SELECT bar
FROM Sells
WHERE price < 2.00 OR price >= 2.00;

Each of the two comparisons is UNKNOWN, so their OR is UNKNOWN as well, and Joe’s Bar is not in the result.
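This behaviour can be reproduced in a few lines. The sketch uses Python's sqlite3 module as a stand-in for MariaDB, on the assumption that both treat comparisons with NULL in this way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', NULL)")

# Looks like it should match every row, but both comparisons are UNKNOWN.
either = conn.execute(
    "SELECT bar FROM Sells WHERE price < 2.00 OR price >= 2.00").fetchall()

# NULL has to be tested explicitly with IS NULL.
is_null = conn.execute(
    "SELECT bar FROM Sells WHERE price IS NULL").fetchall()

print(either, is_null)
```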
20
Multirelation Queries
- Interesting queries often combine data from more than one relation.
- We can address several relations in one query by listing them all in the FROM clause.
- Distinguish attributes of the same name by “<relation>.<attribute>”
21
Example
- Using relations Parliament_Member and In_Political_Function, find the names of MEPs who are chairs (or co-chairs, vice-chairs, …) of some institution.
SELECT Given_name, Family_name, Inst_Acronym, RoleName
FROM Parliament_Member, In_Political_Function
WHERE Parliament_Member.MEP_ID = In_Political_Function.MEP_ID
  AND RoleName LIKE "%chair";
22
Result
23
Formal Semantics
- Almost the same as for single-relation queries:
  1. Start with the product of all the relations in the FROM clause.
  2. Apply the selection condition from the WHERE clause.
  3. Project onto the list of attributes and expressions in the SELECT clause.
24
Operational Semantics
- Imagine one tuple-variable for each relation in the FROM clause.
  - These tuple-variables visit each combination of tuples, one from each relation.
- If the tuple-variables are pointing to tuples that satisfy the WHERE clause, send these tuples to the SELECT clause.
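The tuple-variable picture corresponds to nested loops, one per relation. The sketch below illustrates the semantics only; a real DBMS would choose a more efficient plan. Apart from the values 1003, Pisoni, PPE and vice-chair, the rows are hypothetical, and the relations are reduced to the attributes needed here.

```python
# (MEP_ID, Family_name)
parliament_member = [
    (1003, "Pisoni"),
    (1004, "Doe"),               # hypothetical row
]
# (MEP_ID, Inst_Acronym, RoleName)
in_political_function = [
    (1003, "PPE", "vice-chair"),
    (1004, "PPE", "member"),     # hypothetical row
]

result = []
for tv1 in parliament_member:            # one tuple-variable per relation
    for tv2 in in_political_function:    # visits every combination of tuples
        same_mep = tv1[0] == tv2[0]          # Parliament_Member.MEP_ID = In_Political_Function.MEP_ID
        is_chair = tv2[2].endswith("chair")  # RoleName LIKE "%chair"
        if same_mep and is_chair:
            result.append((tv1[1], tv2[1], tv2[2]))  # send to SELECT

print(result)
```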
25
Example
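[Figure: tuple variable tv1 points at a Parliament_Member tuple (MEP_ID 1003, Family_name Pisoni) and tv2 at an In_Political_Function tuple (MEP_ID 1003, Inst_Acronym PPE, Role vice-chair). Check that the two MEP_ID values are equal and check the role; if both hold, the tuple goes to the output.]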
26
Joins with more tables: In what languages
do these people give speeches?
27
Explicit Tuple-Variables
- Sometimes, a query needs to use two copies of the same relation.
- Distinguish copies by following the relation name by the name of a tuple-variable, in the FROM clause.
- It’s always an option to rename relations this way, even when not essential.
28
Example
- From Speech, find all pairs of speeches by the same MEP.
  - Do not produce pairs like (1,1).
  - Produce pairs in ascending order, e.g. (1,2), not (2,1).
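One possible way to write this query is a self-join with two tuple-variables, s1 and s2. This is a sketch and not necessarily the query shown in the lecture. It runs on Python's sqlite3 module as a stand-in for MariaDB, with hypothetical rows and Speech reduced to two attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Speech (Speech_ID INTEGER PRIMARY KEY, MEP_ID INTEGER)")
conn.executemany("INSERT INTO Speech VALUES (?, ?)",
                 [(1, 1003), (2, 1003), (3, 1004), (4, 1003)])  # hypothetical rows

pairs = conn.execute("""
    SELECT s1.Speech_ID, s2.Speech_ID
    FROM Speech s1, Speech s2            -- two copies of the same relation
    WHERE s1.MEP_ID = s2.MEP_ID          -- same MEP
      AND s1.Speech_ID < s2.Speech_ID    -- no (1,1), and (1,2) but not (2,1)
    ORDER BY s1.Speech_ID, s2.Speech_ID
""").fetchall()

print(pairs)
```

The strict `<` does both jobs at once: it drops pairs of a speech with itself and keeps only one ordering of each pair.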
29
Controlling Duplicate Elimination
- Force the result to be a set by SELECT DISTINCT . . .
- Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in . . . UNION ALL . . .
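A small sketch of the difference between sets and bags, using Python's sqlite3 module as a stand-in for MariaDB. The rows are hypothetical and Speech is reduced to two attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Speech (Speech_ID INTEGER, Language TEXT)")
conn.executemany("INSERT INTO Speech VALUES (?, ?)",
                 [(1, "en"), (2, "en"), (3, "fr")])  # hypothetical rows

# A plain SELECT returns a bag: 'en' appears twice.
bag = conn.execute(
    "SELECT Language FROM Speech ORDER BY Language").fetchall()

# DISTINCT forces a set.
as_set = conn.execute(
    "SELECT DISTINCT Language FROM Speech ORDER BY Language").fetchall()

# UNION eliminates duplicates, UNION ALL keeps them.
union_set = conn.execute(
    "SELECT Language FROM Speech UNION SELECT Language FROM Speech").fetchall()
union_bag = conn.execute(
    "SELECT Language FROM Speech UNION ALL SELECT Language FROM Speech").fetchall()

print(bag, as_set, len(union_set), len(union_bag))
```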
30
Example: DISTINCT
(What does this mean? What would you get without DISTINCT?)
31
Aggregations
- SUM, AVG, COUNT, MIN, and MAX can be applied to a column in a SELECT clause to produce that aggregation on the column.
- Also, COUNT(*) counts the number of tuples.
32
Example: Aggregation
- From Session, find the average month:
SELECT AVG(Month)
FROM Session;
Why?
33
Eliminating Duplicates in an Aggregation
- Use DISTINCT inside an aggregation.
- Example: Session_Day has 4 days in January 2012 and 1 in February 2012.
- What is the result of these queries?
SELECT count( * ) FROM Session_Day;
SELECT count( Month ) FROM Session_Day;
SELECT count( DISTINCT Month ) FROM Session_Day;
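To check your answer, the three queries can be run against a small copy of the table. The sketch uses Python's sqlite3 module as a stand-in for MariaDB; the concrete day numbers are hypothetical, only the counts per month follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Session_Day (Day INTEGER, Month INTEGER, Year INTEGER)")
conn.executemany("INSERT INTO Session_Day VALUES (?, ?, ?)", [
    (16, 1, 2012), (17, 1, 2012), (18, 1, 2012), (19, 1, 2012),  # 4 days in January
    (1, 2, 2012),                                                # 1 day in February
])

def scalar(query):
    """Run a query that returns a single value and return that value."""
    return conn.execute(query).fetchone()[0]

all_rows = scalar("SELECT count(*) FROM Session_Day")                       # 5 tuples
months = scalar("SELECT count(Month) FROM Session_Day")                     # 5 non-NULL values
distinct_months = scalar("SELECT count(DISTINCT Month) FROM Session_Day")   # 2 different months

print(all_rows, months, distinct_months)
```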
34
NULL’s Ignored in Aggregation
- NULL never contributes to a sum, average, or count, and can never be the minimum or maximum of a column.
- But if there are no non-NULL values in a column, then the result of the aggregation is NULL (except for COUNT, whose result is 0).
35
Example: Effect of NULL’s
- An incomplete version of the database has:
  - SELECT count(*) FROM `speech` → 628
  - SELECT count(agenda_item_ID) FROM `speech` → 0
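The same effect can be seen on a tiny table. The sketch below uses Python's sqlite3 module as a stand-in for MariaDB; the three rows are hypothetical, with Agenda_item_ID always NULL and Number NULL in one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Speech (Speech_ID INTEGER, Agenda_item_ID INTEGER, Number INTEGER)")
conn.executemany("INSERT INTO Speech VALUES (?, ?, ?)", [
    (1, None, 10),
    (2, None, None),   # NULL in Number: ignored by the aggregations below
    (3, None, 30),
])

row = conn.execute("""
    SELECT count(*),                -- counts tuples, NULLs included
           count(Number),           -- counts non-NULL values only
           sum(Number),
           avg(Number),
           min(Number),
           count(Agenda_item_ID),   -- no non-NULL values: COUNT gives 0
           sum(Agenda_item_ID)      -- no non-NULL values: SUM gives NULL
    FROM Speech
""").fetchone()

print(row)
```

Python shows the SQL NULL as `None`.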
36
Grouping
- We may follow a SELECT-FROM-WHERE expression by GROUP BY and a list of attributes.
- The relation that results from the SELECT-FROM-WHERE is grouped according to the values of all those attributes, and any aggregation is applied only within each group.
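A minimal grouping sketch: counting speeches per language. This is one possible example and not necessarily the one shown in the lecture. It uses Python's sqlite3 module as a stand-in for MariaDB, with hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Speech (Speech_ID INTEGER, Language TEXT)")
conn.executemany("INSERT INTO Speech VALUES (?, ?)",
                 [(1, "en"), (2, "en"), (3, "fr")])  # hypothetical rows

per_language = conn.execute("""
    SELECT Language, count(*)   -- count(*) is computed within each group
    FROM Speech
    GROUP BY Language           -- one group per distinct Language value
    ORDER BY Language
""").fetchall()

print(per_language)
```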
37
Example: Grouping
38
Grouping over >1 table
39
Grouping over >1 table, sorted
(Note: ORDER BY also for other attributes)
40
Restriction on SELECT Lists With Aggregation
- If any aggregation is used, then each element of the SELECT list must be either:
  1. Aggregated, or
  2. An attribute on the GROUP BY list.
41
HAVING Clauses
- HAVING <condition> may follow a GROUP BY clause.
- If so, the condition applies to each group, and groups not satisfying the condition are eliminated.
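A minimal HAVING sketch: keep only MEPs with at least two speeches. This is one possible example and not necessarily the one shown in the lecture. It uses Python's sqlite3 module as a stand-in for MariaDB, with hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Speech (Speech_ID INTEGER, MEP_ID INTEGER)")
conn.executemany("INSERT INTO Speech VALUES (?, ?)",
                 [(1, 1003), (2, 1003), (3, 1004), (4, 1003)])  # hypothetical rows

frequent_speakers = conn.execute("""
    SELECT MEP_ID, count(*)
    FROM Speech
    GROUP BY MEP_ID
    HAVING count(*) >= 2    -- applied per group, after grouping
    ORDER BY MEP_ID
""").fetchall()

print(frequent_speakers)
```

A WHERE clause could not express this condition, because WHERE is checked per tuple before the groups are formed.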
42
Example: HAVING
43
Requirements on HAVING Conditions
- These conditions may refer to any relation or tuple-variable in the FROM clause.
- They may refer to attributes of those relations, as long as the attribute makes sense within a group; i.e., it is either:
  1. A grouping attribute, or
  2. Aggregated.
44
Reading
Book III,
- Ch.1, 207-213 about basics (literal values, variables, functions)
- Ch.2, 231-234, 236-240, 248-255 about SELECT statements
- Ch.4, 308-313 about SELECT statements on JOINed tables.
45
Next week
- More on SQL!
46