Database Management Systems

Chapter 4 (Queries)
➢ Why Do We Need Queries?
o Natural languages (such as English) are too ambiguous.
o We need a query system with more structure.
o We need a standardized system so users and developers can
learn one method that works on any (most) systems.
o SQL: Structured Query Language.
o Query By Example (QBE): a method to help beginners create
SQL queries.
o These visually oriented tools let users select items from lists and
handle the syntax details, which makes it easier to create ad hoc
queries (you still need to learn the SQL commands).
➢ Three Tasks of a Query Language
o To create a database and build an application, you need to perform
three basic sets of tasks:
1. Define the database.
2. Change the data.
3. Retrieve data.
The commands are grouped as:
1. Data definition language (DDL): commands used to define the data tables
and other features of the database (ALTER, CREATE, and DROP).
2. Data manipulation language (DML): commands used to modify the
data (DELETE, INSERT, and UPDATE); SELECT is used to retrieve
data.
Database management systems are driven by query systems.
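The three tasks can be sketched end to end. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in DBMS; the shortened Animal table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define a table (column list shortened for this sketch).
conn.execute(
    "CREATE TABLE Animal (AnimalID INTEGER PRIMARY KEY, Name TEXT, Category TEXT)"
)

# DML: change the data (hypothetical rows).
conn.execute("INSERT INTO Animal VALUES (1, 'Rex', 'Dog')")
conn.execute("INSERT INTO Animal VALUES (2, 'Tom', 'Cat')")
conn.execute("UPDATE Animal SET Name = 'Max' WHERE AnimalID = 1")
conn.execute("DELETE FROM Animal WHERE AnimalID = 2")

# Retrieve the data with SELECT.
rows = conn.execute("SELECT AnimalID, Name, Category FROM Animal").fetchall()
print(rows)
```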
➢ Four Questions to Retrieve Data (To create a query)
1. What output do you want to see?
2. What do you already know (or what constraints are given)?
3. What tables are involved?
4. How are the tables joined together?
Note: the columns used to join tables are not required to have the same name.
Note: the reason for splitting a category out into its own table is to keep
additional information about that category.
Query By Example & SQL
List all animals with yellow in their color.
(Since there is only one table, only three questions need to be answered.)
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
What to see?      SELECT AnimalID, Category, Breed, Color
What tables?      FROM Animal
What conditions?  WHERE (Color LIKE '%Yellow%');
QBE grid:
Field     AnimalID  Category  Breed   Color
Table     Animal    Animal    Animal  Animal
Sort
Criteria                              Like '%Yellow%'
Or
To match the animal regardless of where the word yellow is located, you
need to use the LIKE pattern-matching function. By entering the condition
LIKE '%Yellow%', you are asking the query system to match the word
yellow anywhere in the list (with any number of characters before or after
the word).
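The pattern match can be tried on a small table. This is a minimal sketch, assuming Python's built-in sqlite3 module and three invented rows. Note that SQLite's LIKE ignores case for ASCII letters, which may differ from other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Animal (AnimalID INTEGER, Category TEXT, Breed TEXT, Color TEXT)"
)
# Hypothetical sample rows, not taken from the course database.
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?, ?, ?)",
    [
        (1, "Dog", "Labrador", "Yellow"),
        (2, "Cat", "Siamese", "Black/White"),
        (3, "Bird", "Canary", "Pale Yellow/Green"),
    ],
)

# % matches any number of characters before and after the word.
yellow = conn.execute(
    "SELECT AnimalID, Category, Breed, Color FROM Animal "
    "WHERE Color LIKE '%Yellow%' ORDER BY AnimalID"
).fetchall()
print(yellow)
```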
➢ Basic SQL SELECT
The most commonly used command in SQL is the SELECT statement, which
is used to retrieve data from tables. It contains four basic parts, plus an
optional ORDER BY clause:
SELECT    columns             What do you want to see?
FROM      tables              What tables are involved?
JOIN      conditions          How are the tables joined?
WHERE     criteria            What are the constraints?
ORDER BY  columns (ASC DESC)
The SQL keywords can also be typed in lowercase.
SELECT Name, Category, Breed
FROM Animal
ORDER BY Category, Breed;
QBE grid:
Field     Name    Category   Breed
Table     Animal  Animal     Animal
Sort              Ascending  Ascending
Criteria
Or
Sample output (first rows; the Name column includes Debbie, Terry, Cathy,
Charles, Curtis, Ruby, Sandy, and Hoyt):
Category  Breed
Bird      African Grey
Bird      Canary
Bird      Cockatiel
Bird      Cockatiel
Bird      Lovebird
Bird      Other
Bird      Parakeet
Bird      Parakeet
Bird      Parakeet
Bird      Parrot
Bird      Parrot
Bird      Parrot
➢ ORDER BY (SORTING THE OUTPUT)
(The ORDER BY clause sorts the output rows. The default is to sort in
ascending order; adding the keyword DESC after a column name results in a
descending sort. When a column like Category contains duplicate data, add a
second sort column, such as Breed, to order the rows within each category.)
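The effect of a second sort column and of DESC can be checked with a short sketch. It assumes Python's built-in sqlite3 module; the four rows pair names and breeds arbitrarily and are not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animal (Name TEXT, Category TEXT, Breed TEXT)")
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?, ?)",
    [
        ("Sandy", "Dog", "Beagle"),
        ("Cathy", "Bird", "Parrot"),
        ("Ruby", "Bird", "Canary"),
        ("Hoyt", "Cat", "Persian"),
    ],
)

# Ascending is the default: Category first, then Breed within each category.
ascending = conn.execute(
    "SELECT Name, Category, Breed FROM Animal ORDER BY Category, Breed"
).fetchall()

# DESC applies only to the column it follows; Breed stays ascending.
descending = conn.execute(
    "SELECT Name, Category, Breed FROM Animal ORDER BY Category DESC, Breed"
).fetchall()

print([row[0] for row in ascending])
print([row[0] for row in descending])
```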
➢ DISTINCT
SELECT Category
FROM Animal;
Category: Fish, Dog, Fish, Cat, Cat, Dog, Fish, Dog, Dog, Dog, Fish, Cat, Dog, ...
SELECT DISTINCT Category
FROM Animal;
Category: Bird, Cat, Dog, Fish, Mammal, Reptile, Spider
o The DISTINCT keyword tells the DBMS to display only rows that
are unique.
o To prevent the duplicates from being displayed, use the
SELECT DISTINCT phrase.
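The difference between the two queries can be seen in a small sketch. It assumes Python's built-in sqlite3 module and six invented rows; ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animal (AnimalID INTEGER, Category TEXT)")
# Hypothetical rows with repeated categories.
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?)",
    [(1, "Fish"), (2, "Dog"), (3, "Fish"), (4, "Cat"), (5, "Cat"), (6, "Dog")],
)

# Without DISTINCT, every row is returned, duplicates included.
all_rows = conn.execute("SELECT Category FROM Animal").fetchall()

# With DISTINCT, each category appears once.
unique_rows = conn.execute(
    "SELECT DISTINCT Category FROM Animal ORDER BY Category"
).fetchall()

print(len(all_rows), [row[0] for row in unique_rows])
```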
Criteria Variant
The primary concept of constraints is based on Boolean algebra:
o Various conditions are connected with AND and OR clauses, and
sometimes a NOT statement is used, for example NOT (Category =
'Dog').
Constraints: And
List all dogs with yellow in their color born after 6/1/04.
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT AnimalID, Category, DateBorn
FROM Animal
WHERE ((Category='Dog')
AND (Color Like '%Yellow%')
AND (DateBorn>'01-Jun-2004'));
QBE grid:
Field     AnimalID  Category  DateBorn        Color
Table     Animal    Animal    Animal          Animal
Sort
Criteria            'Dog'     >'01-Jun-2004'  Like '%Yellow%'
Or
(An example of three conditions connected by AND. Microsoft Access uses the
convention of surrounding a date with # signs, for example #01-Jun-2004#, to
help it recognize a date. The # signs are useful if you want to enter a text
date such as June 1, 2004.)
Boolean Algebra
And: Both must be true.
Or:  Either one is true.
Not: Reverse the value.
a  b  a AND b  a OR b
T  T  T        T
T  F  F        T
F  T  F        T
F  F  F        F
Example with a = 3, b = -1, c = 2:
(a > 4) And (b < 0):  F And T = F
(a > 4) Or (b < 0):   F Or T = T
NOT (b < 0):          Not T = F
A condition consisting of two clauses connected by AND can be true only if
both of the clauses (a AND b) are true.
Boolean Algebra
The result is affected by the order of the operations.
Parentheses indicate that an operation should be performed first.
With no parentheses, standard SQL evaluates NOT first, then AND, then OR;
operators of equal precedence are evaluated left to right.
Always use parentheses, so other people can read and understand your query.
Example with a = 3, b = -1, c = 2:
( (a > 4) AND (b < 0) ) OR (c > 1):  (F AND T) OR T = F OR T = T
(a > 4) AND ( (b < 0) OR (c > 1) ):  F AND (T OR T) = F AND T = F
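The two parenthesized conditions can be evaluated directly. This is a small sketch, assuming Python's built-in sqlite3 module, which reports true as 1 and false as 0; the helper name evaluate is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
values = {"a": 3, "b": -1, "c": 2}  # the values used in the example


def evaluate(condition):
    """Return the 0/1 result SQLite gives for a Boolean expression."""
    return conn.execute("SELECT " + condition, values).fetchone()[0]


# Same three clauses, different grouping, different answers.
first = evaluate("((:a > 4) AND (:b < 0)) OR (:c > 1)")
second = evaluate("(:a > 4) AND ((:b < 0) OR (:c > 1))")
print(first, second)
```

The only change between the two expressions is where the parentheses sit, which is why writing them out explicitly is worthwhile.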
DeMorgan's Law Example
Customer: "I want to look at a cat, but I don't want any cats that are
registered or that have red in their color."
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT AnimalID, Category, Registered, Color
FROM Animal
WHERE (Category='Cat') AND
NOT ((Registered is NOT NULL)
OR (Color LIKE '%Red%'));
QBE grid:
Field     AnimalID  Category  Registered  Color
Table     Animal    Animal    Animal      Animal
Sort
Criteria            'Cat'     Is Null     Not Like '%Red%'
Or
➢ DeMorgan's Law
o DeMorgan's law explains how to negate conditions when two clauses
are connected with an AND or an OR. It states that to negate a condition
with an AND or an OR connector, you negate each of the two clauses
and switch the connector. An AND becomes an OR, and vice versa.
o DeMorgan's law is useful for simplifying complex statements.
DeMorgan's Law
➢ Negation of clauses
➢ Not (A And B) becomes Not A Or Not B
➢ Not (A Or B) becomes Not A And Not B
Example with Registered = ASCF and Color = Black:
NOT ((Registered is NOT NULL) OR (Color LIKE '%Red%')):  NOT (T OR F) = NOT T = F
(Registered is NULL) AND NOT (Color LIKE '%Red%'):       F AND (NOT F) = F AND T = F
(Compound statements are negated by reversing each item and swapping the
connector (AND for OR). Use truth tables to evaluate the examples.)
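Both forms of the cat condition should select the same rows. The sketch below checks that on invented data, assuming Python's built-in sqlite3 module; every Color value is filled in so that NULL handling does not complicate the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Animal (AnimalID INTEGER, Category TEXT, Registered TEXT, Color TEXT)"
)
# Hypothetical rows; None is stored as NULL (not registered).
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?, ?, ?)",
    [
        (1, "Cat", None, "Black"),
        (2, "Cat", "ASCF", "Black"),
        (3, "Cat", None, "Red/White"),
        (4, "Cat", "ASCF", "Red"),
        (5, "Dog", None, "Black"),
    ],
)

# Original form: NOT applied to the whole OR condition.
negated_or = conn.execute(
    "SELECT AnimalID FROM Animal WHERE (Category = 'Cat') AND "
    "NOT ((Registered IS NOT NULL) OR (Color LIKE '%Red%'))"
).fetchall()

# DeMorgan form: each clause negated, OR switched to AND.
and_of_negations = conn.execute(
    "SELECT AnimalID FROM Animal WHERE (Category = 'Cat') AND "
    "(Registered IS NULL) AND NOT (Color LIKE '%Red%')"
).fetchall()

print(negated_or, and_of_negations)
```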
Conditions: AND, OR
List all dogs who are male and registered or who were born before
6/1/2004 and have white in their color.
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT AnimalID, Category, Gender, Registered, DateBorn, Color
FROM Animal
WHERE (( Category='Dog') AND
( ( (Gender='Male') AND (Registered Is Not Null) ) OR
( (DateBorn<'01-Jun-2004') AND (Color Like '%White%') ) ) );
QBE grid:
Field     AnimalID  Category  Gender  Registered   DateBorn         Color
Table     Animal    Animal    Animal  Animal       Animal           Animal
Sort
Criteria            'Dog'     'Male'  Is Not Null
Or                  'Dog'                          < '01-Jun-2004'  Like '%White%'
Useful Where Conditions
Comparisons              Examples
Operators                <, =, >, <>, BETWEEN, LIKE, IN
Numbers                  AccountBalance > 200
Text: Simple             Name > 'Jones'
Text: Pattern match one  License LIKE 'A__82_'
Text: Pattern match any  Name LIKE 'J%'
Dates                    SaleDate BETWEEN '15-Aug-2004' AND '31-Aug-2004'
Missing Data             City IS NULL
Negation                 Name IS NOT NULL
Sets                     Category IN ('Cat', 'Dog', 'Hamster')
o Text comparisons are usually made with the LIKE operator for pattern
matching.
o Use the percent sign (%) to match any number of characters.
o Use the underscore (_) to match exactly one character. Access uses an
asterisk (*) and a question mark (?) instead. Example: ProductID LIKE
'___dog____'.
o The BETWEEN clause is not required, but it saves some typing and makes
some conditions a little clearer. Example: the clause (SaleDate BETWEEN
'15-Aug-2004' AND '31-Aug-2004') is equivalent to (SaleDate >= '15-Aug-2004'
AND SaleDate <= '31-Aug-2004').
o Testing for missing data uses the NULL comparison. Two common forms are
IS NULL and IS NOT NULL.
▪ Be careful: the statement (City = NULL) will not work with most systems,
because NULL is not really a value. You must use (City IS NULL) instead.
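The NULL warning is easy to demonstrate. This sketch assumes Python's built-in sqlite3 module and a made-up two-row Customer table; in SQLite, as in most systems, City = NULL never matches a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Name TEXT, City TEXT)")
# Hypothetical rows; None is stored as NULL (missing data).
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("Jones", "Boston"), ("Smith", None)],
)

# Comparing with = NULL matches nothing, even for the row whose City is missing.
equals_null = conn.execute("SELECT Name FROM Customer WHERE City = NULL").fetchall()

# IS NULL and IS NOT NULL are the forms that work.
is_null = conn.execute("SELECT Name FROM Customer WHERE City IS NULL").fetchall()
is_not_null = conn.execute(
    "SELECT Name FROM Customer WHERE City IS NOT NULL"
).fetchall()

print(equals_null, is_null, is_not_null)
```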
➢ Computations
o Queries can be used for two types of computations: aggregations
and simple arithmetic.
o Aggregation → computation of totals and subtotals.
▪ The most commonly used functions are Sum and Avg.
o Difference between Sum and Count:
▪ Sum totals the values in a numeric column. Count simply
counts the number of rows.
▪ Sum can be used only on a column of numeric data (e.g.,
Quantity).
Example 1: how many employees does Sally have?
SELECT Count(*) FROM Employee;
Example 2: how many units of Item 9764 have been sold?
SELECT Sum(Quantity) FROM OrderItem WHERE (ItemID = 9764);
Simple Computations
SELECT OrderID, ItemID, SalePrice, Quantity,
SalePrice*Quantity AS Extended
FROM SaleItem;
OrderID  ItemID  Price  Quantity  Extended
151      9764    19.50  2         39.00
151      7653    8.35   3         25.05
151      8673    6.89   2         13.78
Basic computations (+ - * /) can be performed on numeric data.
The new display column should be given a meaningful name.
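The row-by-row computation and the Sum/Count distinction can be reproduced with the three SaleItem rows shown above. This is a sketch, assuming Python's built-in sqlite3 module; ROUND is added only to hide floating-point noise and is not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SaleItem (OrderID INTEGER, ItemID INTEGER, SalePrice REAL, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO SaleItem VALUES (?, ?, ?, ?)",
    [(151, 9764, 19.50, 2), (151, 7653, 8.35, 3), (151, 8673, 6.89, 2)],
)

# Arithmetic works on one row at a time; AS names the new column.
extended = conn.execute(
    "SELECT ItemID, ROUND(SalePrice * Quantity, 2) AS Extended "
    "FROM SaleItem ORDER BY ItemID DESC"
).fetchall()

# Sum totals a numeric column; Count counts rows.
total_quantity = conn.execute("SELECT Sum(Quantity) FROM SaleItem").fetchone()[0]
row_count = conn.execute("SELECT Count(*) FROM SaleItem").fetchone()[0]

print(extended, total_quantity, row_count)
```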
Computations: Aggregation (Avg)
➢ What is the average sale price of all animals?
SELECT Avg(SalePrice) AS AvgOfSalePrice
FROM SaleAnimal;
SaleAnimal table columns: SaleID, AnimalID, SalePrice
Aggregate functions: Sum, Avg, Min, Max, Count, StDev or StdDev, Var
QBE grid:
Field     SalePrice
Table     SaleAnimal
Total     Avg
Sort
Criteria
Or
Computations (Math Operators)
➢ What is the total value of the order for PONumber 22?
SELECT Sum(Quantity*Cost) AS OrderTotal
FROM OrderItem
WHERE (PONumber=22);
OrderItem table columns: PONumber, ItemID, Quantity, Cost
QBE grid:
Field     PONumber   OrderTotal: Quantity*Cost
Table     OrderItem  OrderItem
Total                Sum
Sort
Criteria  =22
Or
Result: OrderTotal = 1798.28
➢ Use any common math operators on numeric data.
➢ Operate on data in one row at a time.
➢ SELECT COUNT(DISTINCT Category) FROM Animal;
o This statement will count the number of different categories and
ignore duplicates (by using the DISTINCT clause).
➢ Subtotals (WHERE) and GROUP BY
o To look at totals for only a few categories, you can use an aggregate
function such as Sum or Count with a WHERE clause.
o For example, you might ask how many cats are in the animal list. The
query is straightforward:
Subtotals (Where)
How many cats are in the Animal list?
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT Count(AnimalID) AS CountOfAnimalID
FROM Animal
WHERE (Category = 'Cat');
QBE grid:
Field     AnimalID  Category
Table     Animal    Animal
Total     Count     Where
Sort
Criteria            'Cat'
Or
o The GROUP BY statement can be used only with one of the
aggregate functions (Sum, Avg, Count, and so on).
o With the GROUP BY statement, the DBMS looks at all the data, finds
the unique items in the group, and then performs the aggregate
function for each item in the group.
Groups and Subtotals
➢ Count the number of animals in each category.
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT Category, Count(AnimalID) AS CountOfAnimalID
FROM Animal
GROUP BY Category
ORDER BY Count(AnimalID) DESC;
QBE grid:
Field     Category  AnimalID
Table     Animal    Animal
Total     Group By  Count
Sort                Descending
Criteria
Or
Category  CountOfAnimalID
Dog       100
Cat       47
Bird      15
Fish      14
Reptile   6
Mammal    6
Spider    3
➢ You could type in each WHERE clause, but that is slow.
➢ And you would have to know all of the Category values.
➢ Highest count listed first.
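The grouping query can be run on a handful of rows. This is a minimal sketch, assuming Python's built-in sqlite3 module; the six animals are invented, so the counts are much smaller than in the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animal (AnimalID INTEGER, Category TEXT)")
# Hypothetical rows: three dogs, two cats, one fish.
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?)",
    [(1, "Dog"), (2, "Cat"), (3, "Dog"), (4, "Fish"), (5, "Cat"), (6, "Dog")],
)

# One output row per category, highest count first.
counts = conn.execute(
    "SELECT Category, Count(AnimalID) AS CountOfAnimalID "
    "FROM Animal GROUP BY Category ORDER BY Count(AnimalID) DESC"
).fetchall()
print(counts)
```

No category value has to be typed into the query; the DBMS discovers them from the data.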
➢ You can easily limit the output displayed by including the TOP statement;
for example:
o SELECT TOP 10 SalesPerson, SUM(Sales) FROM sales GROUP
BY SalesPerson ORDER BY SUM(Sales) DESC;
o This query will compute total sales for each salesperson and display
a list sorted in descending order. However, only the first 10 rows of
the output will be displayed. You can also use TOP 5 PERCENT, which
will cut the list off after 5 percent of the rows have been displayed.
(Oracle does not support the TOP condition.)
Conditions on Totals (Having)
Count number of Animals in each Category, but only list them if
more than 10.
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT Category, Count(AnimalID) AS CountOfAnimalID
FROM Animal
GROUP BY Category
HAVING Count(AnimalID) > 10
ORDER BY Count(AnimalID) DESC;
QBE grid:
Field     Category  AnimalID
Table     Animal    Animal
Total     Group By  Count
Sort                Descending
Criteria            >10
Or
Category  CountOfAnimalID
Dog       100
Cat       47
Bird      15
Fish      14
➢ Conditions on Totals (HAVING)
o One way to reduce the amount of data displayed is to add the HAVING
clause. The HAVING clause is a condition that applies to the GROUP BY
output. In the example presented, only categories with more than 10
animals are listed.
o The HAVING clause is a possible substitute in Oracle, which lacks the TOP
statement.
Where (Detail) versus Having (Group)
Count Animals born after 6/1/2004 in each Category, but only list Category
if more than 10.
Animal table columns: AnimalID, Name, Category, Breed, DateBorn, Gender
SELECT Category, Count(AnimalID) AS CountOfAnimalID
FROM Animal
WHERE DateBorn > '01-Jun-2004'
GROUP BY Category
HAVING Count(AnimalID) > 10
ORDER BY Count(AnimalID) DESC;
QBE grid:
Field     Category  AnimalID    DateBorn
Table     Animal    Animal      Animal
Total     Group By  Count       Where
Sort                Descending
Criteria            >10         >'01-Jun-2004'
Or
Category  CountOfAnimalID
Dog       30
Cat       18
o The key is that the WHERE statement applies to every single row in the
original table. The HAVING statement applies only to the subtotal
output from a GROUP BY query.
(In the example above, the WHERE clause first determines whether each row will
be used in the computation. The GROUP BY clause produces the total count
for each category. The HAVING clause restricts the output to only those
categories with more than 10 animals.)
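The order in which WHERE, GROUP BY, and HAVING act can be seen on a few rows. This sketch assumes Python's built-in sqlite3 module. Two things differ from the slide and are assumptions of the example: dates are stored as ISO text (YYYY-MM-DD) because SQLite has no date type, and the HAVING threshold is 1 instead of 10 to suit the tiny invented table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animal (AnimalID INTEGER, Category TEXT, DateBorn TEXT)")
# Hypothetical rows; ISO dates compare correctly as text.
conn.executemany(
    "INSERT INTO Animal VALUES (?, ?, ?)",
    [
        (1, "Dog", "2004-07-01"),
        (2, "Dog", "2004-08-15"),
        (3, "Dog", "2004-01-10"),
        (4, "Cat", "2004-09-01"),
        (5, "Cat", "2003-05-05"),
        (6, "Fish", "2004-10-10"),
    ],
)

# WHERE filters individual rows first; HAVING then filters the group totals.
with_where = conn.execute(
    "SELECT Category, Count(AnimalID) AS CountOfAnimalID FROM Animal "
    "WHERE DateBorn > '2004-06-01' "
    "GROUP BY Category HAVING Count(AnimalID) > 1 "
    "ORDER BY Count(AnimalID) DESC"
).fetchall()

# Without the WHERE clause, more rows reach each group.
without_where = conn.execute(
    "SELECT Category, Count(AnimalID) AS CountOfAnimalID FROM Animal "
    "GROUP BY Category HAVING Count(AnimalID) > 1 "
    "ORDER BY Count(AnimalID) DESC"
).fetchall()

print(with_where, without_where)
```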
o QBE: Query by example
The Best and the Worst
Which product is the best seller?
An SQL statement that answers this question:
SELECT ItemID, Sum(Quantity) FROM SaleItem GROUP BY ItemID
ORDER BY Sum(Quantity) DESC;
o This query will compute the total quantities purchased for each item and
display the result in descending order, so the best sellers will be at the
top of the list.
o Other solutions and notes:
a. SELECT Max(Quantity) FROM SaleItem;
i. This query will run. It will return the highest quantity in any
individual sale, but it will not sum the quantities.
b. SELECT ItemID, Max(Sum(Quantity)) FROM SaleItem
GROUP BY ItemID;
ii. This query will not run because the database cannot
compute the maximum until after it has computed the sum.
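The three attempts can be compared side by side. This is a sketch, assuming Python's built-in sqlite3 module and four invented sales; other systems may report the nested aggregate with a different error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SaleItem (SaleID INTEGER, ItemID INTEGER, Quantity INTEGER)")
# Hypothetical sales: item 9764 sells most in total, item 7653 has the largest single sale.
conn.executemany(
    "INSERT INTO SaleItem VALUES (?, ?, ?)",
    [(1, 9764, 2), (2, 9764, 3), (3, 7653, 4), (4, 8673, 1)],
)

# Totals per item, best seller first.
totals = conn.execute(
    "SELECT ItemID, Sum(Quantity) FROM SaleItem "
    "GROUP BY ItemID ORDER BY Sum(Quantity) DESC"
).fetchall()

# Max alone finds the largest single sale; it does not sum anything.
largest_single_sale = conn.execute("SELECT Max(Quantity) FROM SaleItem").fetchone()[0]

# Nesting one aggregate inside another is rejected.
try:
    conn.execute("SELECT ItemID, Max(Sum(Quantity)) FROM SaleItem GROUP BY ItemID")
    nested_rejected = False
except sqlite3.OperationalError:
    nested_rejected = True

print(totals, largest_single_sale, nested_rejected)
```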
Multiple Tables (Intro & Distinct)
➢ List the CustomerID of everyone who bought something between
01-Apr-2004 and 31-May-2004.
Sale table columns: SaleID, SaleDate, EmployeeID, CustomerID, SalesTax
SELECT DISTINCT CustomerID
FROM Sale
WHERE (SaleDate Between '01-Apr-2004' And '31-May-2004')
ORDER BY CustomerID;
QBE grid:
Field     CustomerID  SaleDate
Table     Sale        Sale
Sort      Ascending
Criteria              Between '01-Apr-2004' And '31-May-2004'
Or
CustomerID: 6, 8, 14, 19, 22, 24, 28, 36, 37, 38, 39, 42, 50, 57, 58, 63, 74, 80, 90
Joining Tables
➢ List LastNames of Customers who bought between 4/1/2004 and
5/31/2004.
Sale table columns: SaleID, SaleDate, EmployeeID, CustomerID
Customer table columns: CustomerID, Phone, FirstName, LastName
SELECT DISTINCT Sale.CustomerID, Customer.LastName
FROM Customer
INNER JOIN Sale ON Customer.CustomerID = Sale.CustomerID
WHERE (SaleDate Between '01-Apr-2004' And '31-May-2004')
ORDER BY Customer.LastName;
QBE grid:
Field     CustomerID  LastName   SaleDate
Table     Sale        Customer   Sale
Sort                  Ascending
Criteria                         Between '01-Apr-2004' And '31-May-2004'
Or
CustomerID  LastName
22          Adkins
57          Carter
38          Franklin
42          Froedge
63          Grimes
74          Hinton
36          Holland
6           Hopkins
50          Lee
58          McCain
…
Multiple Tables
SQL JOIN
SQL 92 syntax (Access and SQL Server):
FROM table1
INNER JOIN table2
ON table1.column = table2.column
SQL 89 syntax (Oracle):
FROM table1, table2
WHERE table1.column = table2.column
Informal syntax:
FROM table1, table2
JOIN table1.column = table2.column
o Joining tables causes the rows to be matched based on the columns in the
JOIN statement. You can then use data from either table.
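The SQL 92 and SQL 89 forms should return the same rows. The sketch below compares them, assuming Python's built-in sqlite3 module; the rows are invented and the dates are stored as ISO text because SQLite has no date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER, LastName TEXT)")
conn.execute("CREATE TABLE Sale (SaleID INTEGER, CustomerID INTEGER, SaleDate TEXT)")
# Hypothetical rows for illustration.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [(22, "Adkins"), (57, "Carter"), (38, "Franklin")],
)
conn.executemany(
    "INSERT INTO Sale VALUES (?, ?, ?)",
    [
        (1, 22, "2004-04-15"),
        (2, 57, "2004-05-02"),
        (3, 22, "2004-05-20"),
        (4, 38, "2004-07-04"),
    ],
)

# SQL 92 style: the join condition goes in INNER JOIN ... ON.
sql92 = conn.execute(
    "SELECT DISTINCT Sale.CustomerID, Customer.LastName "
    "FROM Customer INNER JOIN Sale ON Customer.CustomerID = Sale.CustomerID "
    "WHERE SaleDate BETWEEN '2004-04-01' AND '2004-05-31' "
    "ORDER BY Customer.LastName"
).fetchall()

# SQL 89 style: the join condition goes in the WHERE clause.
sql89 = conn.execute(
    "SELECT DISTINCT Sale.CustomerID, Customer.LastName "
    "FROM Customer, Sale "
    "WHERE Customer.CustomerID = Sale.CustomerID "
    "AND SaleDate BETWEEN '2004-04-01' AND '2004-05-31' "
    "ORDER BY Customer.LastName"
).fetchall()

print(sql92, sql89)
```

DISTINCT matters here: customer 22 bought twice in the period but is listed once.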
➢ Joining Many Tables
Multiple Tables (Many)
➢ List the Last Name and Phone of anyone who bought a registered
white cat between 6/1/2004 and 12/31/2004.
SELECT DISTINCTROW Customer.LastName, Customer.Phone
FROM Customer INNER JOIN (Sale INNER JOIN (Animal INNER JOIN SaleAnimal
ON Animal.AnimalID = SaleAnimal.AnimalID) ON Sale.SaleID = SaleAnimal.SaleID)
ON Customer.CustomerID = Sale.CustomerID
WHERE ((Animal.Category='Cat') AND (Animal.Registered Is Not Null)
AND (Color Like '%White%') AND (SaleDate Between '01-Jun-2004' And '31-Dec-2004'));
Tables:
Animal: AnimalID, Name, Category, Breed
SaleAnimal: SaleID, AnimalID, SalePrice
Sale: SaleID, SaleDate, EmployeeID, CustomerID
Customer: CustomerID, Phone, FirstName, LastName
QBE grid:
Field     LastName   Phone     Category  Registered   Color           SaleDate
Table     Customer   Customer  Animal    Animal       Animal          Sale
Sort      Ascending
Criteria                       'Cat'     Is Not Null  Like '%White%'  Between '01-Jun-2004' And '31-Dec-2004'
Or
A query can use data from several different tables. The process is similar
regardless of the number of tables. Each table you want to add must be
joined to one other table through a data column.
Syntax for Three Tables
SQL '92 syntax to join three tables:
FROM Table1
INNER JOIN (Table2 INNER JOIN Table3
ON Table2.ColA = Table3.ColA)
ON Table1.ColB = Table2.ColB
Easier notation, but not correct syntax:
FROM Table1, Table2, Table3
JOIN Table1.ColB = Table2.ColB
     Table2.ColA = Table3.ColA
Joining Tables (Hints)
➢ Build Relationships First
▪ Drag and drop
▪ From one side to many side
➢ Avoid multiple ties between tables
➢ SQL
▪ FROM Table1
▪ INNER JOIN Table2
▪ ON Table1.ColA = Table2.ColB
➢ Join columns are often keys, but they can be any columns, as
long as the domains (types of data) match.
➢ Multiple Tables
▪ FROM (Table1
▪ INNER JOIN Table2
▪ ON Table1.ColA = Table2.ColB)
▪ INNER JOIN Table3
▪ ON Table1.ColC = Table3.ColD
➢ Shorter Notation
▪ FROM T1, T2, T3
▪ JOIN T1.ColA = T2.ColB
▪      T1.ColC = T3.ColD
➢ Shorter notation is not correct syntax, but it is easier to write.
Hints on Joining Tables
Cross join: every row in one table is paired with every row in the
other table. In algebra, a cross join is known as the Cartesian product of two
sets.
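A cross join of a three-row table with a two-row table should give 3 x 2 = 6 rows. This sketch assumes Python's built-in sqlite3 module; the Category and Gender lookup tables are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Category (Name TEXT)")
conn.execute("CREATE TABLE Gender (Name TEXT)")
# Hypothetical lookup tables.
conn.executemany("INSERT INTO Category VALUES (?)", [("Cat",), ("Dog",), ("Fish",)])
conn.executemany("INSERT INTO Gender VALUES (?)", [("Female",), ("Male",)])

# No join condition: every Category row is paired with every Gender row.
pairs = conn.execute(
    "SELECT Category.Name, Gender.Name FROM Category CROSS JOIN Gender "
    "ORDER BY Category.Name, Gender.Name"
).fetchall()
print(len(pairs), pairs[0], pairs[-1])
```

Forgetting a join condition between two large tables produces the same effect by accident, which is why the result can be unexpectedly huge.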