CS 480: Database Systems
Lecture 12
February 11, 2013

SQL Basic Query Format

    SELECT A1, A2, …, An
    FROM   r1, r2, …, rm
    WHERE  P

• Suppose the ri’s have scheme ri(Ri), where Ri is a set of attributes.
• Then the Ai’s are attributes in R1 ∪ … ∪ Rm.
• P is a boolean predicate in which an atom is a selection atom on r1 × r2 × … × rm or one of the other types of SQL boolean predicates:
    string predicates (LIKE, CONTAINS)
    t IN ri
    t θ ALL(ri), t θ SOME(ri)
    others

SQL Basic Query Format
• Queries are written in SELECT, FROM, WHERE order.
• It’s important to understand the operational order:
    1. FROM: Cartesian product of the given relations
    2. WHERE: Selection based on the given predicate
    3. SELECT: Projection of the given attributes

SQL Basic Query Format
• In relational algebra, the basic query corresponds to:
    Π_{A1,A2,…,An}(σ_P(r1 × r2 × … × rm))

Completeness of SQL
• Projection (Π): Π_{A1,A2,…,An}(r)

    SELECT A1, A2, …, An
    FROM   r

• Selection (σ): σ_P(r)

    SELECT *
    FROM   r
    WHERE  P

  * denotes “all attributes”.

• Union (∪): r ∪ s

    (SELECT * FROM r)
    UNION
    (SELECT * FROM s)

• Difference (–): r – s

    (SELECT * FROM r)
    MINUS
    (SELECT * FROM s)

  or, with the keyword EXCEPT:

    (SELECT * FROM r)
    EXCEPT
    (SELECT * FROM s)

• Cartesian Product (×): r × s

    SELECT *
    FROM   r, s

  To keep only the attributes of r, Π_R(r × s):

    SELECT r.*
    FROM   r, s

• Rename (ρ): ρ_d(r)

    SELECT *
    FROM   r AS d

Set Membership (∈)
• Retrieve the student_id’s of students that took both CS480 and CS580 and got an A in both.
• ENROL(student_id,course,grade)

    SELECT student_id
    FROM   enrol
    WHERE  course='CS480' AND grade='A'
      AND  student_id IN (SELECT student_id
                          FROM   enrol
                          WHERE  course='CS580' AND grade='A')

Set Membership (∈)
• Retrieve the student_id’s of students that took CS480 but have not taken CS580.
• ENROL(student_id,course,grade)

    SELECT student_id
    FROM   enrol
    WHERE  course='CS480'
      AND  student_id NOT IN (SELECT student_id
                              FROM   enrol
                              WHERE  course='CS580')

Set Comparison
• Retrieve the names of instructors that have a salary higher than at least one instructor in the Biology department.
• INSTRUCTOR(name,dept,salary)

    SELECT name
    FROM   instructor AS i
    WHERE  i.salary > SOME (SELECT salary
                            FROM   instructor AS j
                            WHERE  j.dept='Biology')
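The two set-membership queries (IN and NOT IN) can be tried on a toy table. The following is a minimal sketch in Python using the standard-library sqlite3 module. The choice of SQLite and the sample rows are assumptions made for illustration, not part of the lecture.

```python
import sqlite3

# Hypothetical sample data for ENROL(student_id, course, grade).
rows = [
    (1, "CS480", "A"), (1, "CS580", "A"),   # A in both courses
    (2, "CS480", "A"), (2, "CS580", "B"),   # A in CS480 only
    (3, "CS480", "B"),                      # never took CS580
    (4, "CS580", "A"),                      # never took CS480
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrol (student_id INTEGER, course TEXT, grade TEXT)")
conn.executemany("INSERT INTO enrol VALUES (?, ?, ?)", rows)

# Took both CS480 and CS580 and got an A in both (IN).
both_a = [r[0] for r in conn.execute("""
    SELECT student_id FROM enrol
    WHERE course = 'CS480' AND grade = 'A'
      AND student_id IN (SELECT student_id FROM enrol
                         WHERE course = 'CS580' AND grade = 'A')
""")]

# Took CS480 but never took CS580 (NOT IN).
only_480 = [r[0] for r in conn.execute("""
    SELECT student_id FROM enrol
    WHERE course = 'CS480'
      AND student_id NOT IN (SELECT student_id FROM enrol
                             WHERE course = 'CS580')
""")]

print(sorted(both_a))
print(sorted(only_480))
```

With this data only student 1 satisfies the first query and only student 3 satisfies the second.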
Set Containment
• The CONTAINS clause is a mechanism by which SQL implements the division operator.
• ENROL(id,course,grade) STUDENT(id,name,major)
• Names and majors of students that took all the courses that John Doe took.

    SELECT name, major
    FROM   student AS d
    WHERE  (SELECT course
            FROM   enrol
            WHERE  enrol.id = d.id)
           CONTAINS
           (SELECT course
            FROM   enrol AS e, student AS s
            WHERE  e.id=s.id AND s.name='John Doe')

• In relational algebra, the same query is a division:
    Π_{name,major}(student ⋈ (Π_{id,course}(enrol) ÷ Π_{course}(σ_{name='John Doe'}(student ⋈ enrol))))

Set Containment
• Names and majors of students that took all the courses that John Doe took, and only those courses.
    SELECT name, major
    FROM   student AS d
    WHERE  (SELECT course
            FROM   enrol
            WHERE  enrol.id = d.id)
           CONTAINS
           (SELECT course
            FROM   enrol AS e, student AS s
            WHERE  e.id=s.id AND s.name='John Doe')
      AND  (SELECT course
            FROM   enrol AS e, student AS s
            WHERE  e.id=s.id AND s.name='John Doe')
           CONTAINS
           (SELECT course
            FROM   enrol
            WHERE  enrol.id = d.id)

Set Containment
• CONTAINS is no longer part of the current SQL standards.
• Now, “a CONTAINS b” can be restated as “NOT EXISTS (b EXCEPT a)”.

Set Cardinality
• The query for students that took all the courses that John Doe took becomes:

    SELECT name, major
    FROM   student AS d
    WHERE  NOT EXISTS ((SELECT course
                        FROM   enrol AS e, student AS s
                        WHERE  e.id=s.id AND s.name='John Doe')
                       EXCEPT
                       (SELECT course
                        FROM   enrol
                        WHERE  enrol.id = d.id))

• NOT EXISTS – Tests if a relation is empty
• EXISTS – Tests if a relation is nonempty
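The restatement of “a CONTAINS b” as NOT EXISTS (b EXCEPT a) can be checked on a small database. This is a sketch only, assuming SQLite (through Python's sqlite3 module) as the engine and hypothetical sample rows. SQLite does not accept parentheses around the operands of EXCEPT, so the two SELECTs are written back to back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT, major TEXT);
    CREATE TABLE enrol (id INTEGER, course TEXT, grade TEXT);
    -- Hypothetical sample data.
    INSERT INTO student VALUES (1, 'John Doe', 'CS'), (2, 'Ann', 'Math'),
                               (3, 'Bob', 'CS');
    INSERT INTO enrol VALUES (1, 'CS480', 'A'), (1, 'CS580', 'B'),
                             (2, 'CS480', 'B'), (2, 'CS580', 'A'),
                             (2, 'MA101', 'A'),
                             (3, 'CS480', 'C');
""")

# b EXCEPT a is empty exactly when every course of John Doe (b)
# also appears among the courses of student d (a).
took_all = conn.execute("""
    SELECT name, major
    FROM student AS d
    WHERE NOT EXISTS (
        SELECT course FROM enrol AS e, student AS s
        WHERE e.id = s.id AND s.name = 'John Doe'
        EXCEPT
        SELECT course FROM enrol WHERE enrol.id = d.id
    )
""").fetchall()

print(sorted(took_all))
```

Ann took both of John Doe's courses plus one more, so she qualifies. Bob took only one of them and is filtered out. John Doe trivially qualifies as well.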
Set Cardinality
• ENROL(id,course,grade) STUDENT(id,name,major)
• Retrieve the id’s of students that didn’t take any course that John Doe took.

    SELECT d.id
    FROM   student AS d
    WHERE  NOT EXISTS ((SELECT course
                        FROM   enrol
                        WHERE  enrol.id = d.id)
                       INTERSECT
                       (SELECT course
                        FROM   enrol AS e, student AS s
                        WHERE  e.id=s.id AND s.name='John Doe'))

• Subqueries can be nested to an arbitrary level.

Aggregate Operations
• Operations that take a collection (set or multiset) as input and return a single value.
• Average: AVG
• Minimum: MIN
• Maximum: MAX
• Total: SUM
• Count: COUNT

Aggregate Operations
• INSTRUCTOR(name,dept,salary) TEACHES(name,course,semester,year)
• Ex: Retrieve the average salary of professors that have taught 'CS480'.

    SELECT AVG(salary)
    FROM   instructor AS i, teaches AS t
    WHERE  i.name=t.name
      AND  i.name IN (SELECT name
                      FROM   teaches
                      WHERE  course='CS480')
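The aggregate functions can be exercised on a small instance. The sketch below assumes SQLite via Python's sqlite3 module and hypothetical rows. It also runs the average-salary query both as written above (with the join to teaches) and in a variant without the join, since a join produces one row per TEACHES tuple and an aggregate over a multiset counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (name TEXT, dept TEXT, salary INTEGER);
    CREATE TABLE teaches (name TEXT, course TEXT, semester TEXT, year INTEGER);
    -- Hypothetical sample data: Ann has three TEACHES tuples, Bob has one.
    INSERT INTO instructor VALUES ('Ann', 'CS', 100), ('Bob', 'CS', 200),
                                  ('Cy', 'Math', 900);
    INSERT INTO teaches VALUES ('Ann', 'CS480', 'Fall', 2011),
                               ('Ann', 'CS480', 'Fall', 2012),
                               ('Ann', 'CS101', 'Spring', 2012),
                               ('Bob', 'CS480', 'Spring', 2012),
                               ('Cy', 'MA101', 'Fall', 2012);
""")

# Each aggregate collapses the salary column to a single value.
summary = conn.execute("""
    SELECT AVG(salary), MIN(salary), MAX(salary), SUM(salary), COUNT(salary)
    FROM instructor
""").fetchone()

# Query with the join: Ann's salary appears once per TEACHES tuple.
avg_joined = conn.execute("""
    SELECT AVG(salary)
    FROM instructor AS i, teaches AS t
    WHERE i.name = t.name
      AND i.name IN (SELECT name FROM teaches WHERE course = 'CS480')
""").fetchone()[0]

# Variant without the join: one row per instructor.
avg_per_instructor = conn.execute("""
    SELECT AVG(salary)
    FROM instructor
    WHERE name IN (SELECT name FROM teaches WHERE course = 'CS480')
""").fetchone()[0]

print(summary, avg_joined, avg_per_instructor)
```

On this data the joined query averages four rows (Ann three times, Bob once), while the variant averages two. Which of the two is wanted depends on whether each professor should count once.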
Aggregate Operations
• INSTRUCTOR(name,dept,salary) TEACHES(name,course,semester,year)
• Ex: Retrieve the total number of professors that have taught CS480.

    SELECT COUNT(i.name)
    FROM   instructor AS i, teaches AS t
    WHERE  i.name=t.name
      AND  t.course='CS480'

  This counts an instructor that has taught CS480 multiple times more than once.

    SELECT COUNT(DISTINCT i.name)
    FROM   instructor AS i, teaches AS t
    WHERE  i.name=t.name
      AND  t.course='CS480'

  Now, any professor that has taught the course will be counted only once.

Aggregation with GROUP BY
• Sometimes we want to apply aggregate functions not only to a single set of tuples (a relation), but also to each of several groups of tuples (subsets of the relation).
• To do that we use the GROUP BY clause.
• The attributes given in the GROUP BY clause are used to form groups.
• Then the aggregation is performed for each group.

Aggregation with GROUP BY
• INSTRUCTOR(name,dept,salary)
• Retrieve the faculty budget for each department (what they pay in total to their professors).

    SELECT SUM(salary)
    FROM   instructor
    GROUP BY dept
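To see what grouping produces, the budget query can be run on a tiny INSTRUCTOR table. This is an illustrative sketch assuming SQLite through Python's sqlite3 module, with made-up rows. It runs the query once selecting only the sum and once selecting the grouping attribute as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (name TEXT, dept TEXT, salary INTEGER);
    -- Hypothetical sample data.
    INSERT INTO instructor VALUES ('Ann', 'CS', 100), ('Bob', 'CS', 200),
                                  ('Cy', 'Math', 150);
""")

# One output row per group; only the totals are selected,
# so nothing says which department each total belongs to.
totals = [r[0] for r in conn.execute(
    "SELECT SUM(salary) FROM instructor GROUP BY dept")]

# Selecting the grouping attribute as well labels each total.
by_dept = dict(conn.execute(
    "SELECT dept, SUM(salary) FROM instructor GROUP BY dept"))

print(sorted(totals))
print(by_dept)
```

The first result is just a list of sums. The second pairs each sum with its department.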
Aggregation with GROUP BY
• INSTRUCTOR(name,dept,salary)
• Retrieve the faculty budget for each department (what they pay in total to their professors).

    SELECT dept, SUM(salary)
    FROM   instructor
    GROUP BY dept

  This will return a relation with 2 columns, one for dept and the other for the sum of all the professor salaries.

    SELECT dept, SUM(salary) AS budget
    FROM   instructor
    GROUP BY dept

  This will return a relation with 2 columns, one for dept and the other for budget.

Aggregation with GROUP BY
• INSTRUCTOR(name,dept,salary)
• Retrieve the number of professors that earn more than $100,000 in each department.

    SELECT dept, COUNT(name)
    FROM   instructor
    WHERE  salary>100000
    GROUP BY dept

Aggregation with GROUP BY and HAVING
• It may be useful to state a condition that applies to the groups rather than the tuples.
• The HAVING clause applies conditions to the groups formed by the GROUP BY clause.
• Like a WHERE clause but for the groups.

Aggregation with GROUP BY and HAVING
• INSTRUCTOR(name,dept,salary)
• Query: Retrieve the faculty budget of each department that has 25 or more professors.
    SELECT dept, SUM(salary) AS budget
    FROM   instructor
    GROUP BY dept
    HAVING COUNT(name)>=25

Aggregation with GROUP BY and HAVING
• STUDENT(id,name,address,GPA) ENROL(id,course,section,semester,year)
• Query: For each course section offered in 2012, find the average GPA of all the students enrolled in the section, if the section had at least 10 students.
    SELECT course, semester, year, section, AVG(GPA)
    FROM   enrol NATURAL JOIN student
    WHERE  year=2012
    GROUP BY course, semester, year, section
    HAVING COUNT(id)>=10
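The interplay of WHERE, GROUP BY and HAVING in the section query can be traced on a toy instance. The sketch below assumes SQLite via Python's sqlite3 module and uses hypothetical rows. The threshold of 10 students is lowered to a made-up MIN_STUDENTS of 2 so that the small data set has a qualifying section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT, address TEXT, GPA REAL);
    CREATE TABLE enrol (id INTEGER, course TEXT, section INTEGER,
                        semester TEXT, year INTEGER);
    -- Hypothetical sample data.
    INSERT INTO student VALUES (1, 'Ann', 'x', 4.0), (2, 'Bob', 'y', 3.0),
                               (3, 'Cy', 'z', 2.0);
    INSERT INTO enrol VALUES (1, 'CS480', 1, 'Spring', 2012),
                             (2, 'CS480', 1, 'Spring', 2012),
                             (3, 'CS480', 2, 'Spring', 2012),
                             (1, 'CS580', 1, 'Fall', 2011),
                             (2, 'CS580', 1, 'Fall', 2011);
""")

MIN_STUDENTS = 2  # assumption: the slide uses 10; lowered for the toy data

result = conn.execute("""
    SELECT course, semester, year, section, AVG(GPA)
    FROM enrol NATURAL JOIN student
    WHERE year = 2012                           -- filters tuples
    GROUP BY course, semester, year, section    -- forms groups
    HAVING COUNT(id) >= ?                       -- filters groups
""", (MIN_STUDENTS,)).fetchall()

print(result)
```

The 2011 section is removed by WHERE before any group is formed, even though it has two students. Section 2 of CS480 forms a group but is dropped by HAVING because it has a single student.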