Lecture 4: Relational Algebra
Ch. 7.4 – 7.6
John Ortiz

Relational Query Languages

Query languages: allow manipulation and retrieval of data from a database.
Relational QLs are simple & powerful.
  Strong formal foundation based on logic.
  Allows for much optimization.
Query languages != programming languages!
  Not intended for complex calculations.
  Support easy, efficient access to large data sets.

Preliminaries

A query is applied to relation instances, and the result of a query is also a relation instance.
  Schemas of input & result relations are fixed (determined by relations & query language constructs).
  A query is specified against schemas (regardless of instances).
Attributes may be referenced either by name or by position (two notation systems).

Relational Algebra

Basic Operations:
  Selection (σ): choose a subset of rows.
  Projection (π): choose a subset of columns.
  Cross Product (×): combine two tables.
  Union (∪): unique tuples from either table.
  Set difference (−): tuples in R1 not in R2.
  Renaming (ρ): change names of tables & columns.
Additional Operations (for convenience):
  Intersection, joins (very useful), division, outer joins, aggregate functions, etc.

Selection

Format: σ_{selection-condition}(R). Choose tuples that satisfy the selection condition.
The result has the same schema as the input.

Example: σ_{Major = ‘CS’}(Students)

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

Result
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS

The selection condition is a Boolean expression built from =, ≠, <, ≤, >, ≥, and, or, not.

Projection

Format: π_{attribute-list}(R). Retain only those columns in the attribute-list.
The result must eliminate duplicates.

Example: π_{Major}(Students)

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

Result
Major
CS
Math

Operations can be composed.
π_{Name, GPA}(σ_{Major = ‘CS’}(Students))

Cross Product

Format: R1 × R2. Each row of R1 is paired with each row of R2.
Result schema consists of all attributes of R1 followed by all attributes of R2.
Problem: Columns may have identical names. Use notation R.A, or renaming attributes.
Only some rows make sense. Often need a selection to follow.

Example of Cross Product

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

Awards
SID  Amount  Year
456  1500    1998
678  3000    2000

Students × Awards
SID  Name  GPA  Major  SID  Amount  Year
456  John  3.4  CS     456  1500    1998
456  John  3.4  CS     678  3000    2000
457  Carl  3.2  CS     456  1500    1998
457  Carl  3.2  CS     678  3000    2000
678  Ken   3.5  Math   456  1500    1998
678  Ken   3.5  Math   678  3000    2000

Renaming

Format: ρ_{S}(R) or ρ_{S(A1, A2, …)}(R): change the name of relation R, and names of attributes of R.

Example: ρ_{CS_Students}(σ_{Major = ‘CS’}(Students))

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

CS_Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS

Union, Intersection, Set Difference

Format: R1 ∪ R2 (R1 ∩ R2, R1 − R2).
Return all tuples that belong to either R1 or R2 (to both R1 and R2; to R1 but not to R2).
Requirement: R1 and R2 are union compatible.
  With same number of attributes.
  Corresponding attributes have same domains.
Schema of result is identical to that of R1. May need renaming.
Duplicates are eliminated.

Examples of Set Operations

TAs
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

RAs
SID  Name  GPA   Major
456  John  3.4   CS
223  Bob   2.95  Ed

TAs ∪ RAs
SID  Name  GPA   Major
456  John  3.4   CS
457  Carl  3.2   CS
678  Ken   3.5   Math
223  Bob   2.95  Ed

TAs ∩ RAs
SID  Name  GPA  Major
456  John  3.4  CS

TAs − RAs
SID  Name  GPA  Major
457  Carl  3.2  CS
678  Ken   3.5  Math

Joins

Theta Join. Format: R1 ⋈_{join-condition} R2.
Returns tuples in σ_{join-condition}(R1 × R2).
Equijoin. Same as Theta Join except the join-condition contains only equalities.
Natural Join. Same as Equijoin except that equality conditions are on common attributes and duplicate columns are eliminated.

Examples of Joins

Students
SID  Name  GPA  Age  Prof
456  John  3.4  29   123
457  Carl  3.2  35   123
678  Ken   3.5  25   154

Profs
PID  Pname  Age  Dept
123  John   35   CS
154  Scott  28   Math

Theta Join. Students ⋈_{Students.Age <= Profs.Age} Profs

Result
SID  Name  GPA  Age  Prof  PID  Pname  Age  Dept
456  John  3.4  29   123   123  John   35   CS
457  Carl  3.2  35   123   123  John   35   CS
678  Ken   3.5  25   154   123  John   35   CS
678  Ken   3.5  25   154   154  Scott  28   Math

Examples of Joins (cont.)

Equijoin. Students ⋈_{Prof = PID AND Name = Pname} Profs

Result
SID  Name  GPA  Age  Prof  PID  Pname  Age  Dept
456  John  3.4  29   123   123  John   35   CS

Natural Join. Students ⋈ Profs

Result
SID  Name  GPA  Age  Prof  PID  Pname  Dept
457  Carl  3.2  35   123   123  John   CS

Some Questions About Joins *

What is the result of R1 ⋈ R2 if they do not have a common attribute?
What is the result of R ⋈ R?
Consider relations
  Students(SSN, Name, GPA, Major, Age, PSSN)
  Profs(PSSN, Name, Office, Age, Dept)
Which type of join should be used to find pairs of names of students and their advisors?
Can a natural join be used? How?

Division

Format: R1 ÷ R2. Restriction: Every attribute in R2 is in R1.
For R1(A1, ..., An, B1, ..., Bm) ÷ R2(B1, ..., Bm) and T = π_{A1, ..., An}(R1), return the subset of T, say W, such that every tuple in W × R2 is in R1.
W is the largest subset of T such that (W × R2) ⊆ R1.

An Example of Division

Takes
SID  CNO
456  CS210
456  CS321
456  CS135
457  CS210
457  CS321
532  CS210
678  CS321

CS_Req
CNO
CS210
CS321

Takes ÷ CS_Req

Result
SID
456
457

What is the meaning of this expression?
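The division example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: relations are modeled as sets of tuples, the attributes of R2 are assumed to be the trailing columns of R1, and the function names `divide` and `divide_via_basic_ops` are hypothetical helpers, not part of any library. The second function uses the fact that division can be expressed with projection, cross product, and set difference alone.

```python
# Relations as sets of tuples; the data is the Takes / CS_Req example above.
takes = {
    (456, "CS210"), (456, "CS321"), (456, "CS135"),
    (457, "CS210"), (457, "CS321"),
    (532, "CS210"),
    (678, "CS321"),
}
cs_req = {("CS210",), ("CS321",)}


def divide(r1, r2):
    """Return R1 ÷ R2, assuming R2's attributes are the trailing columns of R1."""
    m = len(next(iter(r2)))        # number of B attributes (assumes R2 is non-empty)
    t = {row[:-m] for row in r1}   # T = projection of R1 onto the A attributes
    # Keep a candidate w only if pairing it with every R2 tuple gives a tuple of R1.
    return {w for w in t if all(w + s in r1 for s in r2)}


def divide_via_basic_ops(r1, r2):
    """Same result using only projection, cross product, and set difference."""
    m = len(next(iter(r2)))
    t = {row[:-m] for row in r1}
    all_pairs = {w + s for w in t for s in r2}     # T × R2
    missing = all_pairs - r1                       # pairs that R1 lacks
    disqualified = {row[:-m] for row in missing}   # candidates missing some R2 tuple
    return t - disqualified


print(sorted(divide(takes, cs_req)))  # [(456,), (457,)]
```

Read this way, Takes ÷ CS_Req returns the students who have taken every course listed in CS_Req: 532 and 678 each lack one required course, so they drop out.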
Grouping & Aggregate Functions

Format: group_attributes F aggregate_functions (r)
  Partition a relation into groups.
  Apply aggregate function to each group.
  Output grouping and aggregation values, one tuple per group.

Ex: Major F count(SID), avg(GPA) (Students)

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

Result
Major  count(SID)  avg(GPA)
CS     2           3.3
Math   1           3.5

Dangling Tuples in Join

Usually, only a subset of tuples of each relation will actually participate in a join.
Tuples of a relation not participating in a join are dangling tuples.
How do we keep dangling tuples in the result of a join? (Why do we want to do that?)
Use null values to indicate a “no-join” situation.

Outer Joins

Left Outer Join. Format: R1 ⟕ R2. Similar to a natural join but keep all dangling tuples of R1.
Right Outer Join. Format: R1 ⟖ R2. Similar to a natural join but keep all dangling tuples of R2.
(Full) Outer Join. Format: R1 ⟗ R2. Similar to a natural join but keep all dangling tuples of both R1 & R2.
Can also have Theta Outer Joins.

Examples of Outer Joins

Students
SID  Name  GPA  Major
456  John  3.4  CS
457  Carl  3.2  CS
678  Ken   3.5  Math

Awards
SID  Amount  Year
456  1500    1998
678  3000    2000

Left Outer Join. Students ⟕ Awards

Result
SID  Name  GPA  Major  Amount  Year
456  John  3.4  CS     1500    1998
457  Carl  3.2  CS     Null    Null
678  Ken   3.5  Math   3000    2000

Relational Algebra Exercises

Find the result of these expressions.
  R ⋈ S
  R ⋈_{R.C=S.C} S
  π_{B,E}((π_{B,C} R) ⋈ (σ_{E<7} S))
  (π_{A,B} R) − ρ_{S(A,B)}(π_{D,C} S)

R
A  B  C  D
1  2  3  4
2  2  5  1
3  4  2  6
4  2  5  3

S
D  C  E
1  2  3
3  4  7
4  5  5
5  2  7

Queries In Relational Algebra

Consider the following database schema:
  Students(SSN, Name, GPA, Age, MajorDept)
  Enrollment(SSN, CourseNo, Grade)
  Courses(CourseNo, Title, DName)
  Departments(DName, Location, Phone)
Two methods:
  Use temporary relations.
  One expression per query.

Queries In Relational Algebra

List student name and course title such that the student has an A in the course and the course is not offered by the student’s major department.
  Find those students who got an A in any course.
  Find the department of the students and the courses.
  Find the final answer.

Summary

Relational model provides simple yet powerful formal query languages.
Relational algebra is procedural and used for internal representation of queries.
Several ways to express a given query. DBMS should choose the most efficient plan.
Any language able to express all relational algebra queries is relationally complete.

Summary (cont.)

Many useful properties:
  σ_{C1}(σ_{C2}(R)) = σ_{C2}(σ_{C1}(R)) = σ_{C1 and C2}(R)
  π_{L1}(π_{L2}(R)) = π_{L1}(R), if L1 ⊆ L2
  R1 × R2 = R2 × R1
  R1 × (R2 × R3) = (R1 × R2) × R3
  R1 ⋈ R2 = R2 ⋈ R1
  R1 ⋈ (R2 ⋈ R3) = (R1 ⋈ R2) ⋈ R3

Look Ahead

Next topic: Translation from ER/EER to relational model.
Read from the textbook: Chapter 14.1 – 14.2
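The equivalences listed in the summary can be spot-checked on the small Students and Awards tables used in the earlier slides. The sketch below is a minimal illustration, not a DBMS implementation: rows are modeled as Python dicts, the helper names (`select`, `project`, `natural_join`, `as_set`) are hypothetical, and the second selection condition (GPA > 3.3) is an assumption introduced only to have two conditions to reorder.

```python
def select(rel, pred):
    """σ: keep the rows that satisfy the predicate."""
    return [row for row in rel if pred(row)]


def project(rel, attrs):
    """π: keep only the named columns and eliminate duplicate rows."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(r1, r2):
    """⋈: pair rows that agree on every commonly named attribute."""
    out = []
    for a in r1:
        for b in r2:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out


def as_set(rel):
    """Canonical form for comparing relations regardless of row or column order."""
    return {frozenset(row.items()) for row in rel}


students = [
    {"SID": 456, "Name": "John", "GPA": 3.4, "Major": "CS"},
    {"SID": 457, "Name": "Carl", "GPA": 3.2, "Major": "CS"},
    {"SID": 678, "Name": "Ken", "GPA": 3.5, "Major": "Math"},
]
awards = [
    {"SID": 456, "Amount": 1500, "Year": 1998},
    {"SID": 678, "Amount": 3000, "Year": 2000},
]


def is_cs(row):
    return row["Major"] == "CS"


def high_gpa(row):
    return row["GPA"] > 3.3  # assumed second condition, for illustration only


# Selections commute and cascade into a single conjunctive selection.
a = select(select(students, high_gpa), is_cs)
b = select(select(students, is_cs), high_gpa)
c = select(students, lambda row: is_cs(row) and high_gpa(row))
assert as_set(a) == as_set(b) == as_set(c)

# An inner projection is absorbed when L1 is a subset of L2.
assert project(project(students, ["Name", "Major"]), ["Major"]) == project(students, ["Major"])

# Natural join is commutative (ignoring column order).
assert as_set(natural_join(students, awards)) == as_set(natural_join(awards, students))

print(project(students, ["Major"]))  # [{'Major': 'CS'}, {'Major': 'Math'}]
```

Checks like these only confirm the identities on one instance. An optimizer relies on them holding for every instance, which is what lets it reorder selections and joins when choosing an efficient plan.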