CS4432: Database Systems II
Query Rewrite

Query Re-Writing
Query in SQL → Query Plan in Algebra (logical) → Other Query Plans in Algebra (logical)
How does the optimizer generate equivalent query plans?

Example
Select B, D
From R, S
Where R.A = 'c' And S.E = 2 And R.C = S.C
[Figure: Plan 1, Plan 2, Plan 3]

Relational Algebra Optimization
• Set of rules to apply, called "transformation rules" or "algebraic laws"
• What are transformation rules?
  – Preserve equivalence of plans
  – That is, they must produce the same answer
• What are good transformations?
  – Reduce query execution costs

Relational Operators (revisited)
• Selection Basics
  – Idempotent
  – Commutative
• Selection Conjunctions
  – Useful when pruning
• Selection Disjunctions
  – Equivalent to UNIONS

Rules: Selection and Binary Operators
• Must push selection to both arguments:
  – σ_C(R ∪ S) = σ_C(R) ∪ σ_C(S)
• Must push to first arg, optional for 2nd:
  – σ_C(R - S) = σ_C(R) - S
  – σ_C(R - S) = σ_C(R) - σ_C(S)
• Push to at least one arg with all attributes mentioned in C:
  – product, natural join, theta join, intersection
  – e.g., σ_C(R × S) = σ_C(R) × S, if R has all the attributes in C

Rules: Natural Join Rewriting
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Can also write as trees, e.g.: [Figure: (R ⋈ S) ⋈ T and R ⋈ (S ⋈ T) drawn as trees]

Rules: Select
Conjunction predicates: σ_{p1 ∧ p2}(R) = σ_{p1}[σ_{p2}(R)]
Disjunction predicates: σ_{p1 ∨ p2}(R) = [σ_{p1}(R)] ∪ [σ_{p2}(R)]

Bags vs. Sets
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
What about union R ∪ S = ?
• Option 1: Sum the occurrences
  R ∪ S = {a,a,b,b,b,b,b,c,c,c,d}
• Option 2: Max of occurrences
  R ∪ S = {a,a,b,b,b,c,c,d}

CS 4432 logical query rewriting - lecture 15

Bags vs. Sets
Which option makes this rule work?
σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
Let us try MAX():
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,c}
Matching
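Whether the disjunction rule survives under each bag-union option (sum or max) can be checked directly. The following is a minimal sketch, assuming Python's `collections.Counter` as the bag representation: its `|` operator keeps the maximum count per element and `+` adds the counts. The helper name `select` is illustrative and does not come from the slides.

```python
from collections import Counter

def select(bag, pred):
    """Bag selection: keep every occurrence of each value that satisfies pred."""
    return Counter({v: n for v, n in bag.items() if pred(v)})

R = Counter("aabbbc")            # the bag {a,a,b,b,b,c}
p1 = lambda v: v in "ab"         # p1 is satisfied by a and b
p2 = lambda v: v in "bc"         # p2 is satisfied by b and c

# Left-hand side of the rule: one selection with the disjunction.
lhs = select(R, lambda v: p1(v) or p2(v))

# Right-hand side under the two candidate definitions of bag union.
union_max = select(R, p1) | select(R, p2)   # max of occurrences
union_sum = select(R, p1) + select(R, p2)   # sum of occurrences

rule_holds_with_max = (lhs == union_max)
rule_holds_with_sum = (lhs == union_sum)

print("MAX union:", sorted(union_max.elements()), rule_holds_with_max)
print("SUM union:", sorted(union_sum.elements()), rule_holds_with_sum)
```

Under the sum definition the element b is counted once per predicate it satisfies, which is why the two sides disagree.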
σ_{p1 ∨ p2}(R) = σ_{p1}(R) ∪ σ_{p2}(R)
Example: R = {a,a,b,b,b,c}; p1 satisfied by a, b; p2 satisfied by b, c
Let us try SUM():
σ_{p1 ∨ p2}(R) = {a,a,b,b,b,c}
σ_{p1}(R) = {a,a,b,b,b}
σ_{p2}(R) = {b,b,b,c}
σ_{p1}(R) ∪ σ_{p2}(R) = {a,a,b,b,b,b,b,b,c}
Not matching

Bag Semantics in DBMSs
• Usually the "SUM" option for bag union is more meaningful
• Many DBMSs implement this semantics
Great care must be taken, as some rules cannot be used for bags!

Rules: Project
Let: X = set of attributes
     Y = set of attributes
     XY = X ∪ Y
π_X(R) = π_X[π_XY(R)]
Note that π_XY(R) = π_X[π_Y(R)] does not hold in general: the right-hand side is defined only when X ⊆ Y, and it then yields π_X(R).

Rules: σ + ⋈ Combined
Let p = predicate with only R attributes
    q = predicate with only S attributes
    m = predicate with both R and S attributes
σ_p(R ⋈ S) = [σ_p(R)] ⋈ S
σ_q(R ⋈ S) = R ⋈ [σ_q(S)]
σ_m(R ⋈ S) = no change (m is a join predicate)
Always a good idea to push selection down

Rules: σ + ⋈ Combined (continued)
With p, q, and m as defined above:
Rule can be derived! [equation not preserved]
What about these ones?? [equations not preserved]
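The push-down rules for selection over a join can be tested on small relations. The sketch below is only an illustration under stated assumptions: relations are lists of dict rows, the sample data for R(A, B, C) and S(C, D, E) is invented, and the helper names are hypothetical. It checks the two basic rules for p and q, plus one commonly stated derived identity, σ_{p ∧ q}(R ⋈ S) = [σ_p(R)] ⋈ [σ_q(S)]; that identity is an assumption here and is not necessarily the one the slides derive.

```python
def natural_join(r, s):
    """Natural join of two lists of dict rows on their shared attribute names."""
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

def select(rel, pred):
    """Selection: keep the rows that satisfy pred."""
    return [t for t in rel if pred(t)]

def same_bag(x, y):
    """Compare two relations as bags, ignoring row order."""
    key = lambda t: sorted(t.items())
    return sorted(x, key=key) == sorted(y, key=key)

# Invented sample data for R(A, B, C) and S(C, D, E).
R = [{"A": "c", "B": 1, "C": 10},
     {"A": "d", "B": 2, "C": 10},
     {"A": "c", "B": 3, "C": 20}]
S = [{"C": 10, "D": "x", "E": 2},
     {"C": 10, "D": "y", "E": 3},
     {"C": 20, "D": "z", "E": 2}]

p = lambda t: t["A"] == "c"   # mentions only R attributes
q = lambda t: t["E"] == 2     # mentions only S attributes

joined = natural_join(R, S)

rule_p = same_bag(select(joined, p), natural_join(select(R, p), S))
rule_q = same_bag(select(joined, q), natural_join(R, select(S, q)))
rule_pq = same_bag(select(joined, lambda t: p(t) and q(t)),
                   natural_join(select(R, p), select(S, q)))

print(rule_p, rule_q, rule_pq)
print("rows joined without push-down:", len(joined))
print("rows joined with push-down:",
      len(natural_join(select(R, p), select(S, q))))
```

A check on one dataset is evidence, not a proof, but it makes the cost argument concrete: the pushed-down plan produces a smaller join result before any further filtering is needed.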
Rules: σ + π combined
Let x = subset of R attributes
    z = attributes in predicate P (subset of R attributes)
π_x[σ_P(R)] = π_x{σ_P[π_xz(R)]}
Must ensure z attributes are projected
Usually not that effective, unless R contains really large attributes that we want to avoid reading

Rules: ⋈ + π combined
Let x = subset of R attributes
    y = subset of S attributes
    z = intersection of R, S attributes (join columns)
π_xy(R ⋈ S) = π_xy{[π_xz(R)] ⋈ [π_yz(S)]}

Sometimes It's Tricky
• Suppose we have relations
  – StarsIn(title, year, starName)
  – Movie(title, year, len, inColor, studioName)
• and a view
  – CREATE VIEW MoviesOf1996 AS SELECT * FROM Movie WHERE year = 1996;
• and the query
  – SELECT starName, studioName FROM MoviesOf1996 NATURAL JOIN StarsIn;

An Improved Logical Query Plan
[Figure: improved logical query plan]

Summary of Query Rewrite
• Transformation rules to create equivalent query plans
• Check textbook for more rules
• Selection push-down is always good
• Projection push-down is sometimes good
Both reduce the size of intermediate results as early as possible
• Pushing selection all the way down enables using indexes
• Order among join relations
  – Affects which one is outer or inner
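The MoviesOf1996 example can be sketched in code. The plan figure is not reproduced in the text, so the following assumes the textbook-style improvement: because year is a join attribute of the natural join, the condition year = 1996 can be applied to StarsIn as well as to Movie. The rows and helper names are invented for illustration.

```python
def natural_join(r, s):
    """Natural join of two lists of dict rows on their shared attribute names."""
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out

def select(rel, pred):
    """Selection: keep the rows that satisfy pred."""
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    """Bag projection, returned as a sorted list of tuples for easy comparison."""
    return sorted(tuple(t[a] for a in attrs) for t in rel)

# Invented sample rows.
movie = [
    {"title": "T1", "year": 1996, "len": 100, "inColor": True, "studioName": "S1"},
    {"title": "T2", "year": 1990, "len": 90, "inColor": True, "studioName": "S2"},
]
stars_in = [
    {"title": "T1", "year": 1996, "starName": "Star A"},
    {"title": "T1", "year": 1996, "starName": "Star B"},
    {"title": "T2", "year": 1990, "starName": "Star C"},
    {"title": "T3", "year": 1985, "starName": "Star A"},
]

is_1996 = lambda t: t["year"] == 1996
out_attrs = ["starName", "studioName"]

# Plan A: expand the view, then join with all of StarsIn.
movies_1996 = select(movie, is_1996)
plan_a = project(natural_join(movies_1996, stars_in), out_attrs)

# Plan B (assumed improvement): also filter StarsIn before the join.
stars_1996 = select(stars_in, is_1996)
plan_b = project(natural_join(movies_1996, stars_1996), out_attrs)

print(plan_a == plan_b)
print("StarsIn rows fed to the join:", len(stars_in), "vs", len(stars_1996))
```

Both plans return the same answer on this data; the second feeds fewer StarsIn rows into the join, which is the "reduce the size as early as possible" point from the summary.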