SQL Query Optimization: Reordering for a General Class of Queries Piyush IBM Santa Goel Teresa Bala Laboratory IBM Santa [email protected] bala@vnet Abstract Partial The strength from their of commercial ability equivalent reordering are no known for a SQL query containing lower groupby a SQL aggregations. to capture the SQL However, outer joins, identities, semantics, these joins, operators it is during with hypergraph outer current few results in query optimizing joins, specified with predicates reordering query outer expressions joins, that optimization and full generated rithms lations full binary outer joins) in the Gupta query et. ex- al. in appear adjacent unary predicate to each other, operators. references by an aggregation However, a column operator, if that known is algo- do not tell us how to reorder the binary reand assign operators to produce the same 1 in which either algorithms are unable containing predicate c generated (~3.b 02 VI .c) by aggregation oper- ator count: joins reference to exhaustively aggregations, outer join column involving complex View more two relations or reference columns generated To the best of our knowledge, aggregations. VI: Select rl .C as a, m.d as b, c = count(m) From by the rl, r2 rl .b 01 rz.b Where Groupby rl .c, rz .d reorder outer Query 1: Select rs.a, rq.b, VI .b From (Select * from VI Left OuterJoin rs. b .92 V1.c), rq Where rq.b = VI .b join predicates and outer join predicates that reference columns generated by aggregations. Bhargava et. al. [BHAR94, BHAR95a] have proposed algorithms to reorder queries containing outer join predicates that reference two or more relations, but their work can not reorder and other for re- that result. For example, let T1, r2, r3 and r4 be four relations such that attributes rl .b and rl .C are in relation rl; attributes r2 .b and r2 .d are in relation T2; has outer than queries (we know) require BHAR95a]. operations an outer join Query BYs, SQL joins to each intervening references GROUP existing outer joins T3; and atattributes r3. u and r3 .b are in relation tribute r4 .b is in relation ~4. Consider the following state-of-the-art for inner [BHAR94, without Motivation The adjacent the binary and Introduction 1.1 (joins, be brought a method that we identify a powerful primitive needed for a dbms. We report our findings of a simple, yet fundamental operator, generalized selection, and demonstrate its power to solve the problem of reordering of SQL queries containing joins, outer joins, and groupby aggregations. 1 operations joins, are sufficient their Algorithms and [GUPT95] and Bhargava et. al. in [BHAR95d] presented algorithms to “push-up/push-down” proj ections, aggregations and selections in such a way that and groupby Using .ibm.com outer pression reordering we propose cent aining While there of the reordering be missed. all Laboratory reordering: ordering like DB2 comes by generating all equivalent some cost may query order operators. joins, and a set of novel to reorder optimizers to generate Consequently, significantly model of binary methods aggregations. query to select an optimal Iyer Teresa where the following: 01,02 G {=, +,~, ~,<,>}. rs On Because of column Permission to make digitahhard copy of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the mpyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To capy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission andlor a fee. c which is generated by aggregation operator count, and no generally known definition of outer joins based on cartesian product, view V1 cannot be merged existing 47 the are unable to reorder the outer and inner joins specified in the query. Specifically, the join specified in view VI SIGMOD ’96 6/96 Montreal, Canada (D 1996 ACM 0-89791 -794-4/9610006 ...$3.50 with the rest of Query 1. Consequently, algorithms in [BHAR95a, BHAR95d] cannot be reordered with other outer and inner joins in Query specified 1- if predicate (I4 .SUPKEY T4 .b = VI .b is highly filtering then it may be beneficial to perform this join first, before performing the aggregation that counts the tuples obtained after combining relations This T1 and Tz. A motivating example the Example three relations is included VZ.QTY next. query, Consider 94AGG, 95 DE- if executed it TAIL, and SUP.DETAIL, where relation 94AGG contains aggregated information about different parts if 94AGG This relation is an index has three and QTY, PARTKEY cent ains unique unique identifiers contains QTY supplied by a supplier. contain four DATE the date tributes the total of SUP.DETAIL SUPKEY, SUP attribute tifier for the SUPDETAIL supplier, attribute relation to relation may generate supplier volume of approved is no longer with suppliers, a reliable the supplier at- attributes in SUPRATING and attribute information about is relatively either supplier is dropping. alternative to the potenial FROM Thus, grouping if there on attributes the reduction of can be used as a good reduction through join. The motivation, a class of queries exam- called “Join by employing queries For example, Tuple (TIS). consider following co-related query in which rl, r2, and r~ de- the note three relations such that rl .a, rl .b, rl .C and rl. ~ rs. because and/or rl .a From (Select the rl .b 6’1 Where From r2 Where r2.c = r-l.. and count(*) From rs Where rz.e = r, .e and rl. ~ = rs. f )) the In order rl count(*) rz. d & to where 01, 62 c {=, #, ~, <, <, >}. In tuple iteration semantics, first, every tuple in T1 is combined with tuples in r2 by applying predicate r2 .C = ~1.c. Next, each selected tuple in relation r2 (along with the tu- ple in rl ) is substituted in predicate r2 .e = r3 .e and rl. ~ = r3. ~ in order to select and count the selected tuples in relation r3. This process is repeated for every tuple in RI and may result in a very inefficient like processing strategy. Ganski [GANS87] and Murlikrishna [MURA92] have proposed an algorithm to unnest the above query and to transform it into the following two queries PARTKEY; VZ.PARTKEY, Vz LeftOuterJoin specially Semantics (Select 95 DETAIL VZ.SUPKEY, V9.95AGGQTY Select it Iteration that Query: plan queries: Join-aggregate in relation VS: Select SUPKEY, PARTKEY, 95AGGQTY = COUNT(*) SUPKEY, ) then 95 DETAIL, a. SUPKEY as SUPKEY, a.QTY as QTY, a. PARTKEY as PARTKEY, b From 94AGG a, SUP~ETAIL = b. SUPKEY and Where a. SUPKEY b. SUPRATING = “ BANKRUPT”; GROUPBY from b. SUPKEY are attributes in relation ~1; r2 .C) r2 .d and r2 .e are attributes in relation r2; ~3.e and r3. ~ are attributes Vz: Select From selected = 95 DETAIL and PARTKEY. “nested-loops” View a better relation ecute the join- aggregate generate this list, the business analyst could define the following views and pose the following query on them: View are This cardinality. “BANKRUPT” through Select list = join. the (a. SUPKEY cardinality Suppose a business analyst is interested in identifying those suppliers that are to be discontinued from the tuples done Aggregate Queries” has been discussed in [GANS87, MURA92] and references therein. To the best of our knowledge, the majority of commercial RDBMS ex- 94AGG. three 94AGG outer reduce few ples and issues with and SUPDETAIL, contains unique iden- detailed Typically, small in comparison remaining of a supplier, contains a supplier. contain RATING, rating ● contains as in relation SUPKEY every contains 95 DETAIL DATE the the that be of a part PARTKEY, and meanings Let relation where of parts, and Also, let relation a translation, have similar very RATING SUPKEY attribute quantity SUPKEY, attributes where attribute QTY, and SUPKEY of suppliers, in entail table may be beneficial to first combine relations 94AGG and SUP-DETAIL with relation 95 DETAIL, before performing the aggregation on relation 95 DETAIL. SUPKEY, where attribute identifiers contains PARTKEY attribute attributes b. SUP and would 95 DETAIL potenial by predicate supplied by different suppliers in the year 1994, relation 95 DETAIL contains detailed information about every transaction recorded so far in the year 1995, and relation SUP_DETAIL contains inforLet us assume that mation about each supplier. 94AGG as written the participate may However, and on can aggregation and = V8.PARTKEY < 2 * V3.95AGGQTY); aggregation before 1.1 = V3.SUPKEY Vz .PARTKEY &.QTY, employ T1 key, relations V3 On 48 outer joins and do not require TIS. and r3. key denote key attributes r2. key rl, r2 and r3, respectively. Let of Query 2: Select into rl key, rl .a, rz. key, From (Select * from LeftOuterJoin r3 Groupby rz .d Query attributes and rl Left OuterJoin rl .a, rz key, tb count(rs 3. Select From V2 are virtual ceptualize rz On rl .C = rz .c), the key) rl .a the r-l .b 61 cozmt(rz.key) each from However, the cannot a~gorithms reorder 2 because then, predicate [BHAR95a, outer complex joins ~Z .e rl .j = ~s ~ – if the cardinality large then it may be beneficial Tz and r3 first, before in Query = p. joins We employ T and sch(p) We assume reference that at least to denote predi- one relation operand relations. If, references exactly references to to as complex we assume that a predicate as a simple more than two if relations Through are re- predtcate; two predzcate. predicates ) to the operator is referred of sch[r the relation We represent rs .e and of relation to combine accessing of relation to con- extensions null then out this m-tolerant2. BHAR95b] specified predicate is a way the a join it it is referred in the left of both identifiers E2 denote respectively. with with paper, and of predicate specified the r2, schema cates lations (row EI and schema specified r-l key Having rl denote rz.b TEMP Groupby it) relations rz. e = r~ .e and rl .f = rs, f I-1 key, Having TEMP rz.b minology ~1 is very relations lier work), which ~1. by in statement, [GUPT95] a Generalized accepts a new groupby SQL employed also Ikojectzon as its argument relation using (which according a relation the the tercites ear- (G P), 7rx,f(y T and produces subscripts X and attributes referenced j, ~(Y) [GUPT95]. In this ized paper, we introduce selectzon ent (GS), schedules example in for queries in query 2, this which if this substantially new join nary operations. that queries have in [BHAR95b, sented contain any we assume are then of the query. 1. Through been out simplified redundant queries (full) outer are stmple the Further, this no aggregate join with we assume predo not edges; that is, ~(Y) rest of the 1.2 provides provide some the 3, we to “break-up” paper basic definition provide is organized definitions. of the association In Some aggregate max, rein, new operator. identities complex provide an outline of the Section 5 provides the In that predicates. In enumeration tains Section Section by Section process. 4, Basic From now of SQL, convey more on, awg(distinct confusion, SQL the The algebraic semantics. pertinent in ing semantics. the SQL (R2,, Vz, Ez) where column 1 We names predicates conjunctive, that is, the form Ap2 A... full/left/right the outer join in the such that where & to @ denotes where DIs- TX(~) has it. (if present). represents FROM the SQL r GROUPBY duplicate-insensitive, If etc. DISTINCT ~ E then GP {rnaz, rein, )}. a eg. GP either statement function ), sum(distinct r2 or con- such a GP and is denoted count (distinct), Whenever r to denote T1 U r2, of two is the rl (RI U R2, E’ = {tl(% relation of W T2, both there 6 and is no ~. RI nR2 = 4, Note, that VI and operations predicate either p join is rows 2A schema or evaluates attribute operator. 49 ~ (R2 - G (RI - and rz relations The is the outer relation where = t’ A R,) U (V2 - predicate of p, to in R2) U (Vl in rl WT2 are padded are not present of U E2). T1 c EI)(t[RIVl] (VA containing compatible V,))(t[A] = NULL)) V2))(t[A] = NULL))}. (3t”e E2)(t[R2V2] = t“ A V rz = union (R, V, El relations VI U V2, E’), (YA and binary rz, are to the names), with rl lieu of captur- (sets attribute expression A pn, intent (r) (distinct), we employ set of definitions = (.??l, VI, El) specified in attention schemas or, equivalently, are p=pl rl two relations R2 denote that our with Let is used is constructed – a complete [BHAR95a], denote R1 and assume We restrict definitions are provided syntax syntax GP with duplzcate-znsensitzve union, and union, algebraic FROM Finally, conclusion. formal as us definitions albeit this X, count(Y) SELECT JX,f(y), rl 1.2 X the aggregation functions count the we enable in TX(r) SELECT that nx,co~n~(y) a duplicate-insensitive is termed 2, we Section SELECT specified specifies GP GP x. GAL192b]. as follows. instance, M@TLkntly, Note SELECT represents The Or, function example, statement they [GAL192a, For For X statement. 2. Subscript bi- the statements x TINCT method so that SQL GR.OUPBY may of those specifies statement. join this paper, by BHAR95c] T Conse- predicates this X groupby represents schedules enumeration conjunctive the 3. For first. selective 1. Subscript differ- 2 and joined complete only General- us to generate rl the cost contain to Query allows facilitates that operator, us to generate similar is very reduce operator queries enables T4 and relations quently, a new that p is over null FALSE R either [BHAR94]. a set tuples with in relation of in-tolerant for - real in that nulls rl r2. sch(p), called R Q sch(p) attributes attributes have for attributes or in relation NULL values in the if any p The antz join, rl relation (Rl, sion E’ is a set tion T1 (with jozn, rl ~ VI, E’), r2, T2, unmatched to predicate relations of (RI RZ, VI V2, E’), where ?’I and p is a predicate containing respect -% of relations where rl and extension The rz E’ in is the is the outeT union of re- relation tributes relatzon and The defined in T2 is the & 2 Generalized this the T1 is null Tz, . . ..Tn t~ons ~i, & re- similarly be in the 7’2 and with nulls. In we (GS’) define which set T2 & reordering of queries predicates that containing are For on definition relations generated then expression (TI 2.1 with to Query pl,s ~ (T1.f = SiIICe (Tl ‘$ Tz) ““~p’” T3 # T2 and (TI ‘~ T2) ““~p’” T3 # ~ (TI (r2.e ‘~ = T3) “’2-%’2’3 outer joins executed follows with are from ‘% the fact that cannot join operator. similar queries up complex predicates. tion of other plans and ~~,,,[7’~~Z](7’~ Intuitively, where retains In ~1 that padding This such (T2 ‘%$ Consider the attributes: a blclf those tuples addition, are it not tuples general, also by may specify and ‘$ be Consider spending defined applied for this ation m el of this 2 in The ‘Y Section following expression for p@2,3 T2) 1.1 with table the r3 provides above corre- predicates as the evalu- T3 which tables: blclflcldlel a elf generaT3) ‘~ T2) those in attributes than one = (Ri, expression following (T1 ‘~ T2) ‘~ table: r and relation T2 : in T – T1. relation operator, U, Ei)l to the predicate p, appropriately selection consider evaluates selection satisfy tuples predicate Now, a; [Tl] (T), p to relation T that more ri (T1 k of the to as for- Notice null and shown query next. {R, V, E), Query earlier. 1 < i < n, applying 50 that values in table r = relations, ‘2:EEEI lc21f2\ expression to us to break- conventional nulk+ for in a generalized defined the retains in ?’1 with we be preserved Let following Em specified operator predicate in relation selected by T3)). to – it, applies a~(T) is to is obtained 1 J 7’3: This facilitates selection C T, is similar T1 up enables operator T2) that ~. their full only the definition that as U~,,3[7’1T2]((T1 generalized can is specified. Reordering o’, ‘~ note re- and and predicates motivates r3) If no relation relation only real ‘$ Also, relation. resultant (GS), are specified [T1T2]((T1 p to relation 1.1, Ts.f) outer query it be broken (GS), ‘~ this complex join operator 2 in Section joins, which outer other mally in some other selectton In then manner generalized p. if only T,)), employed in the any with and (TZ where selection that preserved. el P1, ay,3 (TI r3.e). a;,,, is ‘l:m ‘$ T3 corresponding P2,3 rela- (R, V, E’), generalized a base the predicate I by ~ T2.c), of T1T2, is not T1T2 be preserved and T2) ‘1’3~p”3 = (GS), ~R,V,(~,(~))} - are preserved relation, relation TI, where (T1.c R, then: preserves relation — — {~R,v.(T) la21b1 pl,z ~ complete predicates columns consider that that & generalized perform complex specified example, one applying operator us to is the i # j, selectzon r LkJl<,<n above Example a novel n, - eg. in expression only here enables < ~p(T) those lations selection section gene?’a~tzed such R, by: = the only outer extension ‘r~ & padded E’ and full i in at- relation TZ, the TZ, the The relations) = ~, where ]( T ) , 0 f relation 1 < is given p~esemed supplying can supplying T1 and TI for the null T2, Similarly, appropriately aggregations. nulls the T1 ~ relation. of relations tuples selection Join, of relations TZ, union with In r2 is called outeT which with T1 is called Relation relation ?’tght preserved TI is the & r2 are padded sch(7-2 ). lation. Join, rl in base predicate K (lVj 2.1 o*[Tl, E’ from = # and Definition relation necessarily a null-intolerant Ri flRj rela- lefl (not p denote exten- tuples p). be relations TZ is the and the second in attributes T1 contains null the generalized in table T2 contains e and f whereas tuple the second values selection in these operator attributes. a ~,,,[rlrz] nontuple By at the top, the wecompensate correct result P1.3$P2,3 T3, as given 7’2 ) By for appropriately selectlon full operators outer join on the difference to table in se~ecting a generalized selection this corresponding generate ............. (~1 ‘$ T’l. the preserved operator, the product relatlons inner, can be defined cartesian and expression m outer and as a generalized as follows: H = ({T1, T2, T3, T4$T5}, Figure 3 Deferred application of and of predicates moving top. up E reordering queries referencing aggregations This that push-up and contain columns associated predicates to require breaking of a complex in following predicate, may as shown the and Consider the following expression: (Vl, r; ~ rl and pression T cannot In order to reorder gation r; in predicate to the ~ 772. Let pl,3 and be reordered and break-up A p2,3. Since, and in the base relations, . presslon [BHAR95a, the rect the reorderable GAL192a, modified result top GAL192b]. by applying – eg. the can in this the even if a query reference the aggregated complex does not In order hypergraph model presented association trees. hypergraph query, and which correspond full outer) join the sets of relations 3.1 pair where A in its 2V). For we call VI V1 and V2 as simply specified we an outer we say outer of both join with edge In the a hyperedge operation is in the is bz-diTected operation inner and an nodes. say a hyperedge join VI e as simply joins always it if in the query. Note, generate un- edges. Example 3.2 the f& = Tl ‘$ where Consider the following query Ti = (7’2‘2’4s’2’5((TA ‘h’7’5) paths3 this p1,3 at (Ri, Vi, Ei), is a predicate Pij cor- reordering .=count(?.,)((n ‘~ that a complete this, we employ earlier in [BHAR95a] the set of nodes involving two predicates to generate relations hypergraph cardinalities hyperedge represents a full predicates Figure to break-up represents operation Definition (V, E), to -+ VI, V2 6 2V, hypergraphs, Further, directed in it is desirable to show Intuitively, a hyperedge is a mapping Qq: T3)) ‘h’ no presented each contain in order set of reordering. E 1? : 2V 1 shows between nodes T1 h2, the we break ?’lhl (even T2h2T5, and predicate 1 < though < Note 5. that there are 3 ?’~hl?’zhz?’4h4T5 Note, are and i, j query. and T5). hypernodes complex for TJ, for this has no cycles — r1h1r3h3T5, hyperedge 1 < i < 5, is a relation T, aid the hypergraph hypergraph — for between for and {T2} directed {T4, p2,4 A p2,5 then If T5}. we obtain section. columns, predicate that (i.e., where the to hypernodes if it new Furtherl V2), refer dmected ex- r3) predicate m~, , [TlT21(7h/3,3,:.:,,1 712 )‘~ T3)),as shown can modified generate remaining such of V V2 ) and of query aggrega- pl,3 T2) ‘% For we 3.2 count. by predicate by the algorithms expression, Example predicate c is generated The Ex- the aggre- complex be applied with the outer join. . 1s flv3T3T:T; ,c=countfT1) ((Tl ‘~ is completely of column the tion longer be ref- p2,3. r, we can move pl,3 not c only in predicate due to aggregation expression top column not (Vl, context represents where hyperedges, subsets 1, we query. erenced of e = If e = and 3.1 for V2 hypemodes. V2 are example. Example set non-empty hyperedge requires the is the on aggregations aggregated of aggregation 1: Hypergraph complex predicates The h2, h3, h4}) {hi, referenced an (inner, hypernodes, H V is a non-empty set of the in the or between to be of nodes and intend Q; = = (T I ‘~ (T1 ‘% to show that and T2) T2 ) ‘% ‘+ ((T4 expression ( (T4 ‘h’ ‘M T5) T5) ‘h’ ‘M T3)), Q4 is equivalent to aj,,4[T1T2](Q~). the and as follows. is defined T3)) We Q: C7J2,4[T1r2](Qj) outer, a predicate expressions the and A query hypergraph senting predicate information ence regarding more than broken up into erators and For a given ‘Please is a useful connectivity, refer two outer identities query, to join relations smaller abstraction as well [BHAR95a, we can construct [BHAR95a] for predicates and, components the for as the that therefore, using precise be op- GAL192b]. hypergraph definition. refer- cannot the known GAL192a, its repre- semantic and then determine satisfy the lowing. reordering the in which (TI .r,).((~~.~s).m)) is (~~ .((~z.7’z).(~~.~s))) definition should and tree in [BHAR95a] be combined ation trees for tree is an association tree requires only after Q; and association for to for Q41 generate Q~, and tree for Q?. process of establish relations r4 we consider complex definiassoci- association trees the Then, equivalence trees and the next nodes in optimal be is to expression We trees. contain any step order also as a mean at generalize containing of cent ain selected. association that then order do not operators of the expression trees the can assigning predicate, expression internal cost the they trees, the trees. minimal ~2 The following by permitting to the relation that determine association expression the only operations; operators with use Note, Given assign for trees different operators. – tree combining Q~ as valid fed- combined association association an T54 – eg. (~1 .7’2). ((r4. r5). r3)). allows more association trees tion are The execution that in the of a hypergraph relations is an ))) operations as defined an association order (7’1 .((T2.T5).(T4.T3 the of the y criterion, Intuitively, specifies eg. those connectivity most our number to First, one results of to complex predicates. for Q4. 3.1 Definition 3.2 we define an For hypergraph a query assoczatzon T for tree H H = (V, E), as a binary Splitting Initially, tree most complex assume one that complex of transforming in which. expression 1. leaves(T) than = V, and no relation appears in either more the one leap. or 2. for any subtree 3. for any subtree the set T, T, H of \kaves(T,)6 at the root. with the of TI of all with leaves used TL and leaves(~) and VI ~ (27 .Tr ) of T, hyperedges hyperedges subtrees = in of T,, E and let in subtree Tr. that E;, Ts in Then connect Ve = (VI, and V2 ~ leaves, leaves (T. ); and all of the with the complex child into complex child predicate or grand or an equivalent of the where of root then the the complex 1< conjunctive child at the root, can be used to break-up the appears grand appears child at consists predicate i <4, predicate ri and rj, be four between then7: the n P;,2AP; ,2 + T2 r~ P:,2M,2 ++ 2 = a~: ~[rl](rl ‘S following (1) r2) VI ~ or Vz ~ Jeaves(Zl) either or -pi. + denote relations to ~ombine Vz) E ET,, expression the contains approach !. leaves denote order expression Our Let I-, = (RA, Vi, E,), relations, ET, denotes let root identities predicate. T, given which If the following is connected. a given predicate. the in predicates 2 r2 = uj: ~[rl, r2](rl ‘S 7-2) (2) is true: egET, ● o 3(v; , vj’ ) PI,2 @ 7-2) “’3S2’3 (rl ;or E ET, s~ch that either V1 s the above definition r3 = P1,2 @ r2) rrj1,3[rlr2]((rl V; and ‘Y (3) r3) V2~V20rVl~V2andVz~V1. The key the difference definition in that a hyperedge only the subsets hyperedge) are {r,, rz r5. tion trees for query for Q4 in can That Item be is, for Q; – can above and some ((rl.rz). definition Q! of as the rI ‘~ 2.3 6 As and relations been in SQL in uj: = relation 774 ,[rl](rl occurring renamed to more have than unique association trees association trees ((rl.rz).((rA.(7’s. ?’s)), rl ‘& a rl ‘% query we assume expression Vz) G Ev(3(V~, V,’) (r2 r2r3](rl ‘N (r2 ‘v r3)) (6) P:,3AP:,3 +r3)= , [rl, ((r2 LTji Jrl, 6 E rs)= VI ~ v~Av2 n.](rl ‘M (r2 ‘2 7-s)) (7) P;,3AP; ,3 rg)p~rq)= W have names. usc H l~=aveS(T,) h denote the induced sub-bpewaph where E’ = (leave. (T. ),E’), {(VI, VZ)IVI ~ lea~es(T,) A Vz ~ )A((Vl, (5) 7-3)) that 6 We leaves(T. d,3f%3 M (r2 UJ: Jrl, n ‘X in (7-2 ‘i’ etc. literature, once ‘Y associa- valid optimization (4) rs) two hz [BHAR95a]. other r2)’& P;, SAP; ,3 M 7-9) = (r2 u;; 4Definition ‘N so that allows (r~.((rz.r~).(r~.rs))), rs = uj1,3[rlr2)r3]((rl the valid ((~A.rs).rs)), (r~.((rz.r~).(rs.rs))), the hyperedge either “’34?2’3 that to be broken-up with P1,2 @ r2) states way before example, Q4 which such (rI and (corresponding combined the 3, in be combinedj For query Q4 are: is broken-up need combined. r~}) relation be of hypernodes subsets or [BHAR95a] can original ({rz}, between ~ vi))}. 52 r4](rl ‘% ((r2 ‘k3 rs) ‘2 r4)) (8) where are o E {M, +, not associat T2) ‘h3 an operator For ives and r3), that predicate, then the the the ~3)) root. rl (rz grand the the root. no By expression ‘x complex ?’,) operator expression appears Thus either in into ‘w 1 A p~edicate p can query Q contaznzng be transformed complex or a child into pTedicate or a grand chzld u ((b % applying can break-up Identity the 3, the complex ((?’4 generate Q; a~,,. [rlTz](Q2). pression ‘h’ child The of above = ~j,,,[rl, expression Q~ (Q~) can Q4 W applying u;,, generate Q; different and Q; For nition be made [7’17’2] to expression Q; is identical G&,, (Qi,4) tity 3. - (8)), we ments, and then be [T1T2]) their assigning plex ‘$ broken-up (Q~) and to ex- at assign ilar operators to the trees Q4 according operators refer apply unused consists to the internal method please operator, approach to Section generalized ing the preserved [BHAR95a] with the select ion. relations for (or, This steps (for ); and novel out using the MGOJ operator T2} ((TZ ‘% the tree approach. top ~ In of expression to obtain 7’3))) ~ ((r2 ~ b T4) r3))) to the original this equivalence, we GAL192a, GAL192b] a~, , , [rl, r2](Q4) The expression equivalence Q4 follows query query apply the to trans- where Q~,4. tree and (h) presz requires in similar that for of from Iden- query presl(h) = VI and = (Vi, does and = (h) hi-directed into (~, E an as- sets presl Let H Hz rj predi- is also preserved as follows. pTesz(h) h, then (h), contain Q‘ hypergraph E1) argua com- edge not of query Q, and h disconnect HI above a~, [~i c presl Q’ tree are determined ponents the Q contains a hi-directed to query query to if query A pz with the association sociation For and – a) those sim- to detail, b) expressions predicate the conf?ict belongs two com- E2)’, then = V2. hl then to concepts to conf(ho cannot 9 From selection, be [BHAR95a] described 53 1 in hypergraph expressions in of closest [BHAR95a]. the employ edge ho, the [BHAR95a], can into conj?ictzng operator be of the inner, (hi) connected reorder push by the (5)-(8)), outeT joins if hyperedge hl corresponding operator (full) set of closest every two not Identities Intuitively, that H we do (eg. a descendant trees Lemma disconnects which root ), then only 10 . For a join det ermin- in the set [BHAR95a]. in expression compensation the generalized pl (h)] (Q’ ), where 10 These 8 With ‘~ association our h4](r5 original presz interior by a method in [BHAR95a] 4 in predicates 6 H} to Defi- to the of two nodes presented for of at TZ]((TI the Q is equivalent we require our h hz is {Tl, (T1 ~4](7’5 expression be shown predicate complex general, a) to query hyperedge the root. We for expressions nodes. In r3))) the arguments it can cat e pl, and to expression association for by (rl equivalent expressions trees example, u~,,~ [rlrz] equivalent (~~,, generating association 3.2) can T3)) expression equivalent by equivalently, ‘h’ any ‘$/ step is equivalent form the queTy in expression expressions Thus, hyperedge T4)~Go~[~lT2, order to show in [BHAR95a, either with in Q4 ~ In identities Q; complex ((1) predicate. T5) its preserved hyperedge: ... ~ expression ikfGOJ[~lr2, complex szngle identity predicate equivalent for of the Toot. appropriate complex H, of the r = contains Formally, the an equivalent p2,.l~p2,5 Tz) to the it, as Following Now, set for generate a~,,g [? ’1, TZ] is applied b), (TI expression grand p appears hyperedge by is a path T2, P4,5](T5 complex the a join left” constructIntuitively, expression: appears the or step an employing which child the there We results we conclude: Lemma in which with by joins. --$ in hypergraph preserved Q4. (T1 .((~z.rq).(Ts.T’3,))) ((r. the given in outer preserved “to / ‘4)~Go~[Tl ‘2’442’5 by {T example, in query that predicate case, = computed outer are relations predicate, Q4 containing this For of (full) that hyperedge p?’es(h) example, operator ancestor. the and set relations set are the are sets of (full) one complex complex (T I root root relations pTeserved a directed the the For only ancestor this we transform equivalent root. the identity. of preserved the those in which at of 3.2 contains In ing complex expression either transform push The with transforming child of the in [BHAR95a], predicate a single of appears which (7’1 ‘3 is specified containing is next. a preserved an equivalent can # as described root. consists or identities On the other hand, in expressions such P1,3:P1,3 r3), due to the lack of associativity, to an root, the there in not predicate predicate appropriate we at the Q ?3) expression ‘~ the ‘!% application equivalent into (rz application [BHAR95a], results T1 ‘x into and requires we can (eg. Q4 in Example predicate, ‘g four complex child apply r~) last predicate the before the approach complex expression Note, expression our with +}. is not expression the or the a general given +, for outer ho joins conj%cting directed hyperedge components. employing MGO J outer ]oms, {hl denoted Rk%R, there M.~Ri~RJ can two in the Definition outer a set H these The hypergraph by Conf(ho Next, = all outer into join conj?act of join The edges as follows: T.)) . . . if ho is hi-directed 4 Note to Rk +$ Rl {h/R&2j~~ if ho is undirected {h} ~ ~ if ho is undirected is a join addition, preserved or a left/right we set: slightly for following and outer modify in H] ccoj(ho into the following p?’e$h, (h) or preserved By identity /$~H By pushing ~ and definition directed Theorem hl }, p~es(h), and ● ● Let the pre$cute Q’ that main a single (V, E). IV2 I >1, such the for result 1 Let the hypergrnph contamzng complex T of be the from for then edge - If conf(h) complex applying for (Vl, assoczatzon pl tree @ Q’. query P1 A PZ coTTespondzng V2), whe~e tree fOT for to = {hi, /V1 I > 1 or hypergraph assoctatzon Similar equivalent H; expression T ated tTee pres2(h)](Q’). In Q = u~, ~res(h)](Q’). then: — If conf(h) = {hi, then Q = Ujl bres~(h),... preserved from the sets and original the either root equivalent last expression expressions: in of its dependent be generated trees that from are predicate gener- PI or P2 or Q:. which one of some predicate(s)), ing can complex Q; application independent example, ‘~ r4)), of complex complex then: ,pTes~(h~)](Q’). conflict above expression to either expressions in which Q = u~l O(Q’). . . . . hn}, expressions Note, the (r3 = #, the equivalent the to new complex other our approach predicate first, predicate predicate(s) consists followed by re(de- of breaking the break- predicate(s). p?’es(h)](Q’) then: If conf(h) Q;. by breaking-up For e If o =W we generate example, following are equivalent ~., pr’es~(hn), (3), be Then: . . . . h~}, For the p4,5 A P4,6 predicate identity the quires = @, then If conf(h) the which Q = Q1 “$’ predicate Q = a~l ~resl(h), Q = ~&b~es~(h), once in root: section. pendent Note, expressions at the hl then: [~ =-+ only application (4): generates of this hyperedge expression predzcate If - the predicates P1; and viceQ5 can be transformed the ~ complex be h = be an If@ =+ – two PZ = p4,5 A p4, G. hyperedge by h away includes we state = contains ‘~ = {h} expressions. H (Tz ) = @ ccoj(ho) nopathr~... = { Q2 which ?’,)) does not require appear broken-up pl,2&p1,3 (rl equivalent predicates to = pI, Z A pl,3 and Pi man- predicates. to be equivalent. = ‘&f consists set: {rl Finally, (T5 PI predicate complex join. the a hi-directed h, the set of relations is the and “’5~4’e approach correspond Q5 our any in a top-down be shown consider how containing the complex that can explain U co:j(h) [ In is a path trees predicates that Our expressions of predicate P2, before applying versa. Therefore, expression ~Rk&RlisapathinH) . . . if ho is directed {h~R%%l?j~ ~ (T4 we expressions breaking-up example, ‘~ to predicates. predicates complex examples, given expressions For set for a hyper- the recursively complex = where of complex ner and exactly of be extended of traversing hyperedge. ), is defined means can number join join by results Note, conflicting then conflicting edge,ho,denoted Ccoj(ho) H’ disconnects closest 3.3 set if removing hypergraphs, same conj(ho) closest Also, , hn from ho, hl, have one Ccoj(ho). connected is the isapathin~} be at most hyperedge edges Ccoj(ho), sets are computed hypergraph. .54 consider where complex = F’l = pl,z predicate predicate predicate Q6 P2. PI: T1 “’2AP”4 Ap~,4 and P1 requires In this case, (r2 ‘2’3$P2’4 P2 = pz,~ Apz,Q, the application we first break and then break complex predicate dependencies P2: isting ordering and are generated, enumeration and employ view, join gated to expression up either trees complex one of the above that are predicate PI generated by or P2 are equivalent columns. may to and benefit data reorder than operator, point queries two join of outer that contain relations or aggre- operation is used other applications, re- propose generalized hierarchies, to applications “object” extended our we novel the now the outer “object” from to more child hierarchical cations breaking- Since parent this, powerful to ex- possible an implementation is similar We can all For very From referencing traverse like Note, yet operator operator. modifications generate operators. selection. this predicates assign a simple generalized slight algorithms data relational dbms applisystems findings. six expressions. References 4 Generation In this ation section, process imum we Our then the the root join lent more expression hypergraph. Then, combined specified satisfied. to technique found cost with in of MGOJ only one in, presents Note that selection it operator operator in the dy- [BHAR95d] [BHAR95a] G., Algorithm Bhargava, [GAL192a] [GAL192b] the cost of similar to ployed for taining tions. exhaustively joins, (full) We employ to capture all computed only reordering outer the the joins} query inter once paper for that of SQL and or GOJ op- hypergraph operator dependencies the query. given queries P. and joins,” Joins 87-99, 1995. Iyer, B., “No of and “Simpli- pp. 63-75, CASCON, P. and Iyer, joins To appear and B., “Efficient aggregate func- of Data in proceedings Engi- 1996. Galindo-Legarial C., and Rosenthal, a conventional [GUPT95] C. A., queries,” Ph.D. Science, Harvard of outer of Applied Ganski, 55 “Algebraic optimiza- dissertation, University, SQL “Outer joins pp. 348-358, R. A., and Wong, H. K. T., Queries as dis- 1994. “Optimiza- Revisited”, Proc. pp. 23-33, May 1987. A., proach 1995. C. A., SIGMOD, of nested Gupta, of 1992. Galindo-Legaria, that are join “How to handle proceedings tion “Generalized these join,” Galindo-Legaria, SIGMOD, con- outer A., optimizer pp. 402-409, 1992. Engineering, in order Once pp. with of outer tion aggrega- model B., Iyer, Enumeration G., Goel, Cambridge, can be em- Groupby and the Bhargava, Dept. [GANS87] in this P. for processing Data [GAL194] are presented queries 1995, pp. Queries Goel, one- and two-sided Conclusion results “Hyper- join SIGMOD, CASCON, G., to extend in [GAL192a]. New SQL of outer junctions,” 5 Goel, in Joins,” neering, of such The Bhargava, tions”, enumeration is very B., op- considers operator. selection Iyer, of outer 1995. assignments the so that and outer TR03. 567, July P. and predicates,” Regression fication of aggrega- of operator be extended [BHAR95C] enumeration the enumeration Details are details Goel, reordenngs complex Outer condi- trees say, G., Projections are RDBMS programming [BHAR95b] leaf subtrees of existing Report “Reorder- to eg. for as- the Techwcal B., joins 304-315. the given if all involving based with trees. create two included GAL192al generalized is to Bhargava, graph equiva- of association is easily joins, to possible up from step, tree, dynamic generalized cost erator step approach [BHAR95b]. has of the of the first [BHAR95a] pushed section, algorithm bottom and also explains along technique the check the all P. and Iyer, queries 1994. columns, and IBM joins,” pred- the association definition [BHAR95a, extensions are in the This timizers. tions generate a larger programming relation in the last subsequent to obtain a column or the jom G., Goel, a) presented an enumeration The - Bhargava, ing of complex min- method is broken-up by generating at each the steps generate base presented trees, constructed trees. namic two b) then trees query tions by the [BHAR94] enumer- with of two predicate(s) than It is easy to envision sociation root in join and schedule consists predicate(s) 3,1; of the of aggregations by the method Example schedule outline the to the is referenced references an selects if any icate optimal approach aggregations in [BHAR95b]; that the present that cost. push of Harinarayan, Projections: to aggregation”, V. and A Quass, powerful To appear in D., ap- VLDB, [MURA92] Muralilcrishna, rithms couver, British Pirahesh, VLDB H., Hellerstein, [ROSE90] Ca., June graphs, reorderable Algo- Queries”, Pro- pp 91-102, Van- Canada, 1992. J. M. and Hasan, rewrite SIGMOD, pp. W., optimiza39-48, San 1992. A. Rosenthal, “Query Con., based query in Starburst,” Diego, Unnesting SQL Columbia, “Extensible/rule tion “Improved Aggregate of 18th ceedings [PIRA92] M., for Join and Galindo-Legaria, implementing outer joins,” C., trees, and freelySIGMOD, pp. 291- 299, 1990. [SEL179] Selinger, D. D., P. G., Astrahan, Lorie, path selection ment system,” M. M., Chamberlain, R. A. and Price, in a relational SIGMOD, T. G., database pp. 23-34, “Access manage- 1979. 56
© Copyright 2026 Paperzz