Slide 1 Joining Relations in SQL

Objectives of the Lecture :
• To consider the Natural & Generalised Joins using the SQL1 standard;
• To consider the Natural & Generalised Joins using the SQL2 standard.

Slide 2 Joins in SQL

The way joins are expressed in SQL has been shaped by two SQL standards in particular :
• SQL1 : Standard SQL introduced in 1989 - this has no specific support for joins;
• SQL2 : Standard SQL introduced in 1992 - this has special support for joins.
The SQL3 Standard SQL introduced in 1999 maintains this support.

It is important to know how to use both variants of SQL. As the SQL1 standard is a subset of SQL2, it is always possible to write an SQL1 style join in SQL2. Oracle SQL 9i supports SQL2 standard joins, but earlier versions of Oracle only meet the SQL1 standard. SQL1 joins will be studied first. We will then see how SQL2 joins are built on top of this.

Students taking this module will generally be using an Oracle DBMS, although any other SQL DBMS meeting SQL standards is acceptable. Depending on the DBMS version used, SQL2 standard joins may or may not be available. Students should consider how they will gain familiarity with SQL2 joins.

Slide 3 SQL1 : Generalised Join Syntax

The Generalised Join of relations R and S has the syntax :

    Select *
    From R, S
    Where theta-join condition ;

Principles :
• Put Select * to get all the attributes in the result.
• Put both relation names in the From phrase.
• Put the complete theta join condition in the Where phrase.
The SQL statement also retrieves the result from the DB.

SQL fulfills all the requirements of the Generalised Join operation. The result has all the columns from both tables. SQL is forgiving in that if the two tables have duplicate column names, both appear in the result and can be distinguished from each other because the column names are prefixed with their table names. SQL can afford to do this because it knows that all the columns in the Join result are to be retrieved, without further action.
Relational algebra joins have to be strict about their result relations. Because relational algebra is fully general, the result of a join could be the operand of another relational operator, and so the result has to be properly formed so that that operator is not impeded from functioning correctly.

When entering column names in the Where phrase, in principle (and for self documentation), each column name should be prefixed by its table name, the two names being separated by a full stop. However for any column name that is unique across both tables (which in this case should be all the columns), SQL can deduce for itself which table the column comes from; hence the user can omit the table name and full stop and just enter the column name, leaving SQL to put the table name in for itself. Thus the syntax of the Where phrase is the same as that of the RAQUEL Generalised Join parameter.

Slide 4 SQL1 : Natural Join Syntax

The Natural Join of relations R and S has the syntax :

    Select the result's column names
    From R, S
    Where equi-join condition ;

The result's column names : omit duplicate columns; prefixed by TABLE_NAME.
The equi-join condition : column names always prefixed with TABLE_NAME.

Principles :
• Put Select COLUMN_NAMES to get all the columns in the result.
• Put both relation names in the From phrase.
• Put the complete equi join condition in the Where phrase.
The SQL statement also retrieves the result from the DB.

SQL fulfills all the requirements of the Natural Join operation. The user has to manually enter all the result columns from both tables into the Select phrase, omitting duplicate columns. The same rule about prefixing column names with table names applies to all columns that appear in the Select phrase. Although the '=' comparison means that it does not matter which duplicate column appears in the result (i.e. is specified in the Select phrase) and which is omitted, SQL requires that the user make an arbitrary choice as to which column it will be and enter that name; this means that the user must arbitrarily choose a table name and prefix the common column name with that table name.

Since by definition all the comparisons in the Where phrase must be '=' comparisons with the same column name on both sides, the user will have to prefix that name with the name of one table on the LHS of the '=' and the name of the other table on the RHS. It doesn't matter which way round the table names appear. Thus for every column name that appears in the RAQUEL Natural Join parameter, an '=' comparison for that column name must be entered in the Where phrase. Multiple comparisons must be Anded together.

Slide 5 Examples : SQL1 Generalised Joins

SQL1 equivalents of previous examples :

    Select *
    From R, S
    Where B < C ;

    Select *
    From R, S
    Where A > E And B <> D ;

As the operands have no column names in common, it is safe to use “*” in the Select phrase and omit table name prefixes in the Where phrase.

The examples appeared in the previous lecture, slides 5 and 8.

Slide 6 Example : SQL1 Natural Join

SQL equivalent of previous example :

    Select PNo, Qty, SHIP.SNo, SName
    From SHIP, SUPP
    Where SHIP.SNo = SUPP.SNo ;

It doesn't matter from which table the “SNo” column comes. Or :

    Select PNo, Qty, SUPP.SNo, SName
    From SHIP, SUPP
    Where SHIP.SNo = SUPP.SNo ;

The order in which the tables appear in the From phrase, and which “SNo” column appears on which side of the “=”, don't matter.

The example appeared in the previous lecture, slide 13.
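The SQL1 forms on slides 5 and 6 can be tried out with a small script. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS (the lecture itself uses Oracle); the column types and all the rows in R, S, SHIP and SUPP are invented for illustration and do not come from the lecture.

```python
import sqlite3

# In-memory SQLite database; table contents are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    Create Table R (A Integer, B Integer);
    Create Table S (C Integer, D Integer, E Integer);
    Insert Into R Values (1, 2), (5, 7);
    Insert Into S Values (4, 2, 0), (9, 8, 3);

    Create Table SHIP (SNo Text, PNo Text, Qty Integer);
    Create Table SUPP (SNo Text, SName Text);
    Insert Into SHIP Values ('S1', 'P1', 10), ('S2', 'P1', 5), ('S2', 'P2', 10);
    Insert Into SUPP Values ('S1', 'Smith'), ('S2', 'Jones');
""")

# SQL1 Generalised Join: both tables in From, theta condition in Where.
gen_join = conn.execute(
    "Select * From R, S Where B < C Order By A, C"
).fetchall()

# SQL1 Natural Join: list each result column once, prefix the common
# column with a table name, and put the equi-join condition in Where.
nat_join_ship = conn.execute(
    "Select PNo, Qty, SHIP.SNo, SName From SHIP, SUPP "
    "Where SHIP.SNo = SUPP.SNo Order By SHIP.SNo, PNo"
).fetchall()

# Same join, taking SNo from SUPP instead and swapping both the table
# order in From and the sides of the '=' comparison.
nat_join_supp = conn.execute(
    "Select PNo, Qty, SUPP.SNo, SName From SUPP, SHIP "
    "Where SUPP.SNo = SHIP.SNo Order By SUPP.SNo, PNo"
).fetchall()

print(gen_join)
print(nat_join_ship == nat_join_supp)
```

With this sample data the two Natural Join variants return identical rows, which illustrates why the choice of table prefix for “SNo” is arbitrary.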
Slide 7 Combining Algebra Operators

Typically we want to join together 2 relations holding relevant data, and then prune the result down with a projection and restriction to yield just the required data :

    R Join[ Att ] S Restrict[ condition ] Project[ AttNames ]

In SQL, put the Projected attributes in the Select phrase, the Joined relations in the From phrase, and And the Join and Restrict conditions together in the Where phrase, as follows :

    Select Distinct AttNames
    From R, S
    Where ( R.Att = S.Att ) And ( condition ) ;

Here ( R.Att = S.Att ) is the Join condition and ( condition ) is the Restrict condition.

SQL's built-in sequence of operations will execute a Cartesian Product of R and S, then a Restrict on the result using the entire Where condition, and finally a Project on that result using the Select attributes.

Because the attributes to be projected out are taken from a relation created by a Join, it may not be obvious whether they include a candidate key or not; therefore include the Distinct keyword to be on the safe side.

The set of attributes projected out by the Project operation must be either a proper subset of those created in the Join, or the same set of attributes. Either way, we can use the set of attributes in the Project operation to determine what to write in the Select phrase, and forget which columns we might have entered had we considered the Join, Cartesian Product and/or Restrict operations on their own.

Note that the Where phrase always consists of a Join condition Anded with a Restrict condition (unless there is no Restriction in the query, in which case the latter must obviously be omitted). If there is more than one attribute in the Natural Join parameter, then there must be an '=' comparison for each one, all of them being Anded together, for the Join condition part of the Where phrase. If the Join operation had been a Generalised Join, then the condition parameter would have been used as the Join condition instead of the derived equi condition.
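The logical sequence described above (Cartesian Product, then Restrict, then Project with Distinct) can be spelled out step by step and compared with what a DBMS returns for the single SQL statement. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as the DBMS, the SHIP and SUPP rows are invented, and the hand-written Python steps illustrate the logical procedure, not how any DBMS actually executes a query.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.executescript("""
    Create Table SHIP (SNo Text, PNo Text, Qty Integer);
    Create Table SUPP (SNo Text, SName Text);
    Insert Into SHIP Values ('S1', 'P1', 10), ('S2', 'P1', 5), ('S1', 'P2', 10);
    Insert Into SUPP Values ('S1', 'Smith'), ('S2', 'Jones');
""")

ship = conn.execute("Select SNo, PNo, Qty From SHIP").fetchall()
supp = conn.execute("Select SNo, SName From SUPP").fetchall()

# Step 1: Cartesian Product, pairing every SHIP row with every SUPP row.
# Resulting row layout: (SHIP.SNo, PNo, Qty, SUPP.SNo, SName).
cartesian = [s + p for s, p in product(ship, supp)]

# Step 2: Restrict using the entire Where condition:
#         ( SHIP.SNo = SUPP.SNo ) And ( Qty = 10 ).
restricted = [row for row in cartesian if row[0] == row[3] and row[2] == 10]

# Step 3: Project onto SName; building a set removes duplicate
#         values, which is the job Distinct does in SQL.
by_hand = sorted({row[4] for row in restricted})

# The single SQL1 statement that the three steps correspond to.
by_sql = [r[0] for r in conn.execute(
    "Select Distinct SName From SHIP, SUPP "
    "Where ( SHIP.SNo = SUPP.SNo ) And ( Qty = 10 ) Order By SName"
)]

print(len(cartesian), len(restricted), by_hand, by_sql)
```

In this sample data two restricted rows carry the same supplier name, so the projection would contain a duplicate without Distinct; that is the situation the advice to include Distinct guards against.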
SQL's implementation is normally significantly more efficient than the logical procedure that it in principle executes, although logically equivalent to it.

Slide 8 Examples of Combining Operators

Example : Get the names of the suppliers who supply parts in quantities of 10.

    SHIP Join[ SNo ] SUPP Restrict[ Qty = 10 ] Project[ SName ]

    Select Distinct SName
    From SHIP, SUPP
    Where SHIP.SNo = SUPP.SNo And Qty = 10 ;

Example : Get the names of employees who own a Corsa 1.3.

    CAR Gen[ Owner = ENo ] EMPLOYEE Restrict[ Type = 'Corsa 1.3' ] Project[ EName ]

    Select Distinct EName
    From CAR, EMPLOYEE
    Where Owner = ENo And Type = 'Corsa 1.3' ;

The examples are derived from examples in the previous lecture, slides 16 and 11 respectively. Note that column names have only been prefixed with table names where this is a logical necessity, i.e. in the Natural Join condition of the first example above.

Slide 9 Designing SQL Queries

• Decide which DB relations contain data that will be required in the answer to the query, and join all those relations together with the appropriate Natural/Generalised Join operation(s).
• Remove any unrequired tuples with Restrict operation(s). In principle only one Restrict operation is required, but it may be more convenient to use several.
• Remove any unrequired attributes with a Project operation; only one Project operation will be necessary.
• Complete the appropriate SQL phrases with the relevant information from the algebra operations :

    Select ……                          (Project attributes)
    From ……                            (Tables to be joined)
    Where ( ……… ) And ( ……… ) ;        (Join condition, then Restrict condition)

This sequence of operations is the simplest to create and simplest to convert into SQL, and is generally applicable. Therefore, although it is possible to design queries using other sequences of algebra operators, the above is recommended. If a Join/Restrict/Project operation is not required by the design, just omit it from the SQL.
Note that :
• omitting a join means omitting a Join condition from the Where phrase, and a table from the From phrase;
• omitting a restriction means omitting a Restrict condition from the Where phrase;
• if there are no conditions in the Where phrase, then omit the Where phrase altogether;
• omitting a projection means putting '*' in the Select phrase;
• if there is more than one Join operation, then all their Join conditions must be Anded together to form the total Join condition;
• if there is more than one Restrict operation, then all their Restrict conditions must be Anded together to form the total Restrict condition.

The optimiser of the DBMS can generally be relied on to execute the query efficiently. There is no point in learning advanced techniques for designing efficient queries before the design of the correct logical queries has been mastered. Efficiently executing the wrong query is a waste of time !

Slide 10 SQL : Cartesian Product

SQL1 executes a Cartesian Product operation given the following syntax :

    Select *
    From R, S ;

Hence the absence of a join condition in the Where phrase causes SQL to execute a Cartesian Product :
• If a Cartesian Product is actually needed in a query instead of a Natural or Generalised Join, then just omit the Join condition from the Where phrase.
• If a Join condition is accidentally omitted from the Where phrase, then the result will be unexpectedly (very) large due to a Cartesian Product operation !

SQL2 actually has a Cartesian Product operator, with syntax :

    Select *
    From R Cross Join S ;

Thus it is important to form the Join conditions in the Where phrase correctly, taking care that the conditions are not omitted and are properly formed. It is not uncommon to have errors in the Join conditions, with unexpectedly and disconcertingly large consequences !

SQL2 contains a Cartesian Product operator for completeness, because it also has proper Join operators that can be used in the From phrase.
Therefore the Cartesian Product operator could be required for a query to be written completely in the SQL2 style.

We will now consider SQL2 Joins.

Slide 11 SQL2 : Generalised Join Syntax

The Generalised Join of relations R and S has the syntax :

    Select *
    From R Join S On ( theta-join condition ) ;

Principles :
• Put Select * to get all the attributes in the result.
• Put R Join S On ( theta-join condition ) in the From phrase, where R and S are the operands and ( theta-join condition ) is the complete generalised join condition.
• No Where phrase is required.
The SQL statement also retrieves the result from the DB. SQL fulfills all the Generalised Join requirements.

SQL2 uses an algebra-like style to express the joins. Thus everything to do with a Join operation is written succinctly in one place, i.e. in the From phrase. It is much simpler to use and so should be the preferred way of writing Generalised Joins wherever SQL2 syntax is available.

SQL's built-in sequence of operations is modified as a consequence of the Join expression in the From phrase, so that logically it executes the Generalised Join operation instead of a Cartesian Product operation. It then proceeds as before with the remainder of the SQL statement.

Slide 12 Examples : SQL2 Generalised Joins

SQL2 equivalents of previous examples :

    Select *
    From R Join S On ( B < C ) ;

    Select *
    From R Join S On ( A > E And B <> D ) ;

As the operands have no column names in common, it is safe to use “*” in the Select phrase and omit table name prefixes in the On condition.

The examples appeared in the previous lecture, slides 5 and 8 respectively.

Slide 13 SQL2 : Natural Join Syntax

There are 2 ways of writing a Natural Join of operands R and S in SQL2 :

    Select *
    From R Natural Join S ;

    Select *
    From R Join S Using ( AttributeName(s) ) ;

AttributeName(s) : the attributes on which the '=' comparison(s) is/are made.
Principles : These are the same as for Generalised Join, except that a different required expression is put in the From phrase. The SQL statement also retrieves the result from the DB. Both variants fulfill all the Natural Join requirements.

Like the Generalised Join, SQL2 uses an algebra-like style for both versions of the Natural Join, with everything written succinctly in the From phrase; so they should be the preferred way of writing Natural Joins wherever SQL2 syntax is available.

Both versions perform the same Natural Join operation; they differ only in how the join columns are determined. The first version above automatically looks for all column names that are common between the two tables and uses them all for the join; if columns with the same name cannot have their values compared due to type differences, then an error will ensue. The second version above uses the column names specified in the Using parameter (only) for the joining.

Again SQL's built-in sequence of operations is modified so that logically it executes the Natural Join operation instead of a Cartesian Product, and then proceeds as before with the remainder of the SQL statement.

Slide 14 Examples : SQL2 Natural Joins

SQL2 equivalents of a previous example :

    Select *
    From SHIP Natural Join SUPP ;

    Select *
    From SHIP Join SUPP Using ( SNo ) ;

The example appeared in the previous lecture, slide 16.

Pros and Cons of the two syntaxes :

R Natural Join S :
• Advantage : less to write.
• Disadvantage : easier to make a mistake if the required comparable columns don't exist.
• So use for interactive ad hoc queries where it is easy to recover from a mistake.

R Join S Using ( AttributeName(s) ) :
• Advantage : makes explicit what the natural join is.
• Disadvantage : more to write.
• Use for self-documenting queries that may be repeatedly executed without prior checking.

Slide 15 SQL2 : Join Problem (1)

    Select * From CAR Natural Join EMPLOYEE ;
    Select * From CAR Join EMPLOYEE Using ( Owner, ENo ) ;

Neither will work !
Columns “Owner” and “ENo” don't appear in both tables. The Using version is therefore rejected, since each column it names must exist in both tables. The Natural Join version finds no suitable common columns to compare; if the two tables share no column names at all, standard SQL treats it as a Cartesian Product rather than performing the required join.

So use an SQL Generalised Join to express the required join, and remove the duplicate data in the Select phrase :

    Select RegNo, Type, Owner, EName, "M-S", Sal
    From CAR Join EMPLOYEE On ( Owner = ENo ) ;

Could have omitted “Owner” instead of “ENo” in the Select phrase. Because the column name M-S contains a hyphen, in practice it has to be written as a delimited identifier ("M-S"); unquoted, it would be read as M minus S.

The example appeared in the previous lecture, slide 11.

Slide 16 SQL2 : Join Problem (2)

Consider the join expressed as :

    R Join S Using ( J1 )

Suppose there are two attributes, named J1 and J2, both of which appear in R and in S, and are type compatible. The join will be carried out just using J1, as specified.
==> the result will have two attributes called J2 in it.

There are 2 considerations concerning the result :
• If a real join requires both J1 and J2, then SQL will have generated the wrong result (unless by chance the data in the tables avoids this).
• If the two J2 columns merely happen to share a name (unhelpful column naming), so that joining on J1 alone generated the correct result, the two columns can be distinguished with their table name prefix in the Select phrase.

Slide 17 Combining Algebra Operators

Follow the same procedure as before, but using SQL2 syntax.

Example :

    SHIP Join[ SNo ] SUPP Restrict[ Qty = 10 ] Project[ SName ]

becomes

    Select Distinct SName
    From SHIP Natural Join SUPP
    Where Qty = 10 ;

Or, in the From phrase : SHIP Join SUPP Using ( SNo )

Example :

    CAR Gen[ Owner = ENo ] EMPLOYEE Restrict[ Type = 'Corsa 1.3' ] Project[ EName ]

becomes

    Select Distinct EName
    From CAR Join EMPLOYEE On ( Owner = ENo )
    Where Type = 'Corsa 1.3' ;

The examples are derived from examples in the previous lecture, slides 16 and 11 respectively.

Thus the revised design procedure is :
• Put all joins in the From phrase. A Join expression can be put in parentheses to become the operand of another Join expression. By this means, repeated if necessary, more than 2 relations can be joined together.
• The Where phrase now only contains any Restrict conditions, Anded together as before if there is more than one.
• The Select phrase is used as before for Projections.

The advantage of the SQL2 syntax over the SQL1 syntax is that SQL2 keeps Join conditions in the From phrase, quite separate from the Restrict conditions in the Where phrase. Thus with a complex query involving both joins and restrictions, it is easier to get it right using SQL2 syntax. Even if SQL2 syntax is not available on your DBMS, you might consider, for a complex query, writing it out in SQL2 first and then translating it into SQL1, in order to help get it correct.
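To check that the SQL2 forms return the same rows as their SQL1 counterparts, the two styles can be run side by side. The sketch below assumes Python's built-in sqlite3 module as the DBMS (SQLite accepts Natural Join, Join ... Using, Join ... On and Cross Join) rather than Oracle. All rows are invented, the helper function rows() is hypothetical, and the EMPLOYEE table is cut down to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    Create Table SHIP (SNo Text, PNo Text, Qty Integer);
    Create Table SUPP (SNo Text, SName Text);
    Insert Into SHIP Values ('S1', 'P1', 10), ('S2', 'P1', 5), ('S1', 'P2', 10);
    Insert Into SUPP Values ('S1', 'Smith'), ('S2', 'Jones');

    Create Table CAR (RegNo Text, Type Text, Owner Text);
    Create Table EMPLOYEE (ENo Text, EName Text);
    Insert Into CAR Values ('R1', 'Corsa 1.3', 'E1'), ('R2', 'Astra', 'E2');
    Insert Into EMPLOYEE Values ('E1', 'Ann'), ('E2', 'Bob');
""")

def rows(sql):
    """Run a query and return its rows sorted, so results can be compared."""
    return sorted(conn.execute(sql).fetchall())

# Natural Join plus Restrict and Project, in SQL1 and both SQL2 styles.
nat_sql1 = rows("Select Distinct SName From SHIP, SUPP "
                "Where SHIP.SNo = SUPP.SNo And Qty = 10")
nat_natural = rows("Select Distinct SName From SHIP Natural Join SUPP "
                   "Where Qty = 10")
nat_using = rows("Select Distinct SName From SHIP Join SUPP Using ( SNo ) "
                 "Where Qty = 10")

# Generalised Join: condition in Where (SQL1) versus in On (SQL2).
gen_sql1 = rows("Select Distinct EName From CAR, EMPLOYEE "
                "Where Owner = ENo And Type = 'Corsa 1.3'")
gen_sql2 = rows("Select Distinct EName From CAR Join EMPLOYEE On ( Owner = ENo ) "
                "Where Type = 'Corsa 1.3'")

# Cartesian Product: From list with no Where (SQL1) versus Cross Join (SQL2).
cross_sql1 = rows("Select * From SHIP, SUPP")
cross_sql2 = rows("Select * From SHIP Cross Join SUPP")

# Join Problem (1): CAR and EMPLOYEE share no column names, so in SQLite the
# Natural Join silently returns every pairing instead of the required join.
problem = rows("Select * From CAR Natural Join EMPLOYEE")

print(nat_sql1, gen_sql1, len(cross_sql1), len(problem))
```

Keeping the join in the From phrase leaves only the Restrict condition in Where, which is the separation the SQL2 style is recommended for; the last query shows why a Natural Join should be checked when the tables may not share the intended column names.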