Document #: US-13272,EN ------------------------------------------------------------------------------ Major subject: analysis Minor subjects: tech_notes Keywords: quel, sql Abstract: Outer Joins - Equivalent to INGRES Release 6 Technical Note #32 or INGRES Release 5 note #79. Expert note: Outer Joins =========== o What is an Outer Join? o Sample Data o Method 1 - SQL SELECT with a UNION, Release 6 Ingres o Method 2 - Using a Correlated Update on a Temporary Table o Method 3 - QUEL REPLACE with a Temporary Table o Method 4 - Using Aggregate Functions o Summary What is an Outer Join? ---------------------- When a relational join is performed, data from either table can be obtained if the joining column(s) contains matching data in both tables. There are circumstances, however, when one, or both, of the columns to be joined has values that the other column does not have. In such situations, it is not unusual to require that a query give a listing of both matched and non-matched values, along with other data values from those rows where possible. This is an "outer join". As an example, suppose a PERSONNEL table lists all employees. An APPRAISAL table lists the names of employees who have been evaluated, with their ratings. You want a list of all employees, with ratings for those evaluated, and a "0" rating for those not yet evaluated. For this outer join, we can refer to the PERSONNEL table as the "outer" table, and the APPRAISAL table as the "inner" table. This paper describes several methods to accomplish an outer join, using both the SQL and QUEL query languages. Note that a SELECT with a UNION can be used in SQL in Release 6 of Ingres, but not in Release 5 (except directly in the Terminal Monitor) because of translation of the SQL query into QUEL. Note also that starting in Release 6.3 a correlated update (updating one table from another) could be used in SQL. Also note that there are performance issues involved in these different query methods. If performance is a concern, then note that as a general rule, you can follow these guidelines: o Using the SQL SELECT with a UNION (Release 6) is faster for shorter running queries, with smaller amounts of data o Using the SQL correlated update with a temporary table is faster on more expensive queries If you're not sure which approach to take, do some benchmarking/testing, and compare QEPs for the different approaches. Sample Data ----------- The data for the following discussion are the two tables: PERSONNEL APPRAISAL ------------ ------------------- | name | |name |rating| ------------ ------------------- | Adams | |Adams |7.5 | | Brown | |Dixon |9.2 | | Campbell | ------------------- | Dixon | | Estes | ------------ Method 1 - SQL SELECT with a UNION, Release 6 INGRES ---------------------------------------------------- SELECT a.name, a.rating FROM appraisal a, personnel p WHERE a.name = p.name UNION SELECT p.name, 0.0 FROM personnel p WHERE NOT EXISTS (SELECT * FROM appraisal a WHERE a.name = p.name) Method 2 - Using a Correlated Update on a Temporary Table --------------------------------------------------------- CREATE TABLE otemp AS SELECT p.name, rating = 0.0 FROM personnel p; UPDATE otemp o FROM appraisal a SET rating = a.rating WHERE o.name = a.name; Method 3 - QUEL REPLACE with a Temporary Table ---------------------------------------------- RANGE OF p IS personnel RETRIEVE INTO temp (p.name, rating = 0.0) RANGE OF t IS temp RANGE OF a IS appraisal REPLACE t (rating = a.rating) WHERE a.name = t.name Method 4 - Using Aggregate Functions ------------------------------------ The QUEL language offers an alternative method for performing an outer join. This method relies on syntax rather than temporary tables. With the same data that we used in the previous method, we can do a simple outer join that counts the number of matching rows in the inner table, and includes a count of 0 for values in the outer table that have no match in the inner table. Here is the retrieval and the resulting table: RANGE OF p IS personnel RANGE OF a IS appraisal RETRIEVE (p.name,n=COUNT(a.name BY p.name WHERE a.name=p.name)) --------------------- |name |n | |-----------|-------| |Adams |1 | |Brown |0 | |Campbell |0 | |Dixon |1 | |Estes |0 | --------------------- If we want to obtain the actual rows, and not just count them, then we have to use the aggregate function ANY. If a retrieved row passes the qualification, then ANY returns a value of "1". A "0" is returned if the qualification is not passed. If we put a join in the ANY function, we are, in effect, testing whether a value is shared between two tables. By requiring that the function return a "0", we are asking for those values not shared by the two tables. If we combine that qualification with the usual join syntax, we obtain the rows that do have a shared value, as well as those that do not - an outer join. There is one extra step required to use this method. A dummy row must be added to the inner table to provide values for the data when no joining values are found. These can be codes or standard values that you have decided represent "missing data". Here is the QUEL syntax for performing the outer join: RANGE OF p IS personnel RANGE OF a IS appraisal APPEND TO appraisal (name=" ",rating=0.0) RETRIEVE (p.name,a.rating) WHERE (p.name=a.name) OR ((ANY(p.name BY p.name WHERE p.name=a.name) = 0) AND a.name= " ") The result of this RETRIEVE is the following table: +----------------------+ |name |rating | |-------------|--------| |Adams |7.5 | |Brown |0.0 | |Campbell |0.0 | |Dixon |9.2 | |Estes |0.0 | +----------------------+ First, the RETRIEVE does a standard join by the "WHERE (p.name=a.name)" syntax. Then, rows in PERSONNEL that do not have a match in APPRAISAL are included because of the "ANY...= 0" syntax. When they are included, they are joined to the dummy row in APPRAISAL by the "AND a.name..." syntax so the dummy values can be included for the data fields. Summary ------- An outer join is a join of two tables in a database in which the values of the joining field(s) do not always occur in both tables. Outer joins can be performed in INGRES using QUEL or SQL code ter, the best and safest way to add or change your compiler is to run the "iibuild". Releases affected: 6.0/00(all.vms) - Releases not affected: Errors: Bugs/SIRS: ------------------------------------------------------------------------------
© William Yuan 2000
Email William