Outer Joins

                         Document #: US-13272,EN
------------------------------------------------------------------------------

Major subject: analysis    Minor subjects: tech_notes

Keywords: quel, sql

Abstract:
Outer Joins - Equivalent to INGRES Release 6 Technical Note 
#32 or INGRES Release 5 note #79.


Expert note:
Outer Joins 
===========

o	What is an Outer Join?
o	Sample Data
o	Method 1 - SQL SELECT with a UNION, Release 6 Ingres
o	Method 2 - Using a Correlated Update on a Temporary Table
o	Method 3 - QUEL REPLACE with a Temporary Table
o	Method 4 - Using Aggregate Functions
o	Summary

What is an Outer Join?
----------------------
When a relational join is performed, data from either table can be
obtained if the joining column(s) contains matching data in both tables.
There are circumstances, however, when one, or both, of the columns to
be joined has values that the other column does not have.

In such situations, it is not unusual to require that a query give a
listing of both matched and non-matched values, along with other data
values from those rows where possible.  This is an "outer join".

As an example, suppose a PERSONNEL table lists all employees.  An
APPRAISAL table lists the names of employees who have been evaluated,
with their ratings.  You want a list of all employees, with ratings for
those evaluated, and a "0" rating for those not yet evaluated.  For this
outer join, we can refer to the PERSONNEL table as the "outer" table,
and the APPRAISAL table as the "inner" table.

This paper describes several methods to accomplish an outer join, using
both the SQL and QUEL query languages.  Note that a SELECT with a UNION can 
be used in SQL in Release 6 of Ingres, but not in Release 5 (except directly 
in the Terminal Monitor) because of translation of the SQL query into QUEL.  
Note also that starting in Release 6.3 a correlated update (updating one table 
from another) could be used in SQL.

Also note that there are performance issues involved in these different
query methods.  If performance is a concern, then note that as a general
rule, you can follow these guidelines:

o	Using the SQL SELECT with a UNION (Release 6) is faster for 
	shorter running queries, with smaller amounts of data

o	Using the SQL correlated update with a temporary table is faster
	on more expensive queries

If you're not sure which approach to take, do some benchmarking/testing,
and compare QEPs for the different approaches.


Sample Data
-----------
The data for the following discussion are the two tables:

PERSONNEL                     APPRAISAL
------------                  -------------------
| name     |                  |name      |rating|
------------                  -------------------
| Adams    |                  |Adams     |7.5   |
| Brown    |                  |Dixon     |9.2   |
| Campbell |                  -------------------
| Dixon    |
| Estes    |
------------


Method 1 - SQL SELECT with a UNION, Release 6 INGRES
----------------------------------------------------
SELECT a.name, a.rating
FROM appraisal a, personnel p
WHERE a.name = p.name

UNION

SELECT p.name, 0.0
FROM personnel p
WHERE NOT EXISTS
         (SELECT *
          FROM appraisal a
          WHERE a.name = p.name)


Method 2 - Using a Correlated Update on a Temporary Table
---------------------------------------------------------
CREATE TABLE otemp AS SELECT
        p.name, rating = 0.0 FROM personnel p;
UPDATE otemp o FROM appraisal a
        SET rating = a.rating
        WHERE o.name = a.name;


Method 3 - QUEL REPLACE with a Temporary Table
----------------------------------------------
RANGE OF p IS personnel
RETRIEVE INTO temp (p.name, rating = 0.0)
RANGE OF t IS temp
RANGE OF a IS appraisal
REPLACE t (rating = a.rating) WHERE
   a.name = t.name


Method 4 - Using Aggregate Functions
------------------------------------
The QUEL language offers an alternative method for performing an outer
join.  This method relies on syntax rather than temporary tables.

With the same data that we used in the previous method, we can do a 
simple outer join that counts the number of matching rows in the inner
table, and includes a count of 0 for values in the outer table that have
no match in the inner table.  Here is the retrieval and the resulting
table:

RANGE OF p IS personnel
RANGE OF a IS appraisal
RETRIEVE (p.name,n=COUNT(a.name BY p.name WHERE a.name=p.name))

---------------------
|name       |n      |
|-----------|-------|
|Adams      |1      |
|Brown      |0      |
|Campbell   |0      |
|Dixon      |1      |
|Estes      |0      |
---------------------

If we want to obtain the actual rows, and not just count them, then we
have to use the aggregate function ANY.  If a retrieved row passes the
qualification, then ANY returns a value of "1". A "0" is returned if
the qualification is not passed. If we put a join in the ANY function,
we are, in effect, testing whether a value is shared between two tables.
By requiring that the function return a "0", we are asking for those
values not shared by the two tables.  If we combine that qualification
with the usual join syntax, we obtain the rows that do have a shared
value, as well as those that do not - an outer join.

There is one extra step required to use this method.  A dummy row must be
added to the inner table to provide values for the data when no joining
values are found.  These can be codes or standard values that you have
decided represent "missing data".

Here is the QUEL syntax for performing the outer join:

RANGE OF p IS personnel
RANGE OF a IS appraisal
APPEND TO appraisal (name=" ",rating=0.0)
RETRIEVE (p.name,a.rating) WHERE (p.name=a.name) OR
 ((ANY(p.name BY p.name WHERE p.name=a.name) = 0) AND a.name= " ")

The result of this RETRIEVE is the following table:

+----------------------+
|name         |rating  |
|-------------|--------|
|Adams        |7.5     |
|Brown        |0.0     |
|Campbell     |0.0     |
|Dixon        |9.2     |
|Estes        |0.0     |
+----------------------+

First, the RETRIEVE does a standard join by the "WHERE (p.name=a.name)"
syntax.  Then, rows in PERSONNEL that do not have a match in APPRAISAL
are included because of the "ANY...= 0" syntax.  When they are included,
they are joined to the dummy row in APPRAISAL by the "AND a.name..."
syntax so the dummy values can be included for the data fields.


Summary
-------
An outer join is a join of two tables in a database in which the values
of the joining field(s) do not always occur in both tables.  Outer joins
can be performed in INGRES using QUEL or SQL code ter, the best and safest way to add or change your compiler
is to run the "iibuild".



Releases affected:  6.0/00(all.vms) -   Releases not affected: 
Errors:                                                             
Bugs/SIRS:                                                             
------------------------------------------------------------------------------
Ingres Database Reference
To William's Home Page

© William Yuan 2000

Email William