Document #: US-38697,EN ------------------------------------------------------------------------------ Major subject: analysis Minor subjects: tech_notes Keywords: checklist, dba_guide Abstract: The "Query Execution Plan" section of the 6.4 INGRES Database Administrator's Guide "Optimizing Database Performance" chapter. Equivalent to the Release 6 Technical Note #4 or Release 5 Technical Note #28. Expert note: The Query Execution Plan (QEP) ============================== o Listing a QEP o Reading a QEP o Sample Tables for QEP Examples o Types of Nodes in a QEP o Multiple Query Plans o More Complex QEPS o Evaluating QEPS: A Summary When the INGRES optimizer evaluates a query it generates a QEP (Query execution plan) showing how the query will be executed. Once the QEP has been generated, INGRES can use it one or more times to execute the corresponding database query. Since there are often many different ways to optimize any given query, choosing the best QEP may have a significant impact on performance. You can print an informal listing of the QEP selected, which can be used to gain insight into how queries are handled by the INGRES optimizer. The word "query" refers to any of the data manipulation language (DML) statements that require access to the database. These include the SQL statements 'select', 'insert', 'update', 'delete', and 'create table as'. Examining QEPs help you understand what is involved in executing complex queries in single-user situations. For multiuser performance issues, refer to Chapter 14, "INGRES Locking." Listing a QEP ------------- In general it is a good idea to test run a query in the terminal monitor. You can obtain a printed copy of the QEP, using the '\script' statement and using the 'set optimizeonly' statement to generate a QEP without actually executing the query. Use one of the following statements to list QEPs: set qep\g (Terminal Monitor statement) exec sql set qep; (Embedded SQL) To view query execution plans without executing the queries, use 'set optimizeonly' in conjunction with 'set qep'. Less frequently you might want a QEP from a query submitted through a program. In this case, define an environment variable that INGRES will translate: UNIX:C shell: $ setenv ING_SET "set qep" UNIX:Bourne shell: $ ING_SET = "set qep" $ export ING_SET VMS: $ define ING_SET "set qep" Setting Optimize-Only - - - - - - - - - - - The following 'set' statement can be used to specify whether query execution will halt after the optimization phase: set [no]optimizeonly To halt execution after the query has been optimized, specify 'set optimizeonly'. To continue query execution after the query is optimized, specify 'set nooptimizeonly'. This is the default operation. This statement is used to cancel the effect of a previously set 'optimizeonly' state. Reading a QEP ------------- A QEP is printed as a binary tree, with each node having one or two children: Parent / \ Child Child The tree is read in postorder sequence (i.,e., child, child, parent recursively). Of the different types of nodes (described in the next section) only Join and Cart-Prod nodes have two children; other nodes have only a left child. Different levels in the tree are also separated by lines containing "\" and "/" symbols: Parent / \ Child Child / Parent / \ Child Child Each level has one or more nodes, with each node being made up of a block of 3 to 5 lines of text (explained below). As noted, the nodes are connected by "/" which is a left child link and "\" which is a right child link. The "/" and "\" links point to a node one level up in the tree. When reading a QEP the first thing to do is to find the "orig" nodes of the tree, which are those nodes with no children. Orig nodes contain details of the tables (and secondary indexes) that are being used in the query. Queries that have been "modified" by INGRES to include views, integrities and/or permits are more involved. For very involved queries, the tree may be very wide and could wrap so that similar levels in the tree appear on different levels in the printout. You may find it easier to read such QEPs if you cut and paste the diagrams into the correct levels. Sample Tables for QEP Examples ------------------------------ In the examples in this section, the following two tables are used: 1. Table arel(col1,col2,col3): Name: arel Owner: supp60 Created: 26-oct-1989 08:50:00 Location: ii_database Type: user table Version: ING6.0 Row width: 413 Number of rows: 156 Storage structure: hash Duplicate Rows: allowed Number of pages: 70 Overflow data pages: 6 Column Information: Column Key Name Type Length Nulls Defaults Seq col1 integer 4 yes no 1 col2 integer 4 yes no col3 varchar 400 yes no Secondary indexes: aindex (col2) structure: isam 2. Table brel(col1,col2): Name: brel Owner: supp60 Created: 26-oct-1989 08:53:00 Location: ii_database Type: user table Version: ING6.0 Row width: 10 Number of rows: 156 Storage structure: isam Duplicate Rows: allowed Number of pages: 3 Overflow data pages: 1 Column Information: Column Key Name Type Length Nulls Defaults Seq col1 integer 4 yes no 1 col2 integer 4 yes no Secondary indexes: none Types of Nodes in a QEP ----------------------- There are three general types of nodes on a QEP: o Orig (or leaf) node -- describes a particular table o Proj-rest node -- describes the qualification, or projection restriction, on one or more subsidiary orig nodes o Join nodes -- describes a join. One of the following strategies is used: o Cartesian product o Full sort merge o Partial sort merge o Key and tid lookup joins o Subquery joins Also, many of these nodes may be shown as a "sorted" node. This type of node is also described below. Orig Node - - - - - The "orig", or leaf, node usually describes a table being accessed from the query. This node is displayed in the following format on the QEP listings: tablename storage structure (colname) Pages n Tups m Orig Node Parameters Parameter Description tablename The table on which the query is being run, usually a table name. It is a secondary index if secondary indexes are selected by the optimizer for execution of the query. storage structure The storage structure in use: Btree(key|NU) Hashed(key|NU) Heap Isam(key|NU) where "key" is the key. If the primary key will not be used, the notation 'NU' (Not Used) appears. colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Orig nodes are the most common type of QEP node. The orig node lists the storage structure of the tables being used by the query and the key, if any (unless the type is 'heap', which does not have a parenthesized entry). Proj-rest Node - - - - - - - - The project-restrict ("Proj-rest") node defines how a subsidiary orig node is to key into the table and what qualification to use on the table. This node is displayed in the following format on the QEP listings: Proj-rest Heap Pages n Tups m Dx Cy or Proj-rest Sort on (colnames) Pages n Tups m Dx Cy Proj-rest Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. Proj-rest nodes are used to remove data irrelevant to the query from a table so that the minimum amount of data is carried around during the execution of the query. Only those columns referenced in the target list and 'where' clause for a table are "projected," and only those rows that satisfy constraints in the 'where' clause are "restricted" for use. Proj-Rest nodes mostly consume disk operations that read the data from the tables. The amount of disk I/O depends on what storage structures were used as well as the number of pages accessed. For Proj-rest and other non-orig nodes, the storage structure generally is 'heap', unless the data is being sorted. All non-orig nodes also show the cumulative count line: Dx Cy Since these values are cumulative up the tree, the top node carries the total resources required to execute the query. The effort involved in executing a specific node is therefore the D and C values for that node minus those of the child node (or both child nodes in the case of a join node). Sort Nodes - - - - - - Sort nodes cause the data to be sorted as it is returned. Any node other than an orig node can appear with a sort indication. A query with a 'sort' clause on it has a sort node as the topmost node in the QEP. This node is displayed in the following format on the QEP listings: Sort(colname) Pages n Tups m Dx Cy Sort Node Parameters Parameter Description colname The name of the column on which sorting is done. For colnames there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. If the sort node is combined with a Proj-Rest node, the sort columns are displayed along with Proj-Rest. Sort nodes make use of a sort buffer and so consume primarily cpu resources, except requiring disk I/O when the data to be sorted cannot be accommodated in the sort buffer. INGRES uses the Quicksort algorithm which is very fast on nearly sorted data. The optimizer tends to sort whenever the data fits in the sort buffer (even if it believes only one row will be sorted). Examples - - - - - Some QEP examples that illustrate these non-join nodes are given below. This is an example of a simple primary key lookup. The query execution plan is shown below for the following SQL query: select col1,col2 from arel where col1 = 1234 order by col2; *********************************************************** QUERY PLAN 3,1, no timeout, of main query Sort on(col2) Pages 1 Tups 1 D1 C0 / Proj-rest Sorted(col1) Pages 1 Tups 1 D1 C0 / arel Hashed(col1) Pages 70 Tups 156 *********************************************************** In the above example the Hashed(col1) implies the hashed primary index was used in a direct lookup (i.e. only 1 disk operation unit). The project restrict selected the rows matching the 'where' constraint and removed superfluous columns. The final sort node reflects the sort clause on the query. The following is an example of a select on a non-keyed field. The query execution plan is shown below for the following SQL query: select col1,col2 from arel where col3 = 'x' order by col1; *********************************************************** QUERY PLAN 3,1, no timeout, of main query Sort on(No Attr) Pages 1 Tups 1 D9 C0 / Proj-rest Heap Pages 1 Tups 1 D9 C0 / arel Hashed(NU) Pages 70 Tups 156 *********************************************************** In this example the Hashed(NU) implies that the table had to be scanned (i.e., visiting all 70 pages). Note misleading number of pages and tuples returned. Without 'optimizedb' statistics the optimizer uses a "best guess" approach (1% for equalities and 10% for non-equalities). Also note that the optimizer takes into account disk readahead or group reads when performing scans of tables -- although 70 pages of data have to be read to scan arel the estimated disk I/O value is only 9 reads (D9) due to this effect. The optimizer assumes a typical readahead of 8 pages when performing scans, so here 70/8 reads generates an estimate of 9 disk operations. Join Nodes - - - - - - There is an inner and an outer tree beneath every join node. These function similarly to an inner and outer program loop. By convention, the left-hand tree is the outer tree and the right-hand tree is the inner tree. There are various types of join nodes, described individually below, but the joining method is the same: for each row from the outer tree, there is a join to all of the rows that can possibly qualify from the inner tree. Then the next row from the outer tree is processed, etc. Cartesian Products - - - - - - - - - - The cartesian product, or "cart-prod," strictly follows the unoptimized join model, with each outer node compared to all rows from the inner node. Cart-Prod / \ proj-rest table / table This does not mean that all rows are actually read, only that all rows that satisfy the conditions of the query are compared. This node is displayed in the following format on the QEP listings: Cart-Prod (colname) heap Pages n Tups m Dx Cy Cart-prod Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. The cart-prod is most frequently observed in disjoint queries (i.e., when use of correlation variables and table names are mixed in the query). However, the following cases may also generate cart-prods without adversely affecting performance: o Queries involving ORs that can usually be decomposed into a sequence of smaller queries o No join specification (a disjoint table or no 'where' clause so that there is no relationship between the two tables) o Small tables or amounts of data o Non_equijoins: where r1.col1 > r2.col2 Cart-prods are sometimes associated with substantial estimates for disk I/O usage and affect performance adversely. Here is an example of a simple disjoint query: select arel.col1 from arel,arel a where a.col1 = 99; *********************************************************** QUERY PLAN 7,1, no timeout, of main query Cart-Prod Heap Pages 1 Tups 243 D9 C4 / \ Proj-rest Proj-rest Sorted(NU) Heap Pages 1 Tups 1 Pages 1 Tups 156 D1 C0 D8 C1 / / arel arel Hashed(col1) Hashed(NU) Pages 70 Tups 156 Pages 70 Tups 156 *********************************************************** Full Sort Merge - - - - - - - - The "full sort merge "is an optimized version that attempts to join the inner and outer trees doing fewer comparisons than the cart-prod requires. This is done by assuming that both trees are in sorted order. This allows the outer value to be compared to a range of inner values rather than going through the entire inner loop. join / \ sort sort / / proj-rest proj-rest / / table table This node is displayed in the following format on the QEP listings: FSM join (colname) Heap Pages n Tups m Dx Cy Full Sort Merge Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. The full sort merge is most common when a "bulk" join takes place with no row restrictions on either table involved in the join. The SQL statement would have the form: select * from r1,r2 where r1.joincol=r2.joincol; Here is an example: select a.col2,b.col2 from arel a, brel b where a.col1 = b.col1; *********************************************************** QUERY PLAN 5,1, no timeout, of main query FSM Join(col1) Heap Pages 1 Tups 156 D9 C40 / \ Proj-rest Proj-rest Sorted(eno) Sorted(eno) Pages 1 Tups 156 Pages 1 Tups 156 D8 C1 D1 C1 / / arel brel Hashed(col1) Isam(col1) Pages 70 Tups 156 Pages 3 Tups 156 *********************************************************** Partial Sort Merge - - - - - - - - - - The "partial sort merge" is a cross between a full sort merge and a cart-prod. The inner tree must be in sorted order. The outer tree may be sorted or partially sorted. Note that the outer tree in PSM scenarios will always be derived from an 'isam' table. Comparisons proceed as for the full sort merge until an outer value is found to be out of order. At that point the inner loop is restarted. join / \ proj_rest sort / / table proj-rest / table This node is displayed in the following format on the QEP listings: PSM join (colname) Heap Pages n Tups m Dx Cy Partial Sort Merge Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. Here is an example: select a.col2,b.col2 from arel a, brel b where a.col1 = b.col2; *********************************************************** QUERY PLAN 6,1, no timeout, of main query PSM Join(col1) Heap Pages 1 Tups 156 D9 C26 / \ Proj-rest Proj-rest Heap Sort on(col1) Pages 1 Tups 156 Pages 1 Tups 156 D1 C1 D8 C1 / / brel arel Isam(col2) Hashed(col1) Pages 3 Tups 156 Pages 70 Tups 156 *********************************************************** Key and Tid Lookup Joins - - - - - - - - - - - - - In "key" and "tid lookup joins" the outer and inner data set is not static. For each outer row, the join selects values and forms a key to search in the inner join. A key lookup join uses keyed access, and a tid lookup join uses the tid value. join / \ sort isam (or hash or btree) / proj-rest / table This node is displayed in the following format on the QEP listings: K|T join (colname) Heap Pages n Tups m Dx Cy Key/Tid Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. This case is seen most frequently where the proj-rest returns few rows, so it is faster to do a keyed lookup on an 'isam' ('hash' or 'btree') table than any sort merge operations. Here is an example: select b.col1,b.col2,a.col2 from arel a, brel b where a.col3 = 'x' and a.col1 = b.col1; *********************************************************** K Join(col1) Heap Pages 1 Tups 1 D3 C0 / \ Proj-rest brel Heap Isam(col1) Pages 1 Tups 1 Pages 7 Tups 128 D1 C0 / arel Hashed(col1) Pages 67 Tups 128 *********************************************************** Here access to the base table is through the secondary index and proj-rest collects tids for sorting. The tids are then used for a direct tid lookup in the base table. Hence the storage structure of the base table is NU -- the one case where this is acceptable. join (tid) / \ sort base table / structure(NU) proj-rest / index table Here is an example: select a.col1,a.col2 from arel a where a.col2 = 99 order by a.col2; *********************************************************** Sort(col2) Pages 1 Tups 2 D4 C1 / T Join(tidp) Heap Pages 1 Tups 2 D4 C0 / \ Proj-rest arel Sort on(tidp) Hashed(NU) Pages 1 Tups 1 Pages 67 Tups 128 D2 C0 / aindex Isam(col2) Pages 2 Tups 128 *********************************************************** Subquery Joins - - - - - - - - The "subquery join" is specific to SQL because SQL allows subselects as part of a query. These nodes join data to the subselect specified in the query. SE Join / \ proj-rest Tn / table where Tn identifies the previous QEP. This node is displayed in the following format on the QEP listings: SE join (colname) Heap Pages n Tups m Dx Cy SE Node Parameters Parameter Description colname The name of the column on which processing is done. For "colnames" there may be a list of column names separated by blanks. Pages n Tups m "n" is the total number of INGRES pages involved at the node (1 page = 2K bytes). "m" is total number of tuples (rows). Dx Cy Cumulative amounts of cost that are anticipated at each stage in the execution of the query: Dx Disk I/O cost. "x" approximates the number of 2K pages to be referenced. Cy CPU usage. "y" units can be used to compare amounts of CPU resources required. Here is an example: select * from arel a where are1.col2 = ( select col2 from brel b where a.col1 = b.col1) and col1 = 5; *********************************************************** QUERY PLAN 3,1, no timeout, of T1 Proj-rest Heap Pages 1 Tups 1 D1 C0 / brel Hashed(col1) Pages 3 Tups 156 QUERY PLAN 4,2, no timeout, of main query SE Join Heap Pages 1 Tups 1 D2 C0 / \ Proj-rest T1 Heap Heap Pages 1 Tups 1 Pages 1 Tups 1 D1 C0 / arel Hashed(col1) Pages 70 Tups 156 *********************************************************** Multiple Query Plans -------------------- The optimizer may generate multiple QEPs if the query includes any of the following objects: o SQL subqueries ('in', 'exists', 'all', 'any', etc.) o SQL 'union' statement o SQL 'group' statement with 'set's o Views that need to be materialized As an example of multiple QEPs, consider the processing of a view on a 'union'. The following statement creates the view: create view view1 as select distinct col1 from arel union select distinct col2 from arel; There are two selects, designated #1 and #2 in their respective QEPs below. Now consider the optimizer action in evaluating the following query: select * from view1; This generates three QEPs: o #1 -- the first select in the union o #2 -- the second select in the union o Main query *********************************************************** QUERY PLAN of union subquery #1 Sort Sort on(No Attr) Pages 1 Tups 96 D1 C10 / Proj-rest Heap Pages 1 Tups 96 D1 C0 / aindex Isam(col2) Pages 2 Tups 96 QUERY PLAN of union subquery #2 Sort Sort on(No Attr) Pages 1 Tups 96 D4 C10 / Proj-rest Heap Pages 1 Tups 96 D4 C0 / arel Hashed(NU) Pages 38 Tups 96 QUERY PLAN of main query Sort Sort on(No Attr) Pages 1 Tups 96 D19 C20 / Proj-rest Heap Pages 1 Tups 96 D12 C11 / T0 Heap Pages 1 Tups 96 *********************************************************** More Complex QEPS ----------------- The previous series of QEPs on the different classes of joins involved only two tables. More complex QEPs involving joins with three or more tables can be read as a sequence of two-table joins that have already been described, with the optimizer deciding what is the optimal join sequence. The key to understanding these complex QEPs is recognizing the join sequences and the types of joins being implemented. Evaluating QEPS: A Summary -------------------------- Here is a list of the main points to check for when evaluating a QEP: o Cart-Prods can be caused by errors due to disjoint queries or queries that involve certain types of "or" operations. Also, joins involving calculations, data type conversions and non-equijoins involve cart-prods. Alternative ways of posing the query is often advised under these circumstances. o The "NU" on the storage structure part in the orig node description is not a good sign if you believe the storage structure should have been used. o Verify that the appropriate secondary indexes are being used. Running 'optimizedb' on the indexed columns allows the optimizer to better differentiate between the selectivity powers of the different indices for a particular situation. o If there is little data in a table (e.g., less than 5 pages) the optimizer may consider a scan of the table rather than use any primary or secondary indexes, because little is to be gained from using them o Check that row estimates are accurate on the QEP nodes. If not, use 'optimizedb' on the columns in question. Releases affected: all(all.all) - Releases not affected: Errors: Bugs/SIRS: ------------------------------------------------------------------------------
© William Yuan 2000
Email William