QEF sort size

>From: rhh@tachy.uah.ualberta.ca (Roy Hann)
>Subject: Re: qef.sort_size
>Date: 1 Aug 1995 16:06:49 GMT

>stevec@water.ca.gov (Steve Croft) writes:
>: 
>: I have a call into CA on this one but I thought maybe someone out there
>: would have a similar problem.
>: 
>: I was given a query to benchmark on several systems.  The query would
>: take about 5 minutes to execute except on one particular system.  In comparing
>: the configuration of the different systems I finally found that by removing
>: a qef.sort_size entry I could cut the execution time in half!  The systems
>: have plenty of RAM and no paging was observed during the test.  Is there a
>: problem with the qef.sort_size option or the way I have used it?  I entered
>: the qef.sort_size figure in bytes (the manuals are a little vague).

>I have heard of this, and I have heard of an explanation that I find a
>little hard to swallow, but I'll repeat it here for the amusement of
>everyone:  memory resident sorts are done using a fairly inefficient
>algorithm--not a bubble sort I don't think, but something about as
>bad.  On the other hand, disc sorts are done using some kind of Shell
>sort or quicksort.  By making the sort buffer bigger, you invite Ingres
>to use a horrible exponential algorithm, while if you make it smaller,
>you force Ingres to use disc and a more efficient logarithmic
>algorithm.

>It's bizarre, but that's the word on the street.  The good news is that
>in OI 1.1 memory resident sorts supposedly are done using a quicksort 
>too.

>Regardless of whether this is really the cause, the effect is real.

>--Roy Hann

Yep,

Roy is pretty much on the money (again), setting the qef sort size to a larger 
value WILL degrade performance.  you can try setting between 64k and 1Mb and 
monitor the performance to find an optimal setting.  My experience (which 
isn't exactly exhaustive) indicates that there is no 'optimal' setting across 
all platforms-queries.

Cheers,

Paul
--------------------------------------------------------------------------------
Paul N. Neal              |
PNeal Information Systems |
Office (905) 785-0404     |        This space intentionally left blank !
Fax    (905) 785-0415     |
email:  pneal@inforamp.net|
--------------------------------------------------------------------------------
Roy Hann writes:

>I have heard of this, and I have heard of an explanation that I find a
>little hard to swallow, but I'll repeat it here for the amusement of
>everyone:  memory resident sorts are done using a fairly inefficient
>algorithm--not a bubble sort I don't think, but something about as
>bad.  On the other hand, disc sorts are done using some kind of Shell
>sort or quicksort.  By making the sort buffer bigger, you invite Ingres
>to use a horrible exponential algorithm, while if you make it smaller,
>you force Ingres to use disc and a more efficient logarithmic
>algorithm.

Tony Smith's posting gives a fairly good read on the problem (thanks,
Tony).  Also, CA Tech Support forwarded a couple of tech notes on the
problem, US-122212 and US-125218.  The documents were not posted to Advisor
at the time pending review (they may be there now).  Since CA pastes a huge
copyright notice at the start of their tech notes, I won't repost it here,
but I can summarize...

The QEF uses an insertion sort and an array of pointers.  Inserting into
the array becomes less efficient as the amount of data grows.  This is because
the data in the array below the insertion point must be moved downward to make
room for the new entry; and this is done for each tuple!  You can see that as
the data sets get larger this algorithm becomes less efficient.

If you are so inclined I suppose you can fiddle with qef.sort_size and find
that magic number where larger data sets would be better off with the disk
sort.

Steve.Croft@water.ca.gov
DBMS Admin
Ingres Q & A
To William's Home Page
Email William