> Quick question : what type of Sort(s) does Ingres use to order its data for > output or joining etc. ? There are two data sorters in the Ingres server. Firstly, the query execution facility (QEF) has its own sorter, which uses a binary insertion sort. A server startup parameter (qef.sort_size) controls the size of the buffer used by this sorter. The default is 8 pages (16Kbytes). Secondly, if the number of tuples being sorted doesn't fit in the QEF sort buffer, then a separate sorter in the data manipulation facility (DMF) is invoked. This is a n-way merge sort using replacement selection through a heap. A single sort output work file is used to gather all the runs, which are then merged into a second output work file. This process continues, alternating between reading and writing the two sort work files until the sort is complete. The DMF sorter is not really a standard tape merge sort, because the sort work files are not accessed sequentially, and must be on disk or other random access device. In 6.4, it is possible to "stripe" the sort work files across multiple locations using the 'II_WORKDIR_n' environment variables. If these are not defined, then the sort work files are placed in the default location for the database. If you want more details of the sorting algorithms, then a standard text such as Knuth's "Sorting and Searching" will explain much better than I can! -Mike Glendinning, Sequent Computer Systems Ltd. Michael Leo wrote: > > Gajendra Jainwrites: > > > > > Hi! > > > > How does Ingres uses sorting alrgorithm. For Hp -UNIX system does it use > > default sorting provided in the kernel or Ingres has its on sorting mechanism. > > Our plan is to replace the sorting used by kernel with a fast sorting > > software Syncsort. This will help us gain 200-300% on sorting. > > > > Need a response from Ingres Gurus. > > > > Thanks, > > Gajendra > > I am reasonably certain that Ingres uses its own internal sorting > for both in-memory and disk sorts. I believe this to be true because > of sort bugs I've run in to on certain platforms in the past. > > The in-memory sort is a bubble sort. That should amuse the folks > at Syncsort. > > The disk sorts depend upon the query plan and are highly integrated > with the Ingres DMF facility. > > In conclusion, I don't believe Syncsort's product will help your > Ingres performance at all. I think it is a good product, though. > > Cheers, > > Michael Leo mleo@cariboulake.com mal@visi.com > Caribou Lake Software http://www.cariboulake.com Java/RDBMS/Ingres Solutions > It is true that Ingres has its own sort techniques and that external sort packages cannot be hooked into it. All sorts done as part of query plan execution start off as QEF in-memory sorts. If the number of rows being sorted causes memory requirements to exceed the "qef_sort_mem" config parameter, QEF defers to a DMF disk sort. The QEF sort used to be an insertion sort (not a bubble sort) which used a binary search to determine the location of each added row. The only virtue of this technique was that duplicates were detected as they entered the sort (since the rows were being ordered as they were read) and could be discarded before they occupied any sort memory (obviously only for duplicates removal sorts). It was decided prior to 1.2 that the rather large performance penalty paid by all sorts (not just duplicates elimination sorts) for this luxury wasn't worth it. So starting with 1.2, the QEF sort uses the same algorithm as the DMF sort (only without the disk support). And that algorithm is a heap sort, which I'm told is one of the more reasonable algorithms kicking around (according to my data structures and algorithms text, it has n log n performance, as does Quicksort). And to confirm earlier responses to the original question, there is indeed no way to substitute a commercial sort package for any of the sorting done by Ingres. Doug. On Sep 15, 13:53, Doug Inkster wrote: > ..... So in 1.2 and 2.0, the > QEF sort is an in-memory heap sort (one of the more reasonable sort > algorithms, or so I'm told), the same as DMF's algorithm (except DMF > will offload to disk, if necessary). Well, that's certainly interesting. I would have thought it would be a shell-sort, or a quicksort. Quicksort is generally considered the fastest, as long as you are careful to avoid its n^2 behaviour on already-sorted input. Heapsort is noted for being an N log N algorithm in the worst case as well as the average case, although with a high constant factor. So it tends to be not ever especially fast and not ever particularly slow. I guess this means that if you've got the memory, large qef_sort_mem values are now more plausible than they were in the old days. -- Karl Schendel Phone: (412) 963-8844 Telesis Computer Corp Fax: (412) 963-1373 wiz@telesismfg.com
© William Yuan 2000
Email William