On Aug 5, 14:19, Beedle,Paul(STP) wrote:
> Subject: RE: MAXPAGES
> [re: why does MAXPAGES exist?]
>
> I have used:
>
>     with MAXPAGES = 1
>
> for tables that contain sequential key values for use in other tables.
> I want to make sure that the table contains only one row and that it will
> never grow, so I pad the rest of the row with a column of 1024 bytes.

I'm not sure what benefit you are getting here.  MAXPAGES does not limit
your table size permanently -- it only limits the number of primary hash
pages allocated when you modify the table.

I would suggest that you not bother with either the padding column or the
HASH structure, and simply use a HEAP with one row and one column.  (Or
you can use ii_sequence_values, which is used by the SEQUENCE_VALUE 4GL
builtin procedure.)

--
Karl Schendel             Phone: (412) 963-8844
Telesis Computer Corp     Fax:   (412) 963-1373
wiz@telesis.com


I wrote:
> I suppose you could use MAXPAGES to allow some space for new rows in the
> main pages, while limiting the table size by forcing existing rows into
> overflow pages.  ... So in other words, fooling around with MAXPAGES
> might be a way to get two different fillfactors, allowing you to bias
> the modified table favorably towards inserts.

Well, my curiosity got the best of me, so I tried some experiments.  I
modified a table to hash unique on a key column (which just happened to be
a TABLE_KEY WITH SYSTEM_MAINTAINED, ugh) and got a 512-page table.  I then
modified it to hash on the same key with MAXPAGES=64, FILLFACTOR=25.
Guess what?  The fillfactor was completely ignored!  It would appear that
Ingres decided to fill up all the primary pages 100%, and then filled the
overflow pages to whatever extent was necessary.  So my clever and useless
idea is in fact incorrect in the existing implementation.  (6.4/05 was
what I tested with.)

My current thinking is that MAXPAGES is a strange synonym for "Ignore that
there fillfactor setting..."  Actually I think Roy hit it right on with:

> ... Or was it just an irrational loathing of the asymmetry of having
> only a MINPAGES parameter without a matching MAXPAGES?

By the way, it looks to me like the fillfactor in a HASH modify is used
only to compute a guess at the required number of primary pages.  It seems
to ignore fillfactor when actually filling the pages.  So if you fool with
the primary page count by specifying MINPAGES or MAXPAGES, you're probably
wasting your time specifying FILLFACTOR too.  This sort of seems like a
misfeature to me, perhaps even (dare I say it) a bug.

--
Karl Schendel             Phone: (412) 963-8844
Telesis Computer Corp     Fax:   (412) 963-1373
wiz@telesis.com


Karl,

From what I've learned, HASH tables reclaim deleted row space, while HEAP
continues to grow, leaving deleted row space.  I use HASH for tables that
are delete-heavy and maintain a relatively stable size.  For sequential
keys, I use update statements rather than delete-insert, and I originally
used HEAP to avoid optimizer overhead.  I switched to HASH because of my
mistaken notion that MAXPAGES would prevent table growth, and because the
overhead associated with such a small table is minuscule compared to the
remainder of my database transactions.  I was unaware that MAXPAGES does
not prevent overflow -- thanks for the info, Karl and Roy.

Paul Beedle
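For reference, the statements being discussed look roughly like this in
Ingres SQL.  The table and column names are invented for illustration, and
this is a sketch of 6.4-era MODIFY syntax rather than anything quoted from
the posts above:

    -- Paul's approach: a one-row "next key" table capped at one primary
    -- page (the 1024-byte pad column is there to fill out the page).
    MODIFY next_key TO HASH ON key_val WITH MAXPAGES = 1;

    -- Karl's suggestion: skip the padding and the hash structure and
    -- just keep a one-row, one-column HEAP.
    MODIFY next_key TO HEAP;

    -- Karl's experiment: the FILLFACTOR in the second statement was
    -- ignored once MAXPAGES came into play.
    MODIFY test_tbl TO HASH UNIQUE ON tkey;
    MODIFY test_tbl TO HASH UNIQUE ON tkey WITH MAXPAGES = 64, FILLFACTOR = 25;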
Roy Hann wrote:
>
> On Mon, 5 Aug 1996, Beedle,Paul(STP) wrote:
>
> > I have used:
> >
> >     with MAXPAGES = 1
> >
> > for tables that contain sequential key values for use in other tables.  I
> > want to make sure that the table contains only one row and that it will
> > never grow, so I pad the rest of the row with a column of 1024 bytes.
>
> An interesting notion, but I don't think it limits growth at all.  All it
> does is limit you to one primary page.  Growth would still be permitted,
> but it would be along an overflow chain.  A 1-page hash table is an
> interesting concept also.
>
> No, sorry, good try. :-)  I'm still not satisfied that this is what the
> designers had in mind when they included the MAXPAGES option.  Is there
> anyone else who can convince me it is not just an aberration due to a
> surfeit of lobster-and-gorgonzola pizza one late night?
>
> ========================================================================
> Roy Hann                                 rhann@tnc.com
> BKB Engineering Ltd.,                    NAIUA Member-at-Large
> 11211-76th Avenue,
> Edmonton, Alberta                        http://www.tnc.com/bkb
> T6G 0K2                                  Tel: (403)497-2409
> Canada                                   FAX: (403)436-6055
> ========================================================================

Hi Roy,

Because the number of main pages in a hash table is always rounded to the
_nearest_ power of two (it works best with the hashing algorithm), hash
tables double in size when they pass a threshold such that a higher power
of two is nearest to the number of pages required.  If you have very large
hash tables, then in order to manage your disk space you may want to
inhibit this doubling of size by using maxpages.

Sample Calculation
------------------
100000 rows at 240 bytes per row
Number of rows per page  = (2010 / (240 + 2)) = 8
Number of pages required = 100000/8 = 12500
50% fillfactor for hash  = 25000 pages allocated.

BUT 25000 is not a power of two.  The powers of two which lie either side
of 25000 are 16384 and 32768.  Of the two, 32768 is nearer to 25000, so
that is the number of main pages that would be allocated.  If the number
of pages required had worked out at 24575 rather than 25000, then we would
have allocated 16384 main pages instead.

So at some point as my table grows, it passes through the 24575 barrier
and doubles in main pages from 16384 to 32768.  That's where maxpages
comes in handy: you may want to prevent this doubling until either you
have more space, or you can eliminate some old data.

Hope this satisfies the requirements for it to be useful :) !

Tony
------------------------------------------------------------------------
Tony Smith                                          tsmith@ingres.com
------------------------------------------------------------------------
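To make Tony's arithmetic concrete, here is a hedged sketch of the modify
he is describing (the table and key names are invented):

    -- Once the table needs more than 24575 main pages, a plain MODIFY
    -- would allocate 32768 of them (the nearest power of two).  Capping
    -- MAXPAGES holds it at 16384 main pages, with further growth going to
    -- overflow chains until there is more disk space (or old data can be
    -- purged) and the cap can be raised.
    MODIFY big_table TO HASH ON acct_no WITH MAXPAGES = 16384;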
Well folks, I got this rather intriguing justification for MAXPAGES from
someone else who's shy and doesn't want his name used:

> ... Consider a situation where for a key choice you have a finite set of
> values (some look-up table, 50 states, etc.).  You also have gobs of
> data.  Not using maxpages for your set of data will produce an excessive
> number of hash buckets, but you really only need 50 distinct ones.  You
> want all the data for each value to be in the overflow chain.  If you
> set maxpages, you will limit the number of "wasted" main pages.

Assuming the above is accurate, and I have no reason to disbelieve it,
what is happening is that MODIFY does not know anything about the hash key
distribution before it chooses the number of primary pages.  I suppose
that's a realistic implementation of MODIFY.  If you have decided to hash
on a sufficiently non-unique key, Ingres will by default think that it
needs lots of primary pages, and then will leave most of them empty
because rows will hash to only a few buckets.  As soon as those primary
pages fill, overflow pages will occur, but the rest of the primary pages
will remain empty and useless.

I suppose that with reasonably wide rows, you could get into the above
situation even without massive key duplication (i.e. you don't necessarily
need a stupid key choice to run into the page wastage).

So it would seem that MAXPAGES solves a real-world problem after all.
Kudos to Mr Anonymous who thought that one up.  (Now we need this
knowledge inserted into the DBA Guide in place of the bogosity that's
there now.)

--
Karl Schendel             Phone: (412) 963-8844
Telesis Computer Corp     Fax:   (412) 963-1373
wiz@telesis.com
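A sketch of the anonymous poster's scenario, with invented names:

    -- Lots of rows, but only about 50 distinct key values (e.g. US state
    -- codes).  Default sizing would allocate main pages based on the row
    -- count, yet the rows can only hash to ~50 buckets, so most of those
    -- main pages would sit empty.  Capping the page count (64 here is
    -- simply the nearest power of two above 50) limits the "wasted" main
    -- pages and lets the bulk of the data live on the overflow chains.
    MODIFY orders TO HASH ON state_code WITH MAXPAGES = 64;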
> Roy Hann wrote:
> > Another explanation is offered by the DBA Guide BTW, but it strikes me
> > as pretty lame too.  According to the DBA Guide we can use MAXPAGES to
> > encourage a compressed hash table to be smaller than it might otherwise
> > be.  It is not clear why that might be the case though.  It doesn't
> > strike me as being real important either...
>
> That's exactly what I'm using it for - you can't have a FILLFACTOR of
> over 100(%) - or at least that's the way it's documented.  My compressed
> rows are typically much smaller than their uncompressed size and I would
> get a very large, very sparse hash table without using MAXPAGES.  There
> may be a better way to do it - for instance if fillfactor could be
> greater than 100 - but this works for me; not lame at all.  Using
> MAXPAGES means I have to occasionally check the size of my table to make
> sure my effective fillfactor doesn't get too large.

When I said it was lame, I was thinking about why the feature was ever
included in the first place.  I was trying to think what obvious and
important problem the designers were hoping to solve when they included
the parameter.  I am quite prepared to believe that it solves the problem
you (and the manual) describe; I am just not quick to believe that it is
THE problem the designers had in mind.  The compressed hash table problem
just seems too obscure and esoteric somehow.  The explanation that Karl
Schendel passed on, that MAXPAGES really means "MAXBUCKETS", does seem to
me (for no good reason) more likely to be what they were thinking of.

But whatever, I now concede that MAXPAGES is not a mere decoration, and
that it can do a number of useful things.  Thank you to everyone for
setting me straight on that.

========================================================================
Roy Hann                                 rhann@tnc.com
BKB Engineering Ltd.,                    NAIUA Member-at-Large
11211-76th Avenue,
Edmonton, Alberta                        http://www.tnc.com/bkb
T6G 0K2                                  Tel: (403)497-2409
Canada                                   FAX: (403)436-6055
========================================================================


> On Oct 10, 15:27, Roy Hann wrote:
>> Subject: Fillfactors
>>
>> I got such a wonderful dividend of speculation (and even a good answer)
>> from the modest investment I made in asking a single question about the
>> purpose of MAXPAGES a few months ago that I am encouraged to ask another
>> simple question.  Are you ready?
>>
>> Is there any good reason why the default fillfactor for a non-compressed
>> B-tree data page is 80% (and not say, 100%)?
>>
>> (S.M.: as you can tell, this is still keeping me awake nights :-)

Hi Roy.  I must have missed your original posting about maxpages, so I
apologise if somebody has already answered the same as this: the only good
reason I can find for maxpages is to prevent the standard sizing algorithm
from creating HUGE tables.  By default the number of pages required for a
hash table is (number of rows / rows per page) rounded up to the nearest
power of 2.  Therefore in some cases a 1Gb table can jump to 2Gb just by
the addition of a few rows followed by a modify.  Maxpages can be used to
control this.

With regard to your second question, I too have thought about this and can
only assume that the default fillfactor is not 100% to improve concurrency
on the data pages... but why 80% - who knows!  Perhaps it was too
difficult to code a different fillfactor for data pages as opposed to the
ADPs!

Cheers,

Nick Wadge
Seer Systems Ltd
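Since a couple of the posts above mention keeping an eye on how full a
MAXPAGES-capped table is getting, here is a hedged sketch of one way to do
that from the standard catalogs.  The table name is invented, and the
iitables column names are given from memory, so check them against your
release:

    -- How many rows, main pages and overflow pages does the table have?
    SELECT num_rows, number_pages, overflow_pages
    FROM   iitables
    WHERE  table_name = 'big_table';

    -- "help table big_table" in the terminal monitor reports similar
    -- figures.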