Optimizer statistics & checkpoint anecdote

mdkale@aol.com (MDKale) said: 
 
> but beware, optimizedb can produce bad statistics (rep factors) when sampling.
> I have had to write several scripts on my 5GB database to "fudge" and replace rep
> factors after I collect statistics on a 5 million row table. 
 
Or just get complete statistics . I've found that with the exception of
control-type systems, the processing load varies over time, and there is
usually a window, even on a 5 GB database, where with some work you can get
full statistics if necessary. Some time during the day or night, there's
I/O left over, and you can get a brief shot at the system catalogs. 
 
The database optimize command has the benefit of being runnable from the
shell, making it schedulable via cron or similar, if you know your
processing schedule. It can also be configured to take individual tables,
columns, etc. So you can get statistics in chunks. 
 
There are several tools out there that help automate and organize this kind
of activity. Common Sense Computing's (I/AutoDBA), and AGEIS Management
Systems' (Name escapes me just now.) come immediately to mind. There might
well be others. 
 
> 
> Secondly, I believe the optimizer's assumption on an exact match is 1% of
> the rows, not 10%. 
 
You're right. Been a long time since I've RTFM on that one. Sorry. 
 
>To this point, online checkpoints should be well tested.   
 
Yes, Right, Hear, Hear, Of Course, Bravo!, etc. etc. 
 
 
 
 
One site I dealt with actually lost a database (kind of, we got most of it
back, but it wasn't pretty) because they had been religiously doing
checkpoints, and never tried to recover one. They had never RTFM, never
read any of the fine papers on the subject, and certainly hadn't bothered
to check what they were doing. The checkpoints were incomplete. 
 
Eventually something broke, and they were in a world of hurt, all because
of these EVIL TECHNOLOGY VENDORS. (Actually they lost a disk on a RAID 3
sub-system, which lost the file system due to some "bug" in the RAID. Thus
the need to recover. Had they yanked a disk out to see what happens? No.
Paid the bill before they knew what they had, and ended up with . . . a pig
in a poke. Then they decided RAIDs were a bad idea in general. Sigh.) 
 
So yes, just like any other mission critical, big risk item, recovery from
checkpoints needs to be actually tested, on the actual installation,
configured as it will actually be run. And re-tested when _anything_
changes. Ingres version, OS version, NFS mounts, Use of YellowPages, etc.
etc. 
 
Even when the technology works correctly (as Ingres did, and the RAID did
not in the example) IF IT AIN'T CONFIGURED RIGHT NOTHING GOOD WILL HAPPEN. 
 
Checkpoints are too important to be left to chance and assumptions. 
 
 
 
-- 
James Bullock - Rare Bird Enterprises 
620 Park Avenue # 171 
Rochester, NY. 
716-242-4824 
716-244-9072 (Fax)
Ingres Q & A
To William's Home Page
Email William