Problem: Devise a relatively simple and reliable way to provide data (and server) "high-availability." Ideally, we'd like to have dedicated primary and backup server machines, where the backup machine can be made available quickly (e.g. within 10 minutes) in case of hardware and/or software failure on the primary machine. Very low data loss may be acceptable (e.g. a 15 minute window). Note that true fault-tolerance isn't necessary: a deferred copy arrangement or some such would be fine. Indeed, a deferred copy solution may help in preventing error propagation. (On the other hand, I'd appreciate hearing about fault-tolerant solutions.)

Background: The two server machines are DEC 5000/240s (Ultrix). Ingres 6.3 (or 6.4). 3 databases, 350+ MB total, 500+ tables.

I have worked with technical support on this for a while now. Although they have been helpful, their bottom line is essentially 'the suggested approaches will strain and use Ingres in ways for which it was not designed. Try talking with Ingres consulting.' Thus, it's time to turn to the net.

Below I've listed some approaches in broad terms. If anyone has either thoughtful suggestions or known solutions (!) I would appreciate hearing about them. Thanks in advance.

Approach 1: Use (Ultrix) disk-shadowing.
Problems: As far as I have been able to ascertain, only intra-machine shadowing is available. Anyone know otherwise? I assume there are custom solutions available. Do I want to hear about these ($$$)?

Approach 2: Use Ingres/STAR to automagically provide replication.
Problems: Although I recall reading "direction" papers some time ago that hinted at replication, relation fragmentation, etc., my understanding is that none of these will be available in the near future.

Approach 3: Variations on selective copy schemes. E.g. use copydb, copy the underlying files, etc. If one can partition a database into static and non-static tables, the overhead of a full checkpoint copy can be avoided.
Problems: Inter- as well as intra-table consistency issues. Down time. Excessive data transfer time.

Approach 4: Variations on incremental copying schemes. E.g. use the journaling system to update the shadow node, create routines to process the log file, use "triggers" to update the shadow node, etc.
Problems: Some of these solutions may get complicated. Would require tight administration of databases. However, this category looks the most promising at the moment.

Approach 5: Application and/or library level shadowing. Rewrite all applications to post transactions to both nodes. Write a cover library over libingres.a to provide replication (via multiple posting).
Problems: Not reliable. Potentially not simple.

Email or post as seems appropriate; if interest...summary...etc.

Paul Turner, turner@kadsma.kodak.com