Checkpoint Concerns

From: emvy@platon.com (Maarten Veerman)
Subject: Attention DBAs: Checkpoint/Journalling has SERIOUS flaws

I found this out the hard way. Needless to say, I'm very choked, but
maybe this will save you some hairs...

Scenario: UNIX system with 6.4/04 Ingres

Disk1: UNIX, page space
Disk2: II_SYSTEM
Disk3: II_SYSTEM/ingres/jnl
       II_SYSTEM/ingres/ckp
       II_SYSTEM/ingres/dmp

Every night, the system backs up all the above, then takes a checkpoint.

One day (late morning), Disk2 crashes. No problem, you replace the
disk, then restore your backup of II_SYSTEM, being careful not to
overwrite the jnl|ckp|dmp directories. Everyone is concerned about
lost work, but you tell them its no problem, Ingres has this terrific
journalling...

So you do the above, then issue the rollforwarddb command to recover
work from the checkpoint & journal files, expecting it to work...

WRONG WRONG WRONG!

What happens is that Ingres looks at the aaaaaaaa.cnf file in the
data directory (not the dmp copy), sees that its different from the one in
the dmp directory, aborts the rollforward, and -without warning, mind you-
OVERWRITES the file in the dmp directory!

STUPID STUPID STUPID!

Now, the checkpoint & journal you have on Disk3 are useless! You might
as well delete them, because there is NO WAY you can use them. The
aaaaaaaa.cnf file in the data directory is the one the system uses,
and in this case it was restored from the backup. The checkpoint and
journal files can only be accessed if you have the CURRENT copy of
the aaaaaaaa.cnf file, which existed on the dmp directory, but was
overwritten by the abortive rollforward.

To make matters worse, the aaaaaaaa.cnf file which is "tar"ed into the
latest checkpoint (and can be extracted) is useless, since it indicates
that the checkpoint it is a part of is INVALID - of course, since the
checkpoint it is a part of has not yet completed when it is "tar"ed...

What I learned:

1) Copy the aaaaaaaa.cnf from the dmp directory to the data directory
   before doing rollforward - Ingres must be down while this is done

2) There is no Ingres tool to correct the aaaaaaaa.cnf file if it is
   in error; Ingres should be taken to task for this.

3) Make backup copies of your journal & aaaaaaaa.cnf files before
   trying a rollforward; if the roll fails it will trash these files
   and anything else that's handy.

4) The documentation makes no mention of any of this, and probably a
   lot of other *very* important things as well.

Thank you. I feel much better. Can I have my bullets back now, please?
-- MV
-----------------------------------------------------------------------------
-MV (Maarten Veerman)        Platon Consulting Limited, New Westminster, B.C.
emvy@platon.com              AIX-UNIX, Ingres, & Forest Industry Applications
-----------------------------------------------------------------------------



From: mal@winternet.com (Michael Leo)
Subject: Re: Attention DBAs: Checkpoint/Journalling has SERIOUS flaws

In article <4d6uhm$s5m@grid.direct.ca> emvy@platon.com (Maarten Veerman) writes:

Maarten, 

  Some of this is covered in my paper on the NAIUA site under the 
Ingres World 93 area.  

  But the point about backing up the dmp and jnl directories is not
there.  Nor is the destruction of the .cnf file.  Things I overlooked.

  I would recommend you do not recover the data on the II_SYSTEM disk.
If there is NO .cnf file there, Ingres will use the dmp copy.  This 
should OBVIOUSLY be documented.

  I will include this in the paper I am writing for Ingres World '96.

  Good points!

Hope you feel better soon, 

|---------------------------------------------------|
| Michael Leo              York & Associates, Inc.  |
| mal@winternet.com        (612) 921-8083 (voice)   |
|                          Minneapolis, MN, USA     |
|---------------------------------------------------| 
| NAIUA = North American Ingres Users Association   |
|         http://www.naiua.org/ingres/              |
|   Ingres FAQ is at ftp.naiua.org in /pub/ingres   |
|---------------------------------------------------|
Ingres Q & A
To William's Home Page

© William Yuan 2000

Email William