To add to the good advice given already: I assume that your on-line transactions are already as small as possible, and that you are not retaining shared locks (or, God forbid, exclusive locks) over user input. Best to use NOLOCK for reads (if you can) on the on-line side, provided data integrity isn't compromised by the batch job. It reduces the locks that the batch update must wait for.

The key is the design of the batch job. It must let its locks go quickly to let the on-line users update. If not, the users get locked out while your batch job wanders over your database. OK, so you code it so that it loops and commits on a row-at-a-time basis, issuing a commit every loop to release locks and let in the users. Trouble is, you have now written a row-at-a-time batch job, and it performs like a dog.

So what do you do? As Roy comments, the best thing is not to update your database from batch and on-line at the same time. If you have to, consider copies of the tables and an overnight batch process to bring them back into line. However, sometimes there are real business reasons why you have to do this.

One site I know had this problem. They had major data input from batch (barcodes loaded via LDT terminals as a flat file) going into their database during the day. These HAD to be processed, as the data was required to tell the business what would be on later flights the same day. At the same time, a team of data entry clerks banged in lots of lovely data of a similar nature from a separate manual source. This manually entered data was required on the database right away (and so couldn't be stuck in a file and uplifted like the LDT data). All this stuff hit the database at once. Let me share the pain they went through.

The on-line side was OK: NOLOCK for read, and a brief set of updates/inserts at the end of entry. The batch uplift was the problem. It was very complex, with lots and lots of integrity checking. Written in a non-Ingres 4GL, it performed row-at-a-time processing all in a single transaction. It therefore took a long time, escalated to table locks and knackered the on-line users. Commits were then added to let the on-line users in, but this extended the run times for the batch job so much that it couldn't keep up.

The whole batch program was then re-written in ESQL to do normal chunk-at-a-time (CAT) processing. Run times moved from hours to minutes (as normal). But the CAT processing depended on a large transaction, so every so often the users got locked out for three minutes while it did its stuff. Flat files and COPY were considered. But wait - COPY stuffs READLOCK=NOLOCK. Copy into a table and kiss READLOCK=NOLOCK goodbye.

Finally, the solution was:

- Read all the validation data into memory quickly and commit.
- Perform all the validation into temporary tables.
- Mark the rows in the temporary tables into sets for the updates/inserts. The size of the sets is driven by a value held in the database (so it can be varied).
- Put considerable effort into making the program re-runnable.
- Do all the updates/inserts at the end, looping through a transaction for each set (a rough sketch of this loop follows below).

If the 'no of records' value was set high, locking concurrency was crap but the batch program was dead fast. If set to 1, it was bloody slow, but concurrency was good. They played around with it and left it at 100 (nice round number). Performance is acceptable.

But what a nightmare. Only do it if you have to. If the batch volumes are too high, there is NO SOLUTION.

Enjoy.
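In case it helps to see the shape of it, here is a rough sketch of that final "transaction per set" loop, in Ingres ESQL/C since that is what the re-write used. It is not their actual code: all the table and column names (bc_config, bc_work, flight_load, set_id, row_seq) are invented for illustration, the validation and re-run handling is left out, and it assumes the session is already connected. The on-line side would separately have issued SET LOCKMODE SESSION WHERE READLOCK = NOLOCK for its reads.

    /* Sketch only - hypothetical table/column names, no re-run logic. */
    EXEC SQL INCLUDE SQLCA;

    int apply_sets(void)
    {
        EXEC SQL BEGIN DECLARE SECTION;
        int set_size;   /* rows per set, held in the database so it can be tuned */
        int max_set;    /* highest set number marked in the work table           */
        int set_no;
        EXEC SQL END DECLARE SECTION;

        /* Read the tunable set size and let the lock go straight away. */
        EXEC SQL SELECT set_size INTO :set_size FROM bc_config;
        EXEC SQL COMMIT;

        /* Mark the validated rows into sets. row_seq is a 0,1,2,... sequence
           assigned during validation, so integer division gives consecutive
           set numbers of roughly set_size rows each. */
        EXEC SQL UPDATE bc_work SET set_id = row_seq / :set_size;
        EXEC SQL SELECT MAX(set_id) INTO :max_set FROM bc_work;
        EXEC SQL COMMIT;

        /* Final pass: one short transaction per set. A big set size is fast
           but locks the users out; a set size of 1 is row-at-a-time again. */
        for (set_no = 0; set_no <= max_set; set_no++)
        {
            EXEC SQL INSERT INTO flight_load (barcode, flight_no, load_time)
                     SELECT barcode, flight_no, load_time
                     FROM   bc_work
                     WHERE  set_id = :set_no;
            if (sqlca.sqlcode < 0)
            {
                EXEC SQL ROLLBACK;   /* leave the mess to the re-run logic */
                return -1;
            }
            EXEC SQL COMMIT;         /* release the locks, let the clerks in */
        }
        return 0;
    }

Holding set_size in a table (bc_config here) is what let them swing between "dead fast, users locked out" and "dead slow, users happy" without rebuilding anything; 100 was their compromise.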
--
Mike Prollins      Apertus Systems Limited      Tel : (44) 113 256 6000
Leeds              Email : mjp@apertus.demon.co.uk