Saturday, February 25, 2012

buffer latch?

The following error got posted in the error log:
Could not open FCB for invalid file ID 21808 in
database 'USAAREP'. Table or database may be corrupted..
It was followed by series of messages:
Time out occurred while waiting for buffer latch type 3,
bp 0x147cc500, page (21808:808465440), stat 0x40d, object
ID 8:-271269168:0, waittime 500. Continuing to wait.
At this point we cannot do any backups using SQLMAINT or
BACKUP neither through EM nor QA.
Anybody heard of how to fix this? The server is 7.0 SP4.
TIA,
RobertRobert
I had a problem like this, it was fun to fix as I could
find very little info anywhere. I was using SQL 7 and it
hangs the whole database and you need to stop and start
SQL Server to fix it.
You need to identify the process that is causing the
problem, the buffer latch error message should point you
at the object. Try to work out what processes where
running against that object shortly before the problem
occured.
What was happening in my case was that we had a stored
procedure that updated a table, but due to a design fault
in the database, a small amount of duplicate records were
getting inserted to the table. At the end of the stored
procedure it called another stored procedure to delete the
duplicate records. At some time during this process an
automatic checkpoint kicked in. A combination of updated
records not yet physically written to the database, a
checkpoint trying to write them and another process trying
to delete records that had not yet been written, seemed to
cause the problem.
In the short term I changed the job to have three steps,
do the updates, perform a chackpoint and then the deletes.
For the longterm fix, I rewrote the update process to not
insert duplicates.
You possibilly have something similar going on. Out of
interest do you have a checkpoint process trying to run,
but unable to finish?
Hope this helps
Regards
John

No comments:

Post a Comment