Database corruption issue

Feb 17, 2015 at 6:17 PM
I've received a corrupt database from a customer, and my attempts at repairing the corruption have been unsuccessful. The corruption occurred on a Windows Server 2012 R2 machine running a 32-bit application. ESE killed itself due to "a transient memory corruption"†. The next JetInit after that failed. Before I received the database, they had already attempted to fix it, so I don't know if/how things were broken further. One possibly interesting detail is that the log signature for the database file is zeroed out.

Since I couldn't get esentutl /r to recover the database, I moved on to attempts to /p it. The repair succeeded (with a warning that corruption existed, which I think was expected). At that point, the primary / only index on the largest table no longer existed. An attempt to create that index resulted in an invalid-parameter exception. Subsequent database attach attempts failed with invalid parameter, seemingly coming from ErrDBTryCreateSystemTable (I didn't try to dig deeper into that path). After going back to the original database and doing another repair, I attempted read-only access, and while I could open the table, TryMoveFirst / TryMoveNext failed with InvalidParameter. Subsequent repair attempts report nothing wrong, just this warning:
WARNING: Database file is too big (expected 33640176 pages, file is 33640432 pages)
The integrity check of that table takes a long time, and given that the "database file is too big" discrepancy is only 256 pages, I'm assuming that most of the data is still there and reachable; I'm just stumped as to where to go from here. Attempting an offline defrag out of desperation results in an illegal duplicate key error. I'm currently running an integrity check; previous checksum validations passed fine.
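For context, the read-only probe boils down to roughly this shape at the raw Jet API level (a simplified sketch, not our actual code: no error handling, the file and table names are placeholders, and turning recovery off is my assumption since the repaired copy has no usable logs). Our TryMoveFirst / TryMoveNext wrappers are essentially the JetMove calls at the end:

    #include <windows.h>
    #include <esent.h>
    #pragma comment(lib, "esent.lib")

    static void ReadOnlyProbe(void)
    {
        JET_INSTANCE inst  = 0;
        JET_SESID    sesid = 0;
        JET_DBID     dbid  = 0;
        JET_TABLEID  tblid = 0;
        JET_ERR      err;

        JetCreateInstance(&inst, "readonly-probe");
        /* skip log replay entirely -- the repaired copy has no matching logs */
        JetSetSystemParameter(&inst, JET_sesidNil, JET_paramRecovery, 0, "Off");
        JetInit(&inst);
        JetBeginSession(inst, &sesid, NULL, NULL);

        JetAttachDatabase(sesid, "history.edb", JET_bitDbReadOnly);
        JetOpenDatabase(sesid, "history.edb", NULL, &dbid, JET_bitDbReadOnly);
        JetOpenTable(sesid, dbid, "TheBigTable", NULL, 0, JET_bitTableReadOnly, &tblid);

        /* walking the primary index is where JET_errInvalidParameter shows up,
           instead of records or a normal JET_errNoCurrentRecord at the end */
        err = JetMove(sesid, tblid, JET_MoveFirst, 0);
        while (err == JET_errSuccess)
            err = JetMove(sesid, tblid, JET_MoveNext, 0);
    }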

Thank you,
Michael


† I'm not using ManagedESENT, so I'm assuming that, in addition to hardware issues, a wild pointer in our own code is also a possible cause. I'm currently trying to find the cause of that problem. Is the detection new to Server 2012? Would the way the process aborts create a crash dump if something were configured the right way, either a Windows setting or an ESE one?
Developer
Feb 17, 2015 at 7:17 PM
'Repair' should really be named 'Salvage', since it will happily drop lots of data, just in the effort to get whatever it can out of the database.

Try defragging the database (esentutl -d). It does need to be done after a Salvage, though that isn't very well publicized. In fact we generally discourage running Salvage at all.
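Concretely, the sequence I mean is (path illustrative):

    esentutl /p history.edb      (the Salvage: drops whatever it can't read)
    esentutl /d history.edb      (the offline defrag: rebuilds the database into a clean file)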
How big is the file?

Don't worry about the 'database too large' warning. Growing the file is a two-stage event: first we grow the NTFS file, then we update our internal data structures. The warning just means the two are slightly out of sync.

-martin
Feb 17, 2015 at 9:22 PM
Thank you, Martin.

The file is 256 gigs. As the file came to me, before the Salvage, it was 274,953,289,728 bytes; after the Salvage, 275,582,435,328. The vast majority of that data is in the table that is inaccessible. (Space info reports "owned" for that table at 262,762 megs out of the 262,813 total.)

I was expecting data loss to occur as part of the salvage, but basically losing 256 gigs was more than I anticipated.

I'll go back to the original database and do a Salvage on it followed immediately by a defrag to see how that behaves. But against a database that has already had four Salvages run against it ("maybe this time it'll get me something!"), the defrag fails early in the process:
Operation terminated with error -1605 (JET_errKeyDuplicate, Illegal duplicate key) after 1.453 seconds.

It leaves behind a 12 megabyte file (TEMPDFRG<randomnumber>.EDB). Comparing one of the "only salvaged" databases to the TEMPDFRG database, the record counts in one of the tables match, but for the corrupt table the TEMPDFRG database only has a single record in it. If I could say "for the defragged database, ignore all existing indexes", that would be fine with me; we can do any fixups later. Unless the problem is "there's a cycle in the source database and it keeps trying to save the exact same record", in which case that wouldn't help me much :)

While I wait for the redo of the Salvage to finish: comparing the tables listed in the space info dumps for each file, the MSysLocales and MSysObjids tables are not listed in the dump from the corrupt database, but they are in the TEMPDFRG one. I did specify /v for both dumps.

And thank you for the note about needing to run the defrag after the Salvage; I'll keep that in mind for the future. Though I'm hopeful I won't have to do this again...

-Michael
Feb 19, 2015 at 6:15 PM
Going back to the original copy and running repair followed by defrag resulted in the same "duplicate key" failure. Here's a copy of the output (both the console output and the .integ.raw file) from the Salvage, in case it helps: http://1drv.ms/1MBmmjI
Developer
Feb 19, 2015 at 9:29 PM
I assume that 'Values' is the table you're having trouble with?

Thanks a lot for the integ.raw output. Unfortunately, it really doesn't explain much about why 'values' is corrupt.
values: 1960:   checking table "values" (11)
values: 1960:       Data table will be rebuilt
values: 1960:       Long-Value table will be rebuilt
values: 1960:       Indexes will be rebuilt
values: 1960:       Long-Value refcounts will be rebuilt
values: 1960:   table "values" is corrupted
values: 1960:       data: 11 (33557233)
values: 1960:   ERROR: integrity-check of table "values" (11) finishes with error -1206 (0xfffffb4a) after 0 seconds
That must be especially frustrating after Salvage took over 15 hours (!) to run.
Developer
Feb 19, 2015 at 9:33 PM
Oh yeah, the Transient Memory Corruption stuff: we verify the database page twice. If the check failed but was previously successful, then something came along and changed the page. It may have been a rogue pointer, but bad memory is also a potential cause, as are bad device drivers (they can sometimes write to random parts of memory).
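In (very) simplified terms the check has this shape -- not the actual ESE code, just a sketch of the idea, with a stand-in checksum:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>

    typedef struct
    {
        void    *pv;               /* page image sitting in the cache        */
        size_t   cb;               /* page size in bytes                     */
        bool     fVerifiedOnRead;  /* checksum was good when read from disk  */
    } PAGE;

    /* stand-in: XOR-fold the page body and compare against a checksum in
       the first 8 bytes (the real ESE page checksum is more involved) */
    static bool FPageChecksumOk(const void *pv, size_t cb)
    {
        const uint64_t *p = (const uint64_t *)pv;
        uint64_t computed = 0;
        for (size_t i = 1; i < cb / sizeof(uint64_t); i++)
            computed ^= p[i];
        return computed == p[0];
    }

    /* Passed when it came off the disk, fails now: the page was changed
       while it sat in RAM (wild pointer, bad RAM, rogue driver), not a
       bad disk read. */
    static bool FTransientMemoryCorruption(const PAGE *ppage)
    {
        return ppage->fVerifiedOnRead && !FPageChecksumOk(ppage->pv, ppage->cb);
    }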

-martin
Feb 19, 2015 at 9:54 PM
Yup, the values table is the one that's hosed. Do you have any thoughts on where to go at this point?

Oh, I so want to blame this on a bad device driver; unfortunately, it seems unlikely. Or maybe likely, but unlikely. Both of the times (or possibly three times? I need to double-check), the edb was 256 gigs. It would be odd if it was the same driver in both cases, unless it's a generic Windows driver, and I doubt that's the source at this point. There was another part of that question, though -- is it likely that a crash dump would get created by WER in that instance? Even if it was only the LocalDumps option that was enabled?
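For reference, by the "LocalDumps option" I mean the machine-wide WER setting under this registry key (the values shown are just the usual defaults, not necessarily what's configured on the customer's machine):

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps
        DumpFolder   REG_EXPAND_SZ   %LOCALAPPDATA%\CrashDumps   (default)
        DumpCount    REG_DWORD       10                          (default)
        DumpType     REG_DWORD       1 = minidump (default), 2 = full dump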
Developer
Feb 19, 2015 at 10:14 PM
I just clued in: you're using a 32-bit application for a large database? We routinely use large databases with 64-bit processes, and small databases with 32-bit processes (esent.dll is used on Windows Phone). I know that there are performance limitations when using a 32-bit process with a large database (for one: the database cache is pretty limited in size). I don't know of any data-corruption issues with 32-bit esent.dll, but we also don't test it.

Can you move to a 64bit process?
Do you at least have a LARGE_ADDRESS_AWARE executable? (Run link -dump -headers foo.exe and look for "Application can handle large (>2GB) addresses".)

You mentioned 2 or 3 times -- so this happened on different machines? Or at different times?

"The next JetInit after that failed" -- Do you know what error it failed with?

I'm also assuming you're running on NTFS. 256 GB is rather suspicious, but it's way beyond the FAT32 limitations.


-martin
Feb 20, 2015 at 12:49 AM
Edited Feb 20, 2015 at 12:54 AM
We do have a 64-bit version available. I'll check to see if there's a reason they're not using it (we used to use a closed-source third-party compression library for some of our network traffic, and they never released a 64-bit version, so there may be some old-version/new-version constraints they're dealing with). I believe most of our large-database testing was also with the 64-bit version. fake edit: including recent AppVerifier testing at around 256 gigs.

The 32-bit version is large address aware, and hopefully not inaccurately. We routinely test by forcing top-down allocations and by forcibly reserving the bottom 2 GB of address space (roughly as sketched below), but obviously there's a chance that's not enough. :-/
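A simplified sketch of that stress (not our actual harness):

    #include <windows.h>

    /* Ask the OS for the highest available addresses first, so pointers
       above 2 GB show up early in normal allocations. */
    static void *AllocTopDown(SIZE_T cb)
    {
        return VirtualAlloc(NULL, cb,
                            MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
                            PAGE_READWRITE);
    }

    /* Reserve the bottom 2 GB (in 1 MB chunks) so that everything the
       process allocates afterwards lands above 0x80000000.  Code that
       stuffs a pointer into a signed 32-bit value, or otherwise assumes
       addresses stay below 2 GB, tends to break quickly under this. */
    static void ExhaustLow2GB(void)
    {
        for (;;)
        {
            void *pv = VirtualAlloc(NULL, 1 << 20, MEM_RESERVE, PAGE_NOACCESS);
            if (pv == NULL)
                break;                               /* address space exhausted */
            if ((ULONG_PTR)pv >= 0x80000000ul)
            {
                VirtualFree(pv, 0, MEM_RELEASE);     /* past the 2 GB line; done */
                break;
            }
            /* otherwise leave it reserved for the lifetime of the test */
        }
    }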

Two different machines. Looking for that event on the local machine, it was three times in total. The first two happened seconds apart, against different databases but with similar datasets being pushed into them. Another happened a week later, but only to one of those processes. That one seemingly occurred during the first JetInit after the crash (it was a rather passive test). However, the process has more than one thread at that point, so I'm still not ruling out our own badness.

JetInit error: I'll try to find out from them what theirs was. Our logs roll over too quickly, so it'll have to come from the Windows event log. I believe it was JET_errInvalidParameter, which was the same as one of our issues (see below).
First database:

First start:
"Unable to read page 33559110 of database path\history.edb. Error -1003."
"Database recovery/restore failed with unexpected error -1003." (repeat this after all of the other unable to read page errors)
This attempt got through the majority of the transaction log files before it failed. I don't know the exact count because the transaction log files were all deleted (I don't know why...).

Second start:
"Unable to read page 33554432 of database path\history.edb. Error -1003."
This only replayed ~3 megs of transactions before it failed. So that's probably why the page numbers are different.

Third start:
"Unable to read page 33554432 of database path\history.edb. Error -1003."
Again, three megs.

After many more starts, for some reason, the transaction log files were all deleted, and JetInit / the reported events complain that the log file was not found.


The second database:
First start:
The second transient memory error occurred while playing back the edb.jtx transaction file, so the initial recovery didn't get a chance to "gracefully" fail. It looks like it attempted to replay through all of the logs to get to that point.

Second start:
"Unable to read page 33554550 of database path2\history.edb. Error -1003."
Though, if my logs match up, and the error handling is correct, it looks like JetInit actually failed with JET_errInvalidParameter?
This one started replaying three generations after the first start did, and it fails instantly.



The filesystem is NTFS, at least in the three cases where it reproduced locally. It seems unlikely that the customer would be using ReFS, but I can try to get that information. And, using any of the FATs seems insanely unlikely.


And going back through those logs, it looks like WER did at least get some dumps in the transient memory error case -- I'll ask the customer for any dumps they may still have so I can investigate. Unfortunately, on the machine where it happened locally, the dumps are all gone. Confusingly, "Problem Reports" in Control Panel only shows the third transient memory error, for which Windows didn't collect any dumps (probably because it was similar to the previous two).

-Michael
Developer
Feb 20, 2015 at 11:41 PM
I don't suppose you have the Windows Application event log from either of those?
It is VERY odd that reads fail with -1003. That's JET_errInvalidParameter.
Usually an I/O failure comes back as -1022 (JET_errDiskIO) or something like that (JET_errFileSystemCorruption, JET_errDiskFull, JET_errInvalidPath ... stuff like that).
Sometimes (not always) we'll log a more detailed event with the error that came from the OS.

-martin
Feb 20, 2015 at 11:57 PM
We do have the event logs available. Since they may contain sensitive info (unlikely, but they're not from my machine), do you mind if I email them directly to you? I'll contact you through the CodePlex messaging thing just to exchange email addresses.

I'm also close to having a 256 GB database with a 32-bit process, so hopefully I'll be able to see it in action shortly, if it does actually reproduce.
Developer
Feb 21, 2015 at 12:13 AM
It's martinc at microsoft dot com. I'm about to leave for the weekend, though; I'll take a look on Monday. Sound OK?
Feb 21, 2015 at 12:14 AM
Yup, totally fine, thank you.