[dm-crypt] disappearing luks header and other mysteries

Ross Boylan ross at biostat.ucsf.edu
Mon Sep 15 01:25:39 CEST 2014


I went through the expansion of a partition, including the crypto
layer and the file system, apparently successfully.  But quite a
number of things are now going wrong, including my (encrypted) root
device (which I did not resize) being mounted ro after file system
errors.

The most dm-crypt-specific error is this: yesterday I got
markov:~# cryptsetup luksDump /dev/mongo/backup 
LUKS header information for /dev/mongo/backup

Version:        1
Cipher name:    aes
Cipher mode:    cbc-essiv:sha256
Hash spec:      sha1
Payload offset: 1032
MK bits:        128
....

and today
# date; cryptsetup luksDump /dev/mongo/backup 
Sun Sep 14 15:05:20 PDT 2014
Command failed: /dev/mongo/backup is not a LUKS partition

I do not presume the problem is from dm-crypt, or even from my
possible misuse of it, but I am hoping for some wisdom here on the
narrow issue of how the LUKS header could apparently disappear.  And
maybe some help with my other problems :)
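For what it's worth, one non-destructive way to see whether the header
bytes are actually gone (rather than cryptsetup failing for some other
reason) is to look for the LUKS1 magic directly.  The sketch below is
hypothetical and uses a scratch file standing in for /dev/mongo/backup
so it is safe to run anywhere; on the live system DEV would point at
the real device.

```shell
# Hypothetical, non-destructive check for the LUKS1 magic.  A LUKS1
# header begins with the six bytes 'L' 'U' 'K' 'S' 0xba 0xbe at
# offset 0 of the underlying device.  DEV is a scratch file here so
# the example is safe; on the real system it would be /dev/mongo/backup.
DEV=/tmp/fake-luks-header
printf 'LUKS\xba\xbe' > "$DEV"      # simulate an intact header
dd if="$DEV" bs=512 count=1 2>/dev/null | od -A d -c | head -n 1
# An intact header begins with the bytes  L U K S 272 276
# (272/276 are octal for 0xba/0xbe); anything else means the magic
# at offset 0 has been overwritten.
```

If the magic is gone but a backup header exists (cryptsetup
luksHeaderBackup, if one was ever made), luksHeaderRestore can put it
back; without a backup the key material is unrecoverable.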

One possible difference is that today /dev/mongo/backup is in an
active dm-crypt mapping, and the result is mounted.  However, the
relevant file system clearly has errors, e.g.,
Sep 14 15:07:54 markov kernel: [74493.436968] REISERFS warning: reiserfs-5090 is_tree_node: node level 63300 does not match to the expected one 1
Sep 14 15:07:54 markov kernel: [74493.436973] REISERFS error (device dm-57): vs-5150 search_by_key: invalid format found in block 32802. Fsck?
Sep 14 15:07:54 markov kernel: [74493.436976] REISERFS (device dm-57): Remounting filesystem read-only
Sep 14 15:07:54 markov kernel: [74493.436981] REISERFS error (device dm-57): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [2 718 0x0 SD]
when I tried to read from it.

Earlier, my root file system (ext3) also had errors and went
read-only.

/dev/mongo/backup is on a separate drive, and the LVM VG mongo is
separate from my other VG's.  It has the backups that I made before the
resize.

The root partition, among other goodies, has the keyfiles for the
encrypted disks.  In some cases, though not for mongo/backup, there is
also a pass-phrase.

A somewhat more detailed history appears below.  I think the most
important points are: the system crashed when I attempted to umount an
encrypted file system after I mistakenly wrote zeros underneath it;
the subsequent backup and expansion of the file system seemed OK; when
the logs rotated much later there were read errors on the root fs,
which was remounted ro; and just now, as I attempted to access the
backup file system (which is on a separate disk), it got read errors
and went read-only.

Also, when I partitioned the new backup disk I thought I created an
initial 1G partition, but after the crash (which destroyed my logs)
the first partition was only 512 bytes.  That may, in fact, be the way
I created it.

I would appreciate any help, as things seem to be in a pretty bad
state; even reading now looks hazardous.

I don't think I've overwritten any key areas, like the LUKS headers,
since my one disaster, but of course that's one possibility.

Ross Boylan

DETAILS for the curious

Added a new disk to the system; formatted GPT with parted; made a new
LVM VG "mongo" out of the second partition; created LV "backup",
encrypted volume based on it; filled with zeros; created (reiser) file
system.
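Roughly, as a command outline (the device name /dev/sdc, the sizes,
and the options are placeholders, not a record of exactly what I ran):

```shell
# Outline only -- device names, sizes and options are placeholders,
# not the exact commands used.  Not meant to be run as-is.
parted /dev/sdc mklabel gpt                        # new disk, GPT label
pvcreate /dev/sdc2                                 # second partition -> LVM
vgcreate mongo /dev/sdc2
lvcreate -n backup -l 100%FREE mongo
cryptsetup luksFormat /dev/mongo/backup            # LUKS on the LV
cryptsetup luksOpen /dev/mongo/backup backup_crypt
dd if=/dev/zero of=/dev/mapper/backup_crypt bs=1M  # fill with zeros
mkfs.reiserfs /dev/mapper/backup_crypt
```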

Shutdown imap daemon and made backups to mongo/backup.

Started zero wiping free space in my main VG, "turtle".
Got confused and wrote zeros to /dev/turtle/backup (or maybe the
decrypted volume at /dev/mapper/backup_crypt).  Interrupted this.

umount /dev/mapper/backup_crypt caused the system to *HANG* or crash.

Rebooting was challenging since the BIOS tried to start off the new
disk, which had no boot info on it.  Changed the BIOS to boot off the
2nd drive and the system came back.

Recreated the file system on /dev/mapper/backup_crypt, redid the
shutdown of the mail server and copied to backup.

In turtle VG wrote zeros to encrypted free space, extended the mail
spool partition to use the free space, remapped the crypto device, and
grew the file system.
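The grow step, again only as a hypothetical outline (the LV and
mapping names and the size are placeholders; the zeroing of free
space happened first, as described above):

```shell
# Outline only; names and size are placeholders, not a record of the
# exact commands.  Not meant to be run as-is.
lvextend -L +20G /dev/turtle/mailspool       # extend the LV into the free space
cryptsetup resize mailspool_crypt            # remap: grow the dm-crypt mapping
resize_reiserfs /dev/mapper/mailspool_crypt  # grow the fs (resize2fs for ext3)
```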

At this point everything looked fine.

Overnight the nightly snapshots were created (I observed this) and
remote backup ran from them (presumably).  Notably, mounting the
snapshot of the root partition did not report any file system errors.
Remote backups usually finish around 2:30am.

Around 6:30am the snapshot for the mail spool filled.

At 7:30am, as the logs were rotated, many errors were reported for the
root filesystem, which was remounted read-only.

11:10 the kernel reported that postgresql had hung, and invoked the
OOM killer.

11:10 updatedb failed (updating the db for locate).

15:07 filesystem errors for mongo/backup when I attempt to read from
it.
Sep 14 15:07:54 markov kernel: [74493.436973] REISERFS error (device dm-57): vs-5150 search_by_key: invalid format found in block 32802. Fsck?
Sep 14 15:07:54 markov kernel: [74493.436976] REISERFS (device dm-57): Remounting filesystem read-only
Sep 14 15:07:54 markov kernel: [74493.436981] REISERFS error (device dm-57): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [2 718 0x0 SD]
Sep 14 15:07:54 markov kernel: [74493.445506] REISERFS warning: reiserfs-5090 is_tree_node: node level 6437 does not match to the expected one 1
Sep 14 15:07:54 markov kernel: [74493.445509] REISERFS error (device dm-57): vs-5150 search_by_key: invalid format found in block 32771. Fsck?

