[dm-crypt] How to recover partially overwritten LUKS volume?

András Korn korn.andras at gmail.com
Sun Aug 26 16:33:58 CEST 2012

On Sat, Aug 25, 2012 at 10:21 PM, Arno Wagner <arno at wagner.name> wrote:

>> The original RAID5 array used a chunksize of 64k, which seems to
>> suggest that the first 64k of the 0th device (which is the one I had
>> removed) should still contain the overwritten LUKS data; however, the
>> header was considerably larger than 64k (see below), so it seems I'm
>> out of luck.
> Not necessarily. You just need the header (~600 bytes) and one
> intact key-slot. The keyslots are 128kiB with default
> values (256kiB with XTS mode). The smaller one may have survived,
> see below. An XTS keyslot will not have survived though.

I used the defaults.

>> > To determine this you would need to find out where exactly the
>> > mdadm superblock landed, extract the rest of the key-slot
>> > and see whether you find that on the removed disk. If so,
>> > you may have the data missing from the key-slot on the
>> > removed disk.
>> The trouble is though that three of four disks were overwritten...
> Thinking about this again, you are right. The resync will
> have done additional damage.

I think the resync did no damage because it only wrote to the new
disk; the originals only had to be read.

I was thinking that maybe I could try to assemble the (copies of the)
original array with the 0th, 2nd and 3rd drive, leaving the 1st out
initially and then re-add it, allowing RAID5 to sync the data to it,
thereby hopefully regenerating the LUKS metadata. However, this won't
work either.

My array used the default left-symmetric layout, which afaik is:

D0 D1 D2 P0
D4 D5 P1 D3
D8 P2 D6 D7

D1, D2 and P0 are damaged. Everything else is intact.

So even omitting D1, the data that would be used to reconstruct it
would be incorrect.
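
To double-check my reading of the layout, here is a minimal Python sketch that regenerates the table above. The rotation rule is my assumption of how left-symmetric works (parity moves one disk to the left per stripe, and the data chunks continue on the disk right after the parity chunk, wrapping around). The function names are mine, not anything taken from md.

```python
def left_symmetric(chunk, n_disks=4):
    """Map a logical data chunk number to (stripe, data_disk, parity_disk)."""
    data_disks = n_disks - 1
    stripe, idx = divmod(chunk, data_disks)
    # Parity starts on the last disk and moves one disk left per stripe.
    parity_disk = data_disks - (stripe % n_disks)
    # Data chunks start on the disk after the parity chunk, wrapping around.
    data_disk = (parity_disk + 1 + idx) % n_disks
    return stripe, data_disk, parity_disk


def render(n_stripes=3, n_disks=4):
    """Return one text row per stripe, one column per disk."""
    rows = []
    for stripe in range(n_stripes):
        row = [None] * n_disks
        for idx in range(n_disks - 1):
            chunk = stripe * (n_disks - 1) + idx
            _, data_disk, parity_disk = left_symmetric(chunk, n_disks)
            row[data_disk] = f"D{chunk}"
            row[parity_disk] = f"P{stripe}"
        rows.append(" ".join(row))
    return rows


for line in render():
    print(line)
# D0 D1 D2 P0
# D4 D5 P1 D3
# D8 P2 D6 D7
```

With this rule the 0th disk holds D0, D4, D8 and so on, which is what I would expect to find on the removed disk.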

> Now resync for RAID5 only writes
> the parity of all other disks to one.

Not quite. If you add a missing disk, it will contain data as well as
parity after the resync. The parity will be computed based on the data
on the other disks as you said, while the data will be computed as a
function of the data and the parity on the other disks.
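
To illustrate what I mean, a small Python sketch with random stand-in chunks (not real disk contents; `xor_chunks` is a helper name I made up). It assumes RAID5 parity is the plain XOR of the data chunks in a stripe.

```python
import os


def xor_chunks(*chunks):
    """XOR equally sized byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


CHUNK = 64 * 1024  # chunk size of the array

# One stripe: three data chunks plus their parity.
d0, d1, d2 = (os.urandom(CHUNK) for _ in range(3))
p0 = xor_chunks(d0, d1, d2)

# A missing data chunk is rebuilt from the other data chunks and the parity.
rebuilt_d1 = xor_chunks(d0, d2, p0)
print(rebuilt_d1 == d1)

# If one of the inputs was overwritten, the rebuilt chunk is wrong as well.
overwritten_d2 = os.urandom(CHUNK)
print(xor_chunks(d0, overwritten_d2, p0) == d1)
```

The second case is the situation in my first stripe: the chunks that would be needed to regenerate the missing one are themselves damaged.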

> LUKS keyslots are 128kiB in size. So you may still be in luck,
> but this is going to take a lot of time to test out and sounds
> rather unlikely.

All chunks are 64k long. The first keyslot started somewhere in the
first part of D0, covered all of D1 and ended in the first part of D2.
It's gone for sure, because part of D1 got overwritten. If the next
keyslot followed immediately, it covered D3 and ended in D4. It was
likely also overwritten because a raid superblock was written to
D2+4k. However, the third keyslot had to start in D4 (or even D5,
depending on the specific layout; D0+D1+D2+D3 together only have room
for two keyslots).

Therefore, the third keyslot should still be intact.
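
The chunk arithmetic behind this, as a minimal Python sketch. The assumptions are mine and labelled in the code: 128 KiB keyslots stored back to back, with the first one starting 4 KiB into the device (the "Key material offset: 8" sectors from the luksDump output quoted further down). The real offsets from that dump can be plugged in instead.

```python
CHUNK = 64 * 1024        # RAID chunk size
SLOT_SIZE = 128 * 1024   # assumption: keyslot size as discussed in this thread
FIRST_SLOT = 4 * 1024    # assumption: first keyslot starts at sector 8


def chunks_covered(start, length, chunk=CHUNK):
    """Return the logical chunk numbers touched by a byte range."""
    first = start // chunk
    last = (start + length - 1) // chunk
    return list(range(first, last + 1))


for slot in range(3):
    start = FIRST_SLOT + slot * SLOT_SIZE
    chunks = chunks_covered(start, SLOT_SIZE)
    print(f"keyslot {slot}: D{chunks[0]}..D{chunks[-1]}")
# keyslot 0: D0..D2
# keyslot 1: D2..D4
# keyslot 2: D4..D6
```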

Now, how do I get to it?

>> > > I have some backups but they're older than I'd like; is there anything
>> > > sensible I might do that could help me recover the LUKS volume?
>> >
>> > Not really. The only faint hope is to have the missing data
>> > on the removed disk. Nothing else that I can see. Chances are
>> > roughly 25% that the missing part is on the removed disk.
>> Even if it was RAID device #0 in the original array? Its first four
>> bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
>> device (if I try to luksOpen it separately).
> Then the first 64k to 192k (see above) will be valid header data.
> Have you tried unlocking just the removed disk with the password?

I have, and it didn't work; but it can't be expected to work, can it?
That disk contains D0, D4, D8 etc. in this order.

>> However, luksOpen still says "No key available with this passphrase."
> Yes, because the MD header 1.2 killed data at 4kB offset
> in the first key-slot.

But I also tried with the other keys.

>> How would I proceed with the detached header? Dump the header from the
>> corrupted (and reassembled) RAID array into a file and experiment with
>> that?
> Yes. Or use a loop-mounted file (see FAQ item 2.3)
>> How is that better than using a (partial) copy of the corrupted
>> array?
> It is easier to handle. Doing raw sector reads/writes on disks is
> harder than just reading/writing in files with offsets. With
> files you can also easily do things like using "head" and "tail"
> to combine pieces. For example
>   head -c 64K /dev/sdx
> gives you the first 64kiB of disk sdx, or
>   head -c 128K /dev/sdx | tail -c 64K
> gives you the second 64kiB. And you can combine with cat and
> ">>".

Fwiw, this also works with disks (using dd), but I see what you mean.
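
For working on image files, the same slicing by offset could be sketched in Python like this. `read_range` is a hypothetical helper of mine, and the temporary file merely stands in for a disk image.

```python
import os
import tempfile


def read_range(path, offset, length):
    """Read `length` bytes starting at byte `offset` of a file or device."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


KIB = 1024

# Stand-in image: three 64 KiB blocks filled with A, B and C.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * (64 * KIB) + b"B" * (64 * KIB) + b"C" * (64 * KIB))
    image = tmp.name

first = read_range(image, 0, 64 * KIB)          # first 64 KiB
second = read_range(image, 64 * KIB, 64 * KIB)  # second 64 KiB
os.remove(image)

print(first[:1], second[:1])
# b'A' b'B'
```

Pieces read this way can be concatenated and written to a new file, which is the equivalent of combining them with cat.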

> Yes, but see above that up to 192k may be intact in a rotating fashion.
> Depends on how the RAID code distributes the parity stripes.
> If you are lucky, one of the key-slots made it.

It would appear that the third keyslot (#2) must have made it.

>> luksDump says:
>> Version:        1
>> Cipher name:    aes
>> Cipher mode:    cbc-essiv:sha256
>> Hash spec:      sha1
>> Payload offset: 1032
>> MK bits:        128
>> MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
>> MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
>>                 da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
>> MK iterations:  10
>> UUID:           5852d626-0428-4382-bca6-c04350559ceb
>> Key Slot 0: ENABLED
>>         Iterations:             141780
>>         Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
>>                                 63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
>>         Key material offset:    8
>>         AF stripes:             4000
>> Key Slot 1: ENABLED
>>         Iterations:             95596
>>         Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
>>                                 16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
>>         Key material offset:    136
>>         AF stripes:             4000
>> Key Slot 2: ENABLED
>>         Iterations:             109766
>>         Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
>>                                 77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
>>         Key material offset:    264
>>         AF stripes:             4000
>> Key Slot 3: DISABLED
>> Key Slot 4: DISABLED
>> Key Slot 5: DISABLED
>> Key Slot 6: DISABLED
>> Key Slot 7: DISABLED
> Now you have a chance of one of the keyslots being
> intact on the removed disk or recoverable using all
> disks. Try the thing above on the removed disk.
> If this fails, you can start to try all 5 block combinations
> of the first 5 64kiB blocks from the removed and non-removed
> disks (LUKS header and 3 key-slots) and see whether any
> combination can be unlocked with one of the 3 passphrases.

The way I see it, my only chance is with the 3rd slot.

However, I don't understand the above paragraph. What 5 block
combinations do you mean?

> Probably best value for effort: See whether any key-slot is
> intact on the removed disk, if not, cut your losses and use the
> backup.

I now believe that keyslot #2 is intact (provided keyslots are really
at least 128k in size, and that the left-symmetric raid5 layout is
what I think it is). I'm also fairly certain the passphrases for
keyslots #1 and #2 (the 2nd and 3rd keyslot) are identical (I'm
certain of the 3rd passphrase because I added it just a few days ago,
and I'm almost certain of the 2nd one).
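
To check the "at least 128k" proviso, the figures from the luksDump output quoted above can be turned into byte ranges. This is only a sketch under two assumptions of mine: key material offsets are counted in 512-byte sectors, and each keyslot occupies (master key bytes) x (AF stripes) bytes, which is how I understand the LUKS1 on-disk format.

```python
SECTOR = 512                 # assumption: offsets are in 512-byte sectors
CHUNK = 64 * 1024            # RAID chunk size
MK_BYTES = 128 // 8          # "MK bits: 128"
AF_STRIPES = 4000            # "AF stripes: 4000"
KEY_MATERIAL_OFFSETS = {0: 8, 1: 136, 2: 264}  # sectors, from luksDump


def slot_extent(offset_sectors):
    """Return (first_byte, last_byte) of a keyslot's key material."""
    start = offset_sectors * SECTOR
    length = MK_BYTES * AF_STRIPES
    return start, start + length - 1


def chunk_of(byte_offset):
    """Logical RAID chunk number containing a byte offset."""
    return byte_offset // CHUNK


for slot, offset in KEY_MATERIAL_OFFSETS.items():
    start, end = slot_extent(offset)
    print(f"slot {slot}: bytes {start}..{end}, "
          f"chunks D{chunk_of(start)}..D{chunk_of(end)}")
# slot 0: bytes 4096..68095, chunks D0..D1
# slot 1: bytes 69632..133631, chunks D1..D2
# slot 2: bytes 135168..199167, chunks D2..D3
```

If those assumptions hold, each slot is 64000 bytes and the slots sit 64 KiB apart, so slot #2 would begin 4 KiB into D2 rather than in D4. That is worth comparing with where the md superblocks landed.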

Could it be that cryptsetup tries to use keyslot #1 based on the
passphrase I enter, realizes that it's corrupt and throws an error
without ever trying keyslot #2? But apparently no, because specifying
an explicit --key-slot also fails.

Any suggestions?

