[dm-crypt] How to recover partially overwritten LUKS volume?

Arno Wagner arno at wagner.name
Sat Aug 25 22:21:33 CEST 2012


On Sat, Aug 25, 2012 at 05:07:14PM +0200, András Korn wrote:
> On Sat, Aug 25, 2012 at 1:20 PM, Arno Wagner <arno at wagner.name> wrote:
> > > At this point, I made a mistake. I re-created the degraded array with:
> > >
> > > mdadm --create /dev/md2 --level=5 --raid-devices=4 --assume-clean
> > > missing /dev/sda4 /dev/sdc4 /dev/sdb4
> > >
> > > However, I forgot to specify --metadata=0.90 (which the original array
> >
> > Not good. Never, ever, ever recreate RAID arrays, filesystems,
> > etc. without a full binary backup of the originals, unless you
> > are prepared to lose all data that was on the devices.
> 
> This is very good advice and I often give it too. :)

Ah, yes. Giving advice is easier than following it.
Happens to me too from time to time.

> > > used). I immediately rectified this, but by then mdadm had written a
> > > raid superblock somewhere where originally there was none, and now
> > > trying to luksOpen the volume with a known good passphrase results in
> > > "No key available with this passphrase".
> >
> > The default is metadata 1.2 for current mdadm. That put the
> > superblock at 4k from the start and right in the middle of the
> > first key-slot.
> >
> > > I still have the drive I removed, intact.
> >
> > It is unlikely but possible that what you lost is on there.
> 
> The original RAID5 array used a chunksize of 64k, which seems to
> suggest that the first 64k of the 0th device (which is the one I had
> removed) should still contain the overwritten LUKS data; however, the
> header was considerably larger than 64k (see below), so it seems I'm
> out of luck.

Not necessarily. You just need the header (~600 bytes) and one
intact key-slot. The keyslots are 128kiB with default
values (256kiB with XTS mode). The smaller one may have survived,
see below. An XTS keyslot will not have survived though.
 
> > To determine this you would need to find out where exactly the
> > mdadm superblock landed, extract the rest of the key-slot
> > and see whether you find that on the removed disk. If so,
> > you may have the data missing from the key-slot on the
> > removed disk.
> 
> The trouble is though that three of four disks were overwritten...

Thinking about this again, you are right. The resync will
have done additional damage. A RAID5 resync only writes parity,
computed from the other disks, to one disk per stripe. The parity
position rotates stripe by stripe: 64k written to disk 0, the next
64k to disk 1, and so on. With 4 disks, the pattern on each disk is
that 192k stay intact and then 64k of parity is overwritten. Repeat
until end.
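
To make the rotation concrete, here is a rough sketch that prints which
disk holds the parity chunk for the first few stripes. It assumes the
"left-symmetric" layout (as far as I know the mdadm default for RAID5),
where parity starts on the last disk and moves backwards. That layout
is an assumption about this array. With a different layout the order
changes, but each disk still loses one 64k chunk out of every four.

```shell
#!/bin/sh
# parity_disk STRIPE NDISKS
# Index of the disk that holds parity for the given stripe, ASSUMING the
# left-symmetric layout (parity on the last disk for stripe 0, then moving
# one disk to the left per stripe). Check the real layout with mdadm.
parity_disk() {
    echo $(( $2 - 1 - ($1 % $2) ))
}

chunk_kib=64   # chunk size of the original array
ndisks=4       # number of member disks

stripe=0
while [ "$stripe" -lt 8 ]; do
    # Stripe s occupies the per-disk byte range starting at s * chunk size.
    printf 'stripe %d: per-disk offset %d KiB, parity on disk %d\n' \
        "$stripe" $(( stripe * chunk_kib )) "$(parity_disk "$stripe" "$ndisks")"
    stripe=$(( stripe + 1 ))
done
```

Every disk shows up as the parity disk exactly once in each group of
four stripes, which is where the "192k intact, 64k overwritten" pattern
comes from.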

LUKS keyslots are 128kiB in size. So you may still be in luck,
but this is going to take a lot of time to test out and sounds 
rather unlikely. 

> > > I have some backups but they're older than I'd like; is there anything
> > > sensible I might do that could help me recover the LUKS volume?
> >
> > Not really. The only faint hope is to have the missing data
> > on the removed disk. Nothing else that I can see. Chances are
> > roughly 25% that the missing part is on the removed disk.
> 
> Even if it was RAID device #0 in the original array? Its first four
> bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
> device (if I try to luksOpen it separately).

Then the first 64k to 192k (see above) will be valid header data.
Have you tried unlocking just the removed disk with the password?
 
> > So unless you want to do some serious digging through raw
> > disk data on sector-level (and possibly writing some tools
> > for that yourself), no, nothing sensible.
> 
> I'd be up to some digging and tool-writing, but I don't know what it
> is I should be doing. :)
> 
> I think the data area that got overwritten on disks #1, #2 and #3 was
> intact on disk #0, but that didn't help (see below).
> 
> > > My first idea is to re-create the array with the removed drive
> > > included (making sure to specify the metadata version). T
> >
> > Don't do that! It will likely only destroy more data.
> 
> I meant using copies, of course.
> 
> That's what I did now: I copied the first and last 96MB of all four
> partitions to equally sized partitions on four other disks and tried
> to re-create the array with the correct parameters using these. The
> parameters are known correct now (metadata version, disk order as well
> as chunksize).

Ok, that is enough data.
 
> However, luksOpen still says "No key available with this passphrase."

Yes, because the MD 1.2 superblock killed data at 4kB offset
in the first key-slot.

> Would it make sense to try a luksFormat with the same passphrase? I
> suppose not, because a random key is likely involved...?

No. See FAQ Section 1.2 "Warnings". One of the very few
disadvantages (in this situation) of LUKS over plain dm-crypt.

> I also assume that using more than the first and last 96MB of each
> partition won't do much good either, right?

You can actually reduce that to the first 3MiB or so if you are 
going to try to recover only the LUKS header and keyslot.
 
> Am I correct in surmising that I'm screwed?

No. Unclear at this time. But expect a lot of fiddling
and you may be screwed after all.

> > You can try to puzzle the header back together
> > on different media, you do not need a data area.
> > You can also use a detached header (newer cryptsetup)
> > and work in a file. As soon as you get an unlock, you
> > can then try to repair the old header with the recovered
> > one, but not before.
> 
> How would I proceed with the detached header? Dump the header from the
> corrupted (and reassembled) RAID array into a file and experiment with
> that? 

Yes. Or use a loop-mounted file (see FAQ item 2.3)

> How is that better than using a (partial) copy of the corrupted
> array?

It is easier to handle. Doing raw sector reads/writes on disks is
harder than just reading/writing in files with offsets. With
files you can also easily do things like using "head" and "tail"
to combine pieces. For example
 
  head -c 64K /dev/sdx 

gives you the first 64kiB of disk sdx, or

  head -c 128K /dev/sdx | tail -c 64K

gives you the second 64kiB. And you can combine with cat and
">>".
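
A small helper makes this less error-prone. This is only a sketch: the
function name nth_chunk and the image file names in the comments are
made up for illustration, and it assumes the input is at least
(N+1) chunks long (otherwise tail returns the wrong bytes).

```shell
#!/bin/sh
# nth_chunk FILE N [SIZE]
# Write chunk number N (counting from 0) of FILE to stdout.
# SIZE is the chunk size in bytes; it defaults to 64 KiB (65536).
# Assumes FILE holds at least (N+1)*SIZE bytes.
nth_chunk() {
    size=${3:-65536}
    # Take everything up to the end of chunk N, then keep only the last chunk.
    head -c $(( ($2 + 1) * size )) "$1" | tail -c "$size"
}

# Example with hypothetical image files (work on copies, never on the disks):
#   nth_chunk removed-disk.img 0 >  candidate.img
#   nth_chunk array-copy.img   1 >> candidate.img
```

Working on image files instead of /dev/sdx also means a typo cannot
damage the originals.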

Come to think of it you could do all the analysis just with 
shell-scripts.

> luksHeaderBackup produces a file that is 528384 bytes in size. This is
> more than 8 RAID chunks, so it was certainly hit by the new RAID
> superblock in 3 places (on disks #1, #2 and #3).

Yes, but see above that up to 192k may be intact in a rotating fashion.
Depends on how the RAID code distributes the parity stripes.
If you are lucky, one of the key-slots made it.

> luksDump says:
> 
> Version:        1
> Cipher name:    aes
> Cipher mode:    cbc-essiv:sha256
> Hash spec:      sha1
> Payload offset: 1032
> MK bits:        128
> MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
> MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
>                 da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
> MK iterations:  10
> UUID:           5852d626-0428-4382-bca6-c04350559ceb
> 
> Key Slot 0: ENABLED
>         Iterations:             141780
>         Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
>                                 63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
>         Key material offset:    8
>         AF stripes:             4000
> Key Slot 1: ENABLED
>         Iterations:             95596
>         Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
>                                 16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
>         Key material offset:    136
>         AF stripes:             4000
> Key Slot 2: ENABLED
>         Iterations:             109766
>         Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
>                                 77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
>         Key material offset:    264
>         AF stripes:             4000
> Key Slot 3: DISABLED
> Key Slot 4: DISABLED
> Key Slot 5: DISABLED
> Key Slot 6: DISABLED
> Key Slot 7: DISABLED
> 
> FWIW, I know all three keyphrases but none of them work.

Now you have a chance of one of the keyslots being
intact on the removed disk or recoverable using all
disks. Try the thing above on the removed disk.
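
To see which parts of the array each keyslot covers, the numbers from
the luksDump output can be turned into byte ranges. A sketch, assuming
the LUKS1 on-disk layout: "Key material offset" is counted in 512-byte
sectors, and the key material is (master key bytes) x (AF stripes)
long. With MK bits 128 that is 16 x 4000 = 64000 bytes per slot here.
The chunk numbers are 64kiB chunks of the assembled array, not of a
single disk.

```shell
#!/bin/sh
# keyslot_range OFFSET_SECTORS KEY_BYTES STRIPES
# Print "START END" byte offsets of a LUKS1 keyslot (END is exclusive).
# Assumes 512-byte sectors and key material length = KEY_BYTES * STRIPES.
keyslot_range() {
    start=$(( $1 * 512 ))
    len=$(( $2 * $3 ))
    echo "$start $(( start + len ))"
}

chunk=65536   # RAID chunk size in bytes

# Offsets 8, 136 and 264 are the three enabled slots from the dump;
# 16 is MK bits 128 expressed in bytes, 4000 the AF stripes.
for off in 8 136 264; do
    set -- $(keyslot_range "$off" 16 4000)
    printf 'offset %s: bytes %s..%s, array chunks %s..%s\n' \
        "$off" "$1" $(( $2 - 1 )) $(( $1 / chunk )) $(( ($2 - 1) / chunk ))
done
```

With these values every slot straddles two adjacent array chunks
(0 and 1, 1 and 2, 2 and 3), because the first slot starts 4kiB into
the array rather than on a chunk boundary.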

If this fails, you can start to try all 5-block combinations
of the first 5 64kiB blocks from the removed and non-removed
disks (LUKS header and 3 key-slots) and see whether any
combination can be unlocked with one of the 3 passphrases.

This is about 10M combinations, so trying for one of
the keyslots at a time and respecting the ordering would be
a good idea.
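
A rough sketch of how such a search could be scripted for the simplest
case, where each block position has just two candidate sources. The
directory layout (chunk.0, chunk.1, ... in two directories), the file
name candidate.img and the function names are all made up for
illustration. The cryptsetup options --test-passphrase and --key-file
exist, but whether your version accepts a plain file directly (rather
than a loop device) and a file shorter than the payload offset is
something to check first; padding with zeros may be needed.

```shell
#!/bin/sh
# build_candidate MASK NCHUNKS DIR_A DIR_B OUT
# Assemble OUT from NCHUNKS pieces. For position i, bit i of MASK selects
# DIR_B/chunk.i (bit set) or DIR_A/chunk.i (bit clear).
build_candidate() {
    mask=$1; n=$2; a=$3; b=$4; out=$5
    : > "$out"                      # start with an empty file
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ $(( (mask >> i) & 1 )) -eq 1 ]; then
            cat "$b/chunk.$i" >> "$out"
        else
            cat "$a/chunk.$i" >> "$out"
        fi
        i=$(( i + 1 ))
    done
}

# try_all NCHUNKS DIR_A DIR_B KEYFILE
# Build every A/B combination and ask cryptsetup whether the passphrase in
# KEYFILE (no trailing newline) unlocks it. Nothing is activated.
try_all() {
    total=$(( 1 << $1 ))
    m=0
    while [ "$m" -lt "$total" ]; do
        build_candidate "$m" "$1" "$2" "$3" candidate.img
        if cryptsetup luksOpen --test-passphrase --key-file "$4" candidate.img; then
            echo "combination $m unlocks"
            return 0
        fi
        m=$(( m + 1 ))
    done
    return 1
}

# Example (hypothetical paths): try_all 5 removed/ array/ pass.txt
```

Each attempt costs a full key derivation per enabled slot, so it pays
to restrict the search to the blocks of one keyslot at a time.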

BTW, if you manage to unlock any of the keyslots, the
next thing is to get and back up the master key, see FAQ
item 6.10.


Still, you may be screwed. My completely non-scientific guess is
that you have something like a 1/4 chance of recovering a working
keyslot (and hence most of the data). You already _have_ a working
header, keep that safe if you are going to invest the
effort.


Probably best value for effort: See whether any key-slot is 
intact on the removed disk, if not, cut your losses and use the 
backup.


Arno
-- 
Arno Wagner,    Dr. sc. techn., Dipl. Inform.,   Email: arno at wagner.name 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
One of the painful things about our time is that those who feel certainty 
are stupid, and those with any imagination and understanding are filled 
with doubt and indecision. -- Bertrand Russell 

