[dm-crypt] plain: opening with a wrong password

Arno Wagner arno at wagner.name
Sun Feb 8 03:59:33 CET 2015

On Sat, Feb 07, 2015 at 18:27:48 CET, dennis at basis.uklinux.net wrote:
> On Fri, Feb 06, 2015 at 07:27:29PM +0100, Arno Wagner wrote:
> > >(b) Assuming a secure passphrase, wouldn't plain mode be more
> > > secure than luks against possible vulnerabilities in the hashing
> > > algorithm that may be discovered in the future?
> > 
> > No. First, plain mode also hashes. And second, basically all
> > potential vulnerabilities of modern hash functions (collisions,
> > reversing) do not apply to the use as pasword-hashing functions. 
> > You can hash passwords with MD5 and be perfectly secure, while MD5
> > is fully broken for things like signing.
> Thank you for answering my questions. I take your point about
> plausible deniability, but your remarks about hashing have raised
> further questions for me. I had been given to understand that
> passphrase hashing makes a dictionary attack more costly or time
> consuming by forcing the attacker to evaluate the hash function for
> each passphrase attempted, and I have just checked the FAQ for
> confirmation. 

That is not what the hashing of the plain-text password is for.
You are confusing several things here:

1. Password hashing from ASCII string to binary key
2. Salting
3. Iterated hashing

The 1. step sinply serves to transform the entropy of your 
password/passphrase into something that can be fed as key to a
cipher. The cipher takes 128 bit or 256 bit of key material,
ideally with about 128 bit or 256 bit of entropy.
Your passphrase, on the other hand, may be, say 10...50 
ascii characters, but on a bit-level ASCII has low entropy.
If it is, for example, an anglish sentence, you get only
about 0.2...0.3 bit/bit of entropy in it. And if you just cut 
it down to, say, 128 bit, you end up with a 26 bit key for
the cipher, which is trivially broken.

Hashing with a crypto hash is the solution. The initial hash  
compresses the passphrase into a concentrated form that is all 
binary and has the same entropy as the input up to the maximum 
possible. You lose a tiny bit (0.00...001 bits or so) due to hash
collisions, but they are so rare as not to matter. If your 
passphrase is shorter, the hash still makes you an all binary 
key with the entropy nicely dirtributed over all bits.

Ok, so what is salting for? Salting meians instead of
hash(password) you do hash(pasword+salt) with the salt random,
non-secret and appended to the password. This servers to
defeat dictionary attacks. A dictionary attack takes 
a large table of potentiall hasswords, and stores all
hashes of these in a table sorted by the hash valye.
This means given a hash, you can retrieve the password
(if it was in the initial table size) in O(log(tablesize))
time by binary search. That is exceptionally fast and
can be done en-masse. The only potential problem is the 
table size. So what salting does is that for each potential
salt valyue, you need a separate table. That makes pre-
computing the table and re-using the table for several 
attacks infeasible, and you are down to directly trying
all hash(password+salt) for the known salt. Note that for
attacking a single password, these tables (called "rainbow
tables) do not help you. In fact, making a rainbow table
iis more effort than the direct attack (called "brute force"),
as you need to sort and store the table. But if you were
to attack, say, 250 Million or 8 Billion passwords, then
rainbow tables make this possible while individual attacks
would be infeasible or exeptionally expensive.

And then we have the hash iteration. The has iteration
served to drive the cost of each individual password
try up. So for brute force, you try a lot of hash(password).
If you hash is, say, a single sha256, each step takes, say
2us. That means you can try something like 500'000 passwords
per sesond on a typical PC on each CPU core. What iteration 
does is to use hash(hash(.....hash(password)...)) instead,
with LUKS so that it takes 1 second of iteration on container
creation. Suddenly, each attack try takes 1 second on said
srtandard PC CPU and the attacker can try 1 password per
second. Here you has a small entopy loss in each iteration,
dure to collisions. Modern hash functions have very few
collisions, so for iteration couns of a few million these
do not matter at all, but just to be safe, iteration is not
done directly, but via PBKDF2, that incorporates salting and
hashing in such a way that the entropy loss is prevented.

Of course, there is still the problem that specialized hardware
and garphics cards can hash faster. This is currently in the
process of being solved by the "Password Hashing Competition"
The result will likely be a memory-hard password hashing function,
that iterates in such a way that it needs a specific amount of 
memory (e.g. 1GB would be a sensible value on a PC) in order
to have normal speed and will suffer exponential speed-down if
less memory is available. Graphisc cards usually only have 32...64k
of fast memory per "CPU" in there and will not be faster for
these functions. FPGAS and other ASICS suffer the same problem: 
you cannot put a lot of memory in them. 

> It would seem to follow that a hash algorithm
> sufficiently prone to collisions would diminish security by not taking
> full advantage of the available key space, possibly to the point of
> making a well informed search of the key space more practical than a
> dictionary attack. 

Indeed. That is why a crypto-hash is used. They are not probe to
collisions at all, as that is one of their central design criteria.
Even broken algorithms like MD5 do not have excessive collisions,
but they have some that are (relatively) easy to find. The worst
for MD5 are that you can cleverly manipulate the plain-text in
order to get a different one with the same hash-value. This
allows you to change an X.509 certificate. Note that this is
not easy to do and only some changes are possible, but in some
cases you can, for example, turn an ordinary certificate into
a CA certificate: http://www.wired.com/2008/12/berlin/

Note that this requires you to have the plain-text that was hashed
and hence does not impact password hashing at all.

> In the degenerate case of a totally stupid hash
> algorithm that hashes every passphrase to exactly the same key, the
> attacker need only try that particular key and not even evaluate the
> hash function. In a less extreme case where the algorithm maps low
> entropy passphrases to some keys with higher probability than others,
> some of the attacker's work is done for him if he starts with the more
> probable keys. My conclusion would have been that if the passphrase is
> initially at least as secure as a random key, then hashing can never
> increase security but may decrease it. If this is a misconception, can
> you please correct it?

Indeed. You are thinking of hash functions, but not hash functions as
used in crypto. For convenience, the term "crypto" is dropped from
"crypto-hash" when discussing cryptographic applications. This is
admittedly a constant source of confusion for newcomers until somebody
explains it to them. When discussing anything crypto, you say
"hash-function" for cryoptographic hash functions and you say
"conventional hash function" when you mean non-cryptographic ones.

For a conventional hash function, such as used in hash-tables, 
collisions are a nuisance, but unavoidable in most cases. Hence 
you try to avoid them, but not very hard. Although modern 
variants like SIP-hash or SpookyHash are pretty good at it. 

Cryptograohic hash functions, on the other hand, critically need
high collision resistance (i.e. rare and hard to find collisions 
even if somebody actively searches for them) and hence are entirely 
different birds. They basically do the same as normal hashes, but 
with some very strict additional requirements, among them collisions 
resistance, hard to invert, etc.

Now, why not just use only crypto-hashes, also in hash-tables?
Simple: Crypoto hashes would work well in that application, but 
they are much slower than conventional hashes and much more 
complex to implement. Hence they are used only in special cases 
for hash-tables.

I hope this clears things up a bit.


Arno Wagner,     Dr. sc. techn., Dipl. Inform.,    Email: arno at wagner.name
GnuPG: ID: CB5D9718  FP: 12D6 C03B 1B30 33BB 13CF  B774 E35C 5FA1 CB5D 9718
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it.  The very definition of 
"news" is "something that hardly ever happens." -- Bruce Schneier

More information about the dm-crypt mailing list