[cryptography] philosophical question about strengths and attacks at impossible levels
Marsh Ray
marsh at extendedsubset.com
Fri Oct 15 04:33:03 EDT 2010
On 10/14/2010 10:09 PM, Zooko O'Whielacronx wrote:
>
> This part is a technical mistake, see
> http://cr.yp.to/hash/collisioncost-20090517.pdf .
> I wouldn't be
> surprised if you are right that this is how NIST came up with the
> requirement for a 512-bit hash function though.
NIST may not really need a reason to standardize a 512-bit function. If
somebody somewhere is going to use 512-bit hashes, regardless of whether
this size is justified, it's probably better for everyone's sake that
they're at least standardized.
Apparently SHA-2-384 is what they use for TOP SECRET material:
http://www.nsa.gov/ia/programs/suiteb_cryptography/index.shtml
Of course, they don't say what they use for the even more secret stuff.
SHA-2-512 is derived from SHA-2-256 mostly by doubling the internal
word size from 32 to 64 bits, and SHA-2-384 is a truncation of that
(computed with different initial values). So even if the actual need
were only for a 384-bit hash, it would have been harder _not_ to end up
with a SHA-2-512 as a side effect.
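As a quick sanity check of that relationship, Python's hashlib makes the
sizes easy to see, along with the detail that SHA-2-384 is more than a
bare truncation because it starts from different initial hash values:

```python
import hashlib

msg = b"abc"
sha512 = hashlib.sha512(msg).digest()
sha384 = hashlib.sha384(msg).digest()

# Both are built on the same 64-bit-word compression function.
print(len(sha512) * 8)  # 512
print(len(sha384) * 8)  # 384

# SHA-2-384 is SHA-2-512 truncated to 384 bits but computed with
# different initial hash values, so naively chopping a SHA-512
# digest does not reproduce SHA-384:
print(sha512[:48] == sha384)  # False
```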
Obviously, SHA-3 can't go without a 512-bit version if the older SHA-2
has one.
This is perhaps a simpler explanation for why we have 512-bit hash
functions. But admittedly, it's not as interesting as the idea of a
captured space alien machine out at Area 51 generating exhaustive
rainbow tables for SHA-2-256.
> I think if you want a hash function to be much less likely to fail at
> pre-image resistance (without being much more likely to fail at
> collision-resistance)
Do you know a way to improve one without improving the other?
> then you need to increase the amount of
> computation performed (while keeping the state size the same or
> larger).
There's an old saying among engineers: "Any engineer can make a bridge
that stands. It takes a great engineer to make a bridge that stands just
barely." I.e., using a minimum cost of materials.
So let's turn the question around: say you're an engineer designing a
hash function suitable for use in applications such as RFID tags. Every
picowatt-microsecond of power you consume is going to take away from the
RF transponder power and limit the data rate. Needless to say, you have
a limited budget for your state size and the number of computations you
can perform.
So how do you design a hash function that provides the best security
given limited resources? What's the most efficient conversion from N
bits of state and M CPU cycles to security? Shouldn't there be something
in the SHA-3 family for this engineer?
I don't think SHA-3 is being developed because of a real weakness in
SHA-2. It's probably because we can do so much better in terms of
overall efficiency that environments for which SHA-2 is way too
expensive will be able to move off of MD5, MD4, or whatever hacked-up
LFSR they're using now.
> For example, you could take SHA1 and double the number of
> rounds. This would take twice as much CPU and it would be *much* less
> likely to fail at pre-image resistance than SHA1.
It really depends.
The rounds of SHA-1 are not all the same like they are in SHA-2, so
there are various ways you could go about doubling them. Given that
there are significant attacks against them already, are you sure you'd
want to spend your resources on just more of the same?
What if an attack focused on only the first rounds, or only the last
rounds, or on both independently? After doubling, you'd have the same
calculations in the same places.
So in any case, instead of the current SHA-1 (a 160-bit hash function
with attacks against it that are worrying but not quite practical), you
would end up with a 160-bit hash function that has half the throughput,
cannot take advantage of any off-the-shelf hardware acceleration, is not
approved for use in any setting which approves such things, is
incompatible with every existing standard, and still has attacks against
it, though they are probably less practical.
> (Whether it would be
> *more* likely to fail at collision resistance I'm not sure.)
The rounds form a basic block cipher, so there's no information loss in
that part, and doubling the rounds shouldn't increase collisions. But
there may be other attacks that become available.
But if you decided to go about it by simply repeating every 64 bytes of
message data as it was fed into the input side (perhaps you had to use a
library function and couldn't modify it), then that would actually
degrade the entropy present in the output a little.
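To make that concrete, here's a minimal sketch (the function name is
hypothetical) of feeding every 64-byte block twice through an
unmodified SHA-1 library:

```python
import hashlib

def sha1_doubled_input(msg: bytes) -> bytes:
    """Feed each 64-byte block of the message twice into an
    unmodified SHA-1. This roughly doubles the compression-function
    calls without touching the library, but it changes the *input*,
    not the round structure: distinct messages get mapped onto a
    restricted, repetitive subset of inputs, which can slightly
    reduce the entropy of the output distribution."""
    h = hashlib.sha1()
    for i in range(0, len(msg), 64):
        block = msg[i:i + 64]
        h.update(block)
        h.update(block)
    return h.digest()

# Same cost profile as "doubling the rounds", very different effect:
# feeding one 64-byte block twice is just hashing a 128-byte message.
digest = sha1_doubled_input(b"A" * 128)
```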
> Likewise I think if you are willing to throw more state at it
> (double-width pipe, big fat sponge) and throw a proportionally greater
> amount of CPU at it, you can make any reasonable hash function
> much less likely to fail at collision-resistance.
Not long ago I would have unequivocally agreed with you. But I got
interested, started reading about it, and now believe the reality is
more complicated.
For example, the paper describing the Skein function describes the
process used for selecting an optimal sequence of rotation constants for
each round. It was sort of a heuristic algorithm run separately for the
different internal state sizes.
If you tried to widen a function without a detailed knowledge of the
decisions that went into its design, you'd likely make some suboptimal
choices. Look at the SHA-2-256 vs. SHA-2-512 rotation constants: how did
they decide those changes? Why did they increase the rounds to 80 and
not 96? There's also the possibility that the original designer just got
a bit lucky, so even if you followed the same principles when extending
it, you'd be rolling the dice again.
SHA-3 had 51 entrants. When a winner is selected, if you widened it, how
would your modified version fare in a new SHA-4 competition specifically
for costlier designs?
> So I do think there
> is a (fuzzy) trade-off between computational efficiency and safety
> here.
Several of the SHA-3 candidates have tunable parameters, so the
safety-cost trade-off can be decided late in the process.
I think the most relevant definition of efficiency is something like:
safe, effective security
-per-
computational cost (cycles, die area, power, etc.).
If we get a function that is optimized according to that definition,
then we can use it as a building block to satisfy any other need.
What I would like to see next is approved ways of combining hash
functions such that they provide redundant safety and allow us to apply
more CPU power like you were saying. A lot like the standardized HMAC
and block cipher modes.
A standardized algebraic notation could allow us to interchange data.
For example:
MD5 + SHA-1 = the combination of MD5 and SHA-1
(SHA-2-512 + SHA-3-512)/256 = combo with reduced 256 bit output
SHA-1*2 = SHA-1 applied twice
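One possible rendering of that notation in code, as a sketch: the
function names are mine, `sha3_512` uses today's hashlib naming (Python
3.6+), and the standard would still have to pin down what "+" really
means:

```python
import hashlib

def concat(msg: bytes, *algos: str) -> bytes:
    """'A + B': concatenate the outputs of several hash functions.
    A collision requires colliding every component at once, though
    Joux's multicollision attack means plain concatenation of
    iterated hashes is weaker than the combined length suggests."""
    return b"".join(hashlib.new(a, msg).digest() for a in algos)

def truncate(digest: bytes, bits: int) -> bytes:
    """'(...)/256': reduce a combined digest to a fixed output size."""
    return digest[:bits // 8]

def iterate(msg: bytes, algo: str, times: int) -> bytes:
    """'SHA-1*2': apply the same hash function repeatedly."""
    out = msg
    for _ in range(times):
        out = hashlib.new(algo, out).digest()
    return out

combo = concat(b"hello", "md5", "sha1")    # MD5 + SHA-1: 16 + 20 = 36 bytes
short = truncate(concat(b"hello", "sha512", "sha3_512"), 256)  # 32 bytes
twice = iterate(b"hello", "sha1", 2)       # SHA-1*2
```

Whether "+" should mean plain concatenation or something stronger (an
XOR or HMAC-style construction, say) is exactly the kind of choice a
standard would have to settle.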
The scheme might define a partial or total ordering based on security.
Perhaps the values assigned to hash primitives could be reconfigured
over time to reflect the discovery of new attacks. A protocol such as
TLS could negotiate the best combination of security and efficiency by
taking into consideration the greatly varying capabilities of each
endpoint (e.g., some endpoints have hardware acceleration for some
algorithms).
> This implies that the fastest SHA-3 candidates are the least
> safe. ;-)
I'm sure we'll get one that's very fast and unquestionably safe.
- Marsh