#####                                             #       #
#     #  #         ##     ####    ####            ##      ###
#        #        #  #   #       #               # #       #
#        #       #    #   ####    ####             #
#        #       ######       #       #            #       #
#     #  #       #    #  #    #  #    #            #      ###
 #####   ######  #    #   ####    ####           #####     #

 #####
#     #  #         ##     ####    ####      #     ####     ##    #
#        #        #  #   #       #          #    #    #   #  #   #
#        #       #    #   ####    ####      #    #       #    #  #
#        #       ######       #       #     #    #       ######  #
#     #  #       #    #  #    #  #    #     #    #    #  #    #  #
 #####   ######  #    #   ####    ####      #     ####   #    #  ######

#     #
##   ##  ######   #####  #    #   ####   #####    ####
# # # #  #          #    #    #  #    #  #    #  #
#  #  #  #####      #    ######  #    #  #    #   ####
#     #  #          #    #    #  #    #  #    #       #
#     #  #          #    #    #  #    #  #    #  #    #
#     #  ######     #    #    #   ####   #####    ####


Some interesting books:
Douglas R. Stinson
	Cryptography: Theory and Practice
	CRC Press 1995, 0-8493-8521-0.
Bruce Schneier
	Applied Cryptography
	John Wiley, 1996, 0-471-12845-7.


1       Symmetric Key cryptosystems		Stinson Chap 1
1.1	Some classical Cryptosystems
1.1.1	Description of methods.
1.1.2	Some cryptographic attacks
1.2	Perfect Secrecy.			Stinson Chap 2
1.3	The DES algorithm.			Stinson Chap 3
1.3.1	Description of the Algorithm.
1.3.2	Background and criticism.		Schneier Chap 12
1.4	Attacks on DES				
1.4.1	Known Plain Text, Brute Force.
1.4.2	Ciphertext-only attack, Brute force.
1.4.3	Hellman's Cubic Root Attack.
1.5	Double and Triple DES.
1.5.1	Meet-in-the-Middle attack
1.5.2	3DES triple-DES.
1.5.3	Design Myths of DES
1.6	Attacks on DES machinery.
1.6.1	Differential Cryptanalysis.
1.6.2	Differential fault Analysis
1.7	Conclusions


1       Symmetric Key cryptosystems
-----------------------------------

Suppose Alice and Bob are communicating but don't want their
communication to be overheard by others; specifically, they are
bothered by an opponent we shall call Oscar.  Alice and Bob may hide
their message using a CRYPTOSYSTEM S = (P,C,K,E,D):
 P = Plaintext alphabet
 C = Ciphertext alphabet
 K = set of keys
 E = a collection {Ek} (k in K) of functions P --> C
 D = a collection {Dk} (k in K) of functions C --> P
     satisfying D_k(E_k(x)) = x.

Alice and Bob agree secretly on a KEY k and further send y = Ek(x)
instead of x; Bob, knowing k, may apply Dk and retrieve x.  We call
the symmetric because Alice and Bob use the same key, or, to put it
differently, knowing Ek implies knowing Dk.  Oscar cannot find x
from y in the same way as Bob since he doesn't know the key used.

Usually the symbols in P, C, and K are bitstrings of a certain length
and moreover, P and C are usually the same.  Assume |P| = |C| = m; we
shall interpret symbols as numbers modulo m.


1.1	Some classical Cryptosystems
------------------------------------

Cryptography has a long history, starting back at the ancient
Egyptians; we mention here a few tricks that have been used through
the ages in order to describe some principles.

1.1.1	Description of methods.

SHIFT system.
The key is an element of Zm, and encryption means shifting m positions
through the alphabet:
	Ek(x) = x+k
	Dk(y) = y-k.
Example: with k=3 APEKOP becomes DSHNRS so Oscar won't understand.

AFFINE system.
Uses a linear transformation; k = (a,b) where a has an inverse mod m.
	Ek(x) = ax+b
	Dk(y) = (y-b).a^    (a^ means the inverse of a).

These two methods are both special cases of
PERMUTATION:
The key is a permutation pi of Zm and
	Ek(x) = pi(x)
	Dk(y) = pi^-(y).

1.1.2	Some cryptographic attacks

Why do we bother about having something more complicated than SHIFT?
Well, because it isn't good enough.  Despite the effort of Alice and
Bob Oscar will try to read the message.  To find out why he can be
ever succesful, assume Oscar has overheard the SHIFT-ciphertext
DSHNRS.
We shall adopt Kerckhoff's principle:
	OSCAR KNOWS WHAT CRYPTOSYSTEM IS BEING USED;
	ONLY THE KEY IS UNKNOWN TO HIM.

Couldn't Alice and Bob do better by keeping their algorithm secret as
well as their key?
1. This is not always possible in practice, for example if the
   algorithm is implemented in a computer program that can be freely
   bought by everyone.
2. Even if A and B do not reveal their algorithm, the possibility that
   the working of the method leaks out cannot be neglected.  And
   because it is not easy to change algorithm (it IS easy to change
   key!) it is generally agreed that SAFETY should not rely on
   secrecy of algorithms but only on secrecy of keys.
3. By making their algorithm known, Alice and Bob can rely on a vast
   society of cryptographers that will warn them if there is something
   wrong with their method.

Concluding, hiding algorithms is seen as insensible.

And so Oscar receives DSHNRS and knows it is a shifted text; because
the ciphertext is Oscar's only information, we call this a Ciphertext
Only Attack (COA).  He will shift the letters back one by one
 	DSHNRS
 	CRGMQR
 	BQFLPQ
 	APEKOP

and after three shift will learn a valid word (of Dutch langage in
this case, but by Kerckhoff's principle Oscar knows Dutch!  Alice
isn't very polite to Bob, because `Apekop' means `head of a monkey' so
Oscar assumes Alice and Bob are having a little dispute).  In the
worst case, Oscar needs about m shifts before seeing the plaintext.
In fact, Oscar is trying all the keys and there are only m of them.

This attack illustrates what is meant by Brute Force: simply trying
all the keys, and this is feasible for every system if the keyspace
is small or the opponent can mobilize a lot of computing power.

AFFINE
(phi(m) denotes the number of integers in [1..m-1] that are relatively
prime to m; we discuss number theory in Class 2 in more depth.)
The AFFINE system with m symbols has m.phi(m) keys and for m=32 this
means 512 which is within feasibility of a modern computer but I shall
demonstrate a more subtle attack.  Now
assume that Oscar knows a small fragment of the plaintext: the first
two letters, x1 and x2 say, and the ciphertext is y1 y2 y3 ...

From	y1 = a.x1 + b
 and	y2 = a.x2 + b
(a linear system of two equations with unknown a and b)
we find y1 - y2 = a (x1 - x2)

So   a = (y1 - y2) / (x1 - x2)     (if x1-x2 has an inverse)
and  b = y1 - a.x1

Oscar is unlucky if x1-x2 has no inverse, but if he also knows x3 he has
has more chances because x1-x2, x2-x3, or x3-x1 may be invertible, and
this attack is successful knowing only surprisingly few letters of
plaintext.

The attack can be extended to a COA with statistical analysis, because
not all letters are equally frequent.  Blank occurs most often (25%),
followed by e (13%) and t (12%; the exact distribution is language
dependent).  In a ciphertext of a few lines, Oscar may pick the symbols
with highest occurrence and guess them to be blank and e and carry out
the above calculations.  He will find the key in a few tries.

In this case we observe that some `internal structure' of the system
can be used to have an attack with a complexity that is much lower
than a search of the key space, and the existence of such attacks is
considered a weakness of the algorithm.
 
SUBSTITUTION
Unfortunately, even the most general system we've seen isn't very
strong.  The number of keys (32! = 2.6E35) is large enough to
withstand an attack that tries them all, but as we saw for AFFINE some
well-applied intelligence can be far more effective than `Brute
Force'.
Any permutation, though permuting the symbols, leaves the highest
symbol frequency, the second and third highest, etc, intact (though
they move to another letter).  And by counting the relative
frequencies it is easy for Oscar to bring back the blank, e and t in
position, after which he can guess his way to the other letters.


Still more complicated systems work on blocks of letters, combining
substitutions of letters by others of the alphabet with permuting
them with other letters in the text.  But these systems didn't
withstand computerized efforts to break them, so we should look for
something more!
 
1.2	Perfect Secrecy.
------------------------

We saw that in the attack against the shift system it was used that
the attacker can distinguish meaningful messages from random strings.
In fact this difference makes that most cryptosystems reveal the
content of even very short messages.  Or, rather, they do not hide
them PERFECTLY but only COMPUTATIONALLY.

Definition.
A system is COMPUTATIONALLY secure if, given a ciphertext Y but not
the key, it is computationally infeasible to extract the plaintext X.
A cryptosystem is PERFECTLY secure if, given a ciphertext Y, for
each plaintext X (of the same length) there exists a key K such
that D_K(Y) = X.

Understandably, perfect secrecy is stronger because the ciphertext
alone, without the key DOES NOT CONTAIN the plaintext information,
while in a computationally secure system the ciphertext may contain
this information.

Lemma: If system S is safe, |K| >= |P|.
Proof: Perfect secrecy states that, given Y, the function D: K -->P
given by D(k) = Dk(Y) is surjective.  []

So, in order to send a 100kB file perfectly secure we need a 100kB key
also; surprisingly, such a system exists: the one-time pad.
Alice and Bob share a (long) file of random bits, and XOR with this
file for encryption and decryption:  k = k1 ... kn

For message X = x1 .. xl, A computes y1..yl with yi = xi XOR ki
For cipher Y = y1..yl, B computes yi XOR ki and finds xi back.
The next message uses fresh bits, ie, Alice starts with k{l+1}.

Oscar, intercepting Y, still has no clue regarding X, because for each
file X it is possible that the key Y XOR X was used.

Because the size of this key is impractical, it is usually the case
that a small key protects a larger amount of data and we shall now 
study how well it can protect it.
Not all n-bit strings are meaningful in the language used by A and B.

Definition: A language L of bitstrings has ENTROPY alpha if there are
2^{alpha n} strings of length n.  The redundancy rho is 1-alpha.

Suppose a message Y = Ek(X) was intercepted; of course Dk(Y) = X, but
what happens if Y is `decoded' with a different key k'?  If a string
outside L is found the key can be rejected, but if the result is a
meaningful string, we call k' a spurious key.  WE ASSUME THE
CRYPTOSYSTEM MESSES UP THE STRUCTURE OF STRINGS SUFFICIENTLY TO MAKE
Dk'(Ek(X)) A RANDOM BIT STRING.

Lemma: The probability that k' is spurious is 2^{alpha n} / 2^n.
Proof: There are 2^{alpha n} strings in L out of a total of 2^n.  []

So the expected number of spurious keys in the key space is (at most)
	2^b . 2^{alpha n} / 2^n   =   2^b / 2^{n.rho},
and if n > b/rho, we expect less than one spurious key!
In this case, it is most likely that there is only one key that can be
used with Y and give a meaningful result; in other words, there exists
only one plaintext that is the decryption of Y.

The number b/rho is called the UNICITY DISTANCE of a cryptosystem /
languag combination and it is usually surprisingly low.  Natural
languages have high redundancy: their entropy is below 25%, so rho is
0.75.  The subtitution system, with 26! = 2.58E22 = 2^75 keys has a
unicity distance of about 100 bits (circa 20 characters)!  

The difference between perfect and computational security can be
illustrated.  Indeed, for messages longer than the unicity distance, no
perfect secrecy exists because each message has a UNIQUE decryption.
Finding it may require the inspection of all 2^b keys, which should be
infeasible so that we do have computational secrecy.

Exercises:
---------
What happens to the unicity distance if Alice and Bob compress their
messages before encryption?  Assume their language is 25% entropic and
their key is 80 bits.  What is the unicity distance and how does it
change with a 2-fold, three-fold, four-fold or five-fold compression?
 
Prove that SHIFT is perfectly secure as long as you encrypt only a
single character.

1.3	The DES algorithm.
-------------------------

Around 1970 it was recognized that cryptography should move from the
military and diplomatic apparatus to the civil industry because banks
and industries needed cryptography to do business over computer
networks.  The methods must be PUBLICLY KNOWN to allow cooperation
between various businesses.  This is widely accepted in the scientific
community (Kerckhoff!) but the idea to show your algorithms was quite
controversial in the world of the military and the American
government.  The American NBS, national bureau of standards, invited
proposals for a cryptographic standard.  The proposal of IBM was the
only one, hence `chosen' as the Data Encryption Standard, DES.

Brief DES History:
	1970	IBM has cryptosystem LUCIFER
	1972	Banks want cryptography and ask NBS for standard.
	1974	IBM develops LUCIFER into system that encrypts
		128b blocks using 112b key.
		NSA `works' on DES and presses IBM to halve the block
		and key size!!
	1976	DES adapted as standard
	1978	First inventions of PUBLIC KEY cryptography.
	1980	Invention of Hellman's Cubic Root Attack
	1990	Invention of Differential Cryptanalysis
		Proposal for IDEA
	1996	Invention of Differential Fault Analysis

1.3.1	Description of the Algorithm.
------------------------------------
The DES algorithm uses a 56 bit key and operates on blocks of 64 bits
of data.  In other words, DES specifies a collection of 2^56 functions
of {0,1}^64 to itself and their inverse.

The data undergoes 16 iterations, each driven by a 48b iteration key;
for the iteration key a selection of the 56 master bits is made (which
bits are used in each iteration is fixed and public).
In each iteration (<a href=sh2-2.jpg>sheet 2-2</a>), half of the data is passed unchanged
and combined with the key to mess up the other half; so
	Y1 = X1
	Y2 = X2 XOR f(X1,K)
And this iteration is self-inversive; indeed, with
	Z1 = Y1
	Z2 = Y2 XOR f(Z1,K)
we find Z1,Z2 = X1,X2.

Sheet 2-3 shows how you can iterate this three times, and 2-4 shows
the layout of the complete machine with its 16 rounds.
If you were to build a machine, I would give you some extra details.
Such as, which bits of the key are used in each iteration, and I would
specify f.  But you can find this information in any book.

1.3.2	Background and criticism.
--------------------------------

The algorithm was designed with hardware implementation in mind.  It
contains a lot of bit-wise operations, and operations like a shift or
permutation of bits, actually costing nothing when implemented in
hardware.
Current hardware chips are extremely fast: there are 10$ chips that
perform circa 100 million en/decryptions (1E8) per second.

DES is a VERY GOOD ALGORITHM: there is no better attack known than to
try all the keys (Brute Force Attack), but here we have the main DES
problem: the key size is too small to withstand a serious BFA.  The
original proposal used 112b keys on 128b data blocks, but an American
Agency, NSA, said that in the standard the width of data and keys
should be smaller.  Critics say they wanted this so that they would be
able to read all encrypted data themselves, and there is truth in
this; we shall study the attacks below.

Meanwhile, DES is tremendously popular in the banking and
administration world, and is not likely to disappear very soon.
Around 1990 a new algorithm, called IDEA, was developped.  This
algorithm is a little bit like DES, but it has 128 bit key and 64b
data, and it operates on the data in 16b blocks rather than individual
bits, making it more suitable for software implementation.


1.4	Attacks on DES
----------------------

Most of the attacks launched agains DES are very general in the sense
that they treat the DES algorithm as a `block box' and hence apply
against ANY cryptosystem.

1.4.1	Known Plain Text, Brute Force.
First assume Oscar has somehow learned the plain version X of ONE
intercepted ciphertext Y; how long does it take to test all keys k if
Ek(X) = Y?
Well, there are 2^56 = 7E16 keys; so with 10000 of the chips (an
investment of circa 100 k$) you can try 1E12 per second and finish in
7E4 seconds, just under a day.
The conclusion is that today DES is simply not strong enough against a
modestly rich opponent.  In 1974 the computational effort for a BFA
was restricted to a very powerful organization, and at the time there
was one: NSA.

1.4.2	Ciphertext-only attack, Brute force.
Even without knowing some plain text, Oscar is not really going to
have a bad time, because the unicity distance is deplorably small;
assume Oscar knows A and B communicate in ascii, which is 1/8 redundant.
Then the unicity distance is 56/(1/8) bit, or just 7 blocks!
So Oscar must decrypt seven blocks (instead of just 1) with every key
and test if the result is ascii.
A BRUTE FORCE ATTACK ON CIPHERTEXT ONLY IS JUST SEVEN TIMES MORE WORK
THAN A KNOWN PLAINTEXT ATTACK!
In seven days the job is done.
Compression (BEFORE encryption) helps, but no miracles should be
expected.


1.4.3	Hellman's Cubic Root Attack.
Mathematicians would view Encryption as a function on TWO parameters,
mapping a plaintext & key combination on a ciphertext:
	E: P x K --> C
Alice and Bob, having fixed k, view it as a function on plaintext
because the variable key is not of interest to them:
	Ek : P --> C.
Oscar however, does not know the key and there is an attack based on
the effect of encryption on a SINGLE plaintext; Oscar views E as a
one-way function:
	Ex:  K --> C.
(I call it one-way because Oscar doe not have any computable inverse
at his diposition.)
Oscar fixes some arbitrary x and at some point seduces Alice to
encrypt x, and overhears the result Y = Ex(k): we call this a Chosen
Plaintext Attack.

DISCUSSION: How can you ever make a chosen plaintext attack?
	(Fake Secret, Bank Card, Version number)

In fact, Hellman's algorithm is a general strategy to retrieve
inverses of one-way functions.  First consider these two extreme
strategies for Oscar:

1. Time intensive, no storage.
After hearing Y = Ex(k), Oscar tries all
2^56 possible k.  We have seen that, though this is on the border of
feasibility for some opponents, generally very time consuming.


2. No time, memory consuming:
Oscar PRECOMPUTES all 2^56 values of Ex(k), and after hearing Y can
retrieve k from his table.
Storing 2^56 pairs of 112bits takes 5E17 Bytes of storage, which is
500 million Gigabyte disks.  Not exactly feasible.

Hellman's method takes a way inbetween by trading memory for time;
the PRECOMPUTATION remains the same 2^56, but time and memory of the
actual attack are between these limits, with the same product.  The
attack is based on iterating the function Ex, but it is a bit annoying
that it takes a 56b key and has 64b output; therefore Oscar combines
Ex with an (arbitrary) REDUCTION function R that selects 56 of the 64
bits.  R.Ex is function that maps 56b on 56b, and let y=R(Y).

Suppose Oscar wants to spend M memory and T time on an attack; write S
= M.T.

Precomputation of one attack:
1. Choose M random strings (56b) X10...XM0.
2. For each i, iterate REx T times on Xi0, ie, compute a chain
	Xi0, Xi1, Xi2, ... XiT.
3. For each i, store Xi0 and XiT.

Now Oscar evaluated the function S times, namely on
	Xij    for i=1..M  and j=0..T-1.
When Oscar gets y = R(Ex(k)), he can find k in T time IF the value of
k did appear among the S values Xij.  Observe that
	IF k = Xij, THEN RE^{T-j}(k) = RE^{T-j-1}(y) = XiT
but not necessarily vice versa.

1. Iterate REx on y T times:
   Compute y, RE(y), RE(RE(y)), etc, until RE^T(y).
2. For each RE^j(y), check if it equals XiT for some i.
3. If so, compute k' = RE^{T-j-1}(Xi0) and check if Ek(X)=Y.

There may be some `false hits' in step 1 where RE^j(y) = XiT
but the key k' = RE^{T-j-1}(Xi0) does not work.  Fortunately the
expected number of false hits is very small (the expected number of
good hits also, By the way!!) so this does not influence the
complexity significantly, and the search time is about T.

The success probability equals the fraction of the keys present among
the S values Xij; unfortunately, this number can be even smaller than
S because of overlap.  Probability theory reveals that the number of
different values stays well above 0.8S as long as MT^2 < N ( = 2^56).

So we take M = T = N^1/3, yielding a table with N^2/3 values, about as
many of these different; the succes probability is then approximately
S/N, or 1/(N^1/3).  To have more success, Oscar launches N^1/3 of
these attacks independently, namely, with different reduction
functions!

As the cubic root of 2^56 is about a million, the idea is:
  - There are a million tables, each made with a different R.
  - Each table has a million pairs Xi0 XiT.
  - To search one table you iterate a million times.

Each table is about 16 MB which fit main memory; iterating a million
times plus table search takes one tenth of a second with good hardware
and with only 1000 machines you search all million tables in about 2
minutes.
The only problem is the total size of a million tables: 16000GB, so
you need a couple of optical disks here.
Alternatively, use a bit less memory and some more time!

I like to remark that Oscar's overall computational effort is still
proportional to N (the key space); the trick only allows him to do the
work IN ADVANCE and spend only little time after the interception of
Y.

QUESTION.  This is a Chosen Plaintext attack, but the value of x is in
no way special.  Why didn't I call it a Known Plaintext attack then?

1.5	Double and Triple DES.
------------------------------

The conclusion is clear: even though algorithmically sound, DES stinks
because the key size is simply TOO SMALL.  The American solution is:
"OK, if it stinks, let us do it twice and see if the smell disappears!"

Alice and Bob now communicate with a 112b key K = (k1 k2) and use
these 112b as two DES keys:  Y = Ek2(Ek1(X)).  We call this double-DES
or 2DES.
In general, we can have 56.j bit masterkeys and iterate DES j times
with a different key.

What will happen?  First consider iteration of the SHIFT system:
Alice first shifts over k1 positions, then over k2.
QUESTION: What will happen?

THEOREM:  The set of m encryption functions Ek used in the SHIFT
system is a group.
Most notably, shifting twice is the same as a single shift by a
different amount and will not make an attack any more difficult.
Alice has m^2 different ways of computing, but computes only m
different functions.

Concerning DES it has been shown in 1992 by Campbell and Wiener that
it is not a group, and with 2^112 different keys we are indeed
expecting about 2^112 different functions.

Considering brute force or even Hellman's cubic root attack, a key
size of 112b would be utterly sufficient for the decennia to come.
All `millions' in the attack become million-squares!

Unfortunately for the 2DES users (there are none BTW), breaking 2DES
does NOT require a computational effort proportional to the key space!


1.5.1	Meet-in-the-Middle attack
This attack shows that repeating any (symmetric) encryption method
does not increase the complexity of a BFA; the method is very general,
it can be applied against other algorithms.

We consider a Known Plaintext Attack, ie, assume that Oscar knows
several X and Y s.t. Y = Ek2(Ek1(X))  (we shall see that Oscar needs
about 3 or 4 64b blocks only).  Though the key-space has size 2^112,
Oscar finds the key with a computational effort of only 2^56 !!

Oscar makes a RIGHT and LEFT list, the right with all 2^56 possible
values of k1 and the LEFT with the 2^56 possibilities for k2.

Oscar takes the first pair (X,Y) and ENCRYPTS from X and DECRYPTS from
Y; he computes the N values Ek1(X) for all N k1, and the N values
Dk2(Y) for the N possible k2.
Each list contains circa 2^56 strings of length 64, ie, less than 1/256
of all strings occur in these lists.  However, the encryption method
of Alice ENSURES that there exists a single value Z such that
	Z = Ek1(X) = Dk2(Y)
for the k1, k2 actually used!!

So Oscar will trim the lists, preserving from RIGHT ONLY the keys k1
for which Ek1(X) appears in LEFT, and from LEFT
ONLY the k2 for which Dk2(Y) appears in RIGHT.
This reduces the size of each list by circa 1/256, so the lists become
2^48 long.

Step 2: Take the second pair and compute Ek1(X') for the remaining k1
and Dk2(Y') for the remaining k2.  As each list now only contains 2^48
strings, we expect only a 1/2^16 fraction of the strings in RIGHT to
appear in LEFT also and vice versa, so the lists will be reduced by a
factor of 2^16 and become 2^32 long each!

Step 3 uses the third pair (X",Y") and the lists are reduced to
contain the true key plus a few bastard values that can be eliminated
using the fourth pair.

We emphasize that this attack is NOT practically feasible due to the
enormous amount of storage required.  But it feels a bit uncomfortable
that the computational effort for breaking 2DES is not more than the
effort for DES!


1.5.2	3DES triple-DES.
So DES has a bad smell that doesn't disappear if apply it twice...
... we'll have to apply it thrice.

The key K is a 168 bit string (k1,k2,k3) and we apply DES three times
with different keys:

	EK(X) = (Ek3(Dk2(Ek1(X)))
	DK(Y) = (Dk1(Ek2(Dk3(Y)))
The second encryption is in fact a decryption, this was done to have
backward compatibility with DES: setting k1=k2=k3=k makes EK the same
as Single DES Ek.
A meet in the middle attack is possible, for example by focussing on
the intermediate result after TWO encryptions; but this would require
a precomputation with the 2^112 possible values for k1 and k2.

The bottom line is that 3DES offers a security (2^112 work) that we
had naively expected to achieve with 2DES.
Needless to say, 2DES is NOT used (it offers no significant
improvement over DES) but 3DES is tremendously popular with the
banks.

1.5.3	Design Myths of DES
The NSA has been criticized from the start about having halved the key
size of DES.  As we have seen, 56b is just TOO SMALL because it allows
all kinds of smart attacks that somehow view the algorithm as a black
box.  Indeed, we have not yet seen any attack that actually looks at
the inside of the algorithm and does something that is better than
these `black box' attacks.  This shows that the algorithmical ideas
brought into DES are very strong!  It is just a pity that the key size
was defined so small.
NSA was also criticized for their work on the f functions used inside
DES: it was a frantically denied (by NSA) public secret that NSA has
dictated to IBM how f should work, but they refused to reveal the
design criteria.  There was a public fear that persisted for about
twenty years:
i)   The key size had been chosen this small deliberately so that
     NSA (biggest computer owner in the world) would be able to
     read all DES communication in the interest of national
     security.
ii)  The design of the f function contains a dirty trick, only known
     to NSA, allowing to crack DES even faster.


1.6	Attacks on DES machinery.
--------------------------------

Actually there exist two attacks that DO use the internal working of
the DES algorithm.

1.6.1	Differential Cryptanalysis.
In 1990, two Israelis Biham and Shamir considered a simplified version
of DES, where only 3 rounds were used INSIDE the algorithm instead of
the actual 16.
They discovered that it is possible to launch a Chosen Plaintext
attack with pairs of plaintext, and using some of these pairs, read
about 48 of the key bits by analysing the output.

	Number of	Chosen		Known PT
	Rounds		PT Pairs	pairs
	---------------------------------------------
	3		3		2^17

Moreover, it was found out that the attack can be generalized to
encryption with more rounds:

	8		2^14		2^44

The cryptography scene got in a frenzy, which gradually sank away when
the complexity of the attack became clear.

	12		2^31		2^47
	14		2^39		2^51
	16		2^47		2^55.1

When this became known, NSA revealed the design secrets of DES.
NSA claimed that DFA was already known inside the NSA walls as early
as 1975 (and before that!).
The f function had been modified by them so as to have the numbers in
the last column grow as fast as possible.
16 is the first number of rounds where the KPA complexity exceeds the
normal brute force attack searching all keys!  This explains why 16
was chosen to be the number of rounds.

1.6.2	Differential fault Analysis
Another attack of the `internal' type was discovered in 1996: the idea
here is to have a DES chip (with unknown key) encrypt or decrypt some
string TWICE, but to induce in some way a bit error in one of the
computations, preferrably in the final iterations.  By comparing the
two outcomes, information about the key can be retrieved.

Though the Differential Fault Analysis can be shown to be
mathematically sound and quite effective, the practical application is
in the first place limited to situations where the attacker has the
possibility to `torture' the computing machinery a bit, such as a
smart card inserted in Oscar's shop's cardreader.
Second, it remains doubtful whether random single-bit errors can
indeed be induced reliably.


1.7	Conclusions
-------------------

The DES algorithm is called a SYMMETRIC algorithm because the keys
held by Alice and Bob are the same.  All `classical' encryption
methods (used in WWII and the cold war) are of the symmetric type and
DES can be seen as the ultimate symmetric algorithm, barring the
reservations concerning its key size.  DES has dominated our
perception of symmetric algorithms during the last two decades, and
its hegemony hasn't ended (thanks to the invention of triple DES).

The algorithm is simple and allows extremely fast hardware
implementations: the lastest chip generation achieves Gigabit per
second rate at 300$ cost.
Software implementsations are slower, because DES wasn't optimized for
software implementations.
Around 1990 a new algorithm was proposed, IDEA, that appears to be as
strong as DES, while operation on the data in chunks of 16b rather
than bit-wise, thus facilitating software implementations.  IDEA has
128b keys and encrypts in 64b blocks.