##### ##### # # # # ## #### #### # # ### # # # # # # # # # # # # #### #### ##### # # ###### # # # # # # # # # # # # # # ### ##### ###### # # #### #### ####### # ###### # # # # # # ##### # # ###### # # # # # # # # # # # # # ###### # # ##### ### ##### # # # # # # ### # # # # # # # # # ### # # # # # #### ##### ### # # ###### # # # ## ## ###### ##### # # #### ##### #### # # # # # # # # # # # # # # # # ##### # ###### # # # # #### # # # # # # # # # # # # # # # # # # # # # # # # # ###### # # # #### ##### #### 2.1 Number Theory 2.1.1 Modular Arithmetic: the Ring Zm. 2.1.2 The multiplicative group Zm*. 2.1.3 The structure of Zm* for prime m. 2.1.4 Chinese Remainder Theorem. 2.1.5 Factoring large numbers. 2.2 Public Key Cryptography: Complexity Overview. 2.2.1 Complexity Theory about symmetric cryptography. 2.2.2 Public Key Cryptography. 2.2.3 The Knapsack system(s). 2.3 Number-Theoretic Public Key systems. 2.3.1 ElGamal's Cryptosystem. 2.3.2 RSA 2.3.3 Rabin's Square system. 2.3.4 Shamir's Key-less system. 2.4 Information Services in the Electronic Market Place. 2.4.1 RSA Decryptions. 2.4.2 Oblivious Transfer 2.5 Exercises. 2.1 Number Theory --------------------- Because modern cryptographic methods are deeply rooted in the theory of numbers, I will now treat some of the relevant theory. Don't worry too much if you don't understand everything in detail. 2.1.1 Modular Arithmetic: the Ring Zm. Consider the set Z of integers and let m be a positive integer; compute, for each number, the remainder after division by m (the quotient doesn't matter): Z: .. -m-1 -m -m+1 ... -1 0 1 2 .. m-1 m ... 2m ... rem: m-1 0 1 m-1 0 1 2 m-1 0 .... 0 .. Though there are infinitely many number, there are only finitely many residues (the numbers 0 .. m-1) and we group the numbers according to the remainder. _ DEFINITION: The residue class of x, x is {y: y mod m = x mod m}. The collection of residue classes is Zm. Because there are m different remainders, Zm has m members. The aritmetic operations of Z can be applied to Zm: _ __ _ __ ___ _____ ___ _____ LEMMA: If a=a' and b=b', then a.b = a'.b' and a+b = a'+b' As a consequence, we can define addition on residue classes. To add classes A and B, we can pick a from A and b from B, add them, and take the residue class of the result. The lemma states that the result is independent of the particular a and b chosen. Multiplication is defined similarly. To enhance confusion, we shall omit the bars that distinguish a number from its residue class. So we shall freely write 2 = 8 when computing in Z3, while 2=8 would normally sound as a curse to any self-respecting mathematician. We find that Zm is an algebraic structure that allows addition and multiplication, has identities for both, inverses wrt addition, and satisfies the same distribution laws as Z. Such a structure is called a RING. 2.1.2 The multiplicative group Zm*. We shall now consider division, as the operation that `undoes' a multiplication: division by a means you are given b = a.x and want to get x back, i.e., solve a.x = b (in Zm, of course). This is not always possible, not even for non-zero a: consider 6.x = 1 mod 15 as an impossible example. In other cases the solution is not unique, as for 6.x = 9 (mod 15) with solutions 4, 9, 14. (Don't say that 19 is also a solution, it is already listed! 6 = 19, remeber?) LEMMA: ax=b has a unique solution if a has a multiplicative inverse, i.e., there exists an a' s.t. a.a' = 1. PROOF. Assume there is such an a' and compute x = a'.b. Indeed we find a.x = a.a'.b = 1.b =b, so x is a solution. Now assume that y is also a solution. Then a.y = b and a.x = b, so a.y = a.x, or: a.(x-y) = 0. This implies that a'.a.(x-y) = a'.0 = 0, so 1.(x-y) = 0 which gives x=y, so x is indeed the only solution. [] We call the collection of divisible a Zm*: DEFINITION: Zm* = {a in Zm : Exists a' s.t. a.a' = 1 }. THEOREM: a in Zm* if and only if gcd(a,m) = 1. PROOF of ==> : Let a' be the multiplicative inverse and now a.a' = 1 (in Zm) means (in Z) a.a' - k.m = 1. As the gcd divides a and m, it divides a.a' - k.m, hence 1. So g=1 because 1 is the only divisor of 1. PROOF OF <== : The Extended Euclides algorithm to compute the gcd g also returns u and v satisfying a.u + m.v = g. If g=1, we read a.u + v.m = 1, so a.u = 1 in Zm. [] Zm* is closed under multiplication: a1, a2 in Zm* implies a1.a2 in Zm*. Also, Zm* has an identity for multiplication because 1 in Zm*. We say that Zm* is a multiplicative group. This set of residue classes is NOT CLOSED under addition, so addition is not formally a defined operation on Zm*. The chances of getting outside of Zm* are of course related to the `density' of Zm* in Zm and we shall study the size of Zm* later but define it already: DEFINITION (Euler phi function): phi(m) = |Zm*|. With multiplication, we also have exponentiation: a^k = a.a....a.a.a . Lagrange looked at what would happen if we exponentiate to the size of the group: THEOREM (Lagrange): b^phi(m) = 1. PROOF. Enumerate the phi members of Zm* x1 x2 x3 x4 ... ... x_phi and call their product S (S is in Zm*). Now multiply each number by b bx1 bx2 bx3 bx4 ... ...bx_phi and call the product of these numbers T. First, bring the b's outside the product and observe that T = b^phi . S . Second, recall that multiplication by b is undoable, hence the phi numbers in the second row are all different. That implies that these numbers are exactly all mebers of Zm*, hence the same numbers as in the first row, only permuted. Then, of course T = S. So S = T = b^phi . S and this implies b^phi = 1. [] Lagranges proof is very elegant and concise and mathematicians like these kinds of proof. We shall gain insight considerably by proving the same theory again introducing and using the notion of ORDER of an element. LEMMA: For each b there exists a positive k s.t. b^k = 1. PROOF. Enumerate the powers of b 1=b^0, b=b^1, b^2, .... b^i, ... b^j, .... As Zm* is finite, we SHALL find a number twice, so b^i = b^j, and then b^{j-i} = 1. [] DEFINITION: The smallest positive k for which b^k = 1 is called the ORDER of b, denoted ord(b). Understandably, not only b^ord(b) = 1, but also every b^y where y is a multiple of ord(b), equals 1. SHEET 4-2 computes several orders in Z21* and all orders we find are divisors of 12 which is phi(21). This, ladies and gentleman, is no coincidence. LEMMA: For each b, ord(b) | phi(m). PROOF. I shall cut Zm* in pieces, each as big as ord(b). Define a relation ~ on Zm*, where x~y means there exists an i such that x = b^i.y. This is an equivalence relation, hence it partitions Zm* in equivalence classes and each class has ord(b) members. [] (SHEET 4-2 shows the 2 equivalence classes for b=5 and m=21). COROLLARY (Lagrange!): b^phi(m) = 1. PROOF. We just showed phi is a multiple of ord(b), so b^phi = 1. [] 2.1.3 The structure of Zm* for prime m. Now assume m is a PRIME NUMBER (an odd one, actually; the even primes are barely used in cryptography). LEMMA: For prime m, phi(m) = m-1. PROOF: No number in 1..p-1 has a common divisor with m. [] Combining the size of Zm* with Lagrange's theorem gives us a nice result about prime numbers: LEMMA (Fermat's Little Theorem): If m is prime and a in Zm*, then a^(m-1) = 1. We also know that each number has an order dividing phi, but evaluating several orders in Z13* (see Sheet 4-3) reveals something very nice: Z13* contains an element (namely 2) whose order EQUALS the size of the group. This may sound like nothing special, but the consequences are enormous! (Actually, there are more numbers with this property, evaluate the orders of 6, 7, and 11.) Of course, an element of order phi(m) generates the whole Zm*, i.e., for every number a in Z13* there is an exponent i s.t. a = 2^i. An element of order phi is therefore called a GENERATOR and a group that has a generator is called a CYCLIC group. THEOREM (Euler): If m is prime, Zm* is a cyclic group. Given an odd number m, it is computationally feasible to find out if it is a prime number, and if so, to find a generator alpha of Zm*. DEFINITION. For prime m and alpha a generator of Zm*, the INDEX of b is the exponent i for which alpha^i = b. The index is an element of Z{m-1}. The index is sometimes called the DISCRETE LOGARITHM because it satisfies all nice rules we know for logarithms: product rule ind(a.b) = ind(a) + ind(b). power rule ind(a^k) = k.ind(a). base conversion ind_beta(b) = ind_alpha(b).ind_beta(alpha). Unlike its continuous counterpart, however, the discrete logarithm is (presumably) UNFEASIBLE. Given m, alpha, and i, it is easy to compute alpha^i by exponentiation. Given m, alpha, and a, it is conjectured to be infeasible to compute ind(a). So exponentiation is assumed to be a ONE-WAY function. This does not mean that indexes are worthless, because we can prove nice things with them without computing them. DEFINITION: Element a in Zm* is called a Quadratic Residu if there exist x with x^2 is a. THEOREM: Exactly half the numbers in Zm* are a QR, and each QR is the square of exactly 2 numbers. PROOF: a = x^2 in Zm* is the same as ind(a) = 2.ind(x) in Z{m-1} where m-1 is an even number. If ind(a) is odd, ind(a) = 2.ind(x) has no solution. If ind(a) is even, say ind(a) = 2.i, there are two solutions namely ind(x)=i and ind(x) = i+(m-1)/2. [] I'll interrupt our treatment of number theory with a small cryptographic protocol. The problem: Alice has a big machine that allows her to actually TEST if a given number b is a qr in Zm* and she offers a QR-test for money: send an Euro and a number and she returns a yes/no. Bob is interested in the service: he has a number b but does not know if it is a qr or not, However, his number is SECRET, he doesn't want Alice to know his number! The Solution: Bob mulitplies b with a RANDOM SQUARE, i.e., sends Alice the number b.x^2 for random x. (b.x^2) is a qr iff b itself is, so the answer of Alice applies for Bob's secret. Alice sees b.x^2 but not knowing x she cannot tell what b is. We call this operation BLINDING and blinding protects a secret PERFECTLY; no matter how much computing power Alice will mobilize, she cannot determine b from b.x^2 because the information simply isn't there. I must emphasize that Alice doesn't care if Bob blinds his number: he pays for the service and Alice is interested in making money out of her qr-tester, not to violate the privacy of her clients! We find that there are modes between `friend' and `foe', unlike in the historical contexts where either Alice is a friend and may everything of Bob, or she is a foe and not trusted to anything. In the Electronic Marketplace we regularly do business in situations of partial trust. Extension: In this solution Alice will know at the end whether Bob's secret is a qr or not. Extend the solution so that Alice will gain NO KNOWLEDGE about Bob's secret. Actually, Alice doesn't need a big machine; it suffices for her to use some software implementing Euler's criterion: THEOREM (Euler): b is a qr iff b^(m-1)/2 = 1. PROOF. Observe that (even though we cannot really compute indices!) ind( b^(m-1)/2 ) = (m-1)/2. ind(b). If b is a qr, its index is EVEN and so (m-1)/2.ind(b) = 0 in Z{m-1} so ind( b^(m-1)/2 ) = 0 and b^(m-1)/2 = 1. If b is a non-qr, its index is ODD and (m-1)/2.ind(b) = (m-1)/2 so ind( b^(m-1)/2 ) !=0 and b^(m-1)/2 != 1. [] Exercise: Show that for a non-qr b^(m-1)/2 = -1. 2.1.4 Chinese Remainder Theorem. Besides prime moduli, a second case of modular aritmetic that is of interest for cryptography is the situation where m is the product of two primes p and q. LEMMA phi(pq) = (p-1)(q-1). PROOF. Of the pq-1 numbers 1 .... pq-1, p-1 numbers are a multiple of q, and q-1 numbers are a multiple of p, and we subtract these to find phi = pq-1 - (p-1) - (q-1). (There are no COMMON multiples of p and q in this range). [] If you seek a number in Zm and know its remainder mod p and mod q, your number is uniquely determined and can be computed efficiently. THEOREM (Chinese Remainder Theorem): Let p and q be relatively prime; the equations x = ap mod p x = aq mod q have a unique solution modulo pq. PROOF (non-constructive). Map the m numbers of Zm on their residues mod p and mod q, i.e, define f: Zm --> Zp x Zq by : x --> (x mod p , x mod q). Is it possible that two numbers have the same image, i.e. f(x) = f(y)? f(x) = f(y) ==> (x mod p , x mod q) = (y mod p , y mod q) ==> p | (x-y) AND q | (x-y) ==> pq | (x-y) (use p&q are coprime) ==> m | x-y , so x=y in Zm. So f is INJECTIVE, but as |Zm| = | Zp x Zq | this implies it is SURJECTIVE as well, or: f is bijective. Indeed for every (ap,aq) in Zp x Zq there exists a UNIQUE x in Zm with f(x) = (ap,aq). [] To construct x from ap and aq, let yp = q' in Zp (the inverse) and mp = q.q'. Now mp = 0 mod q, and mp = 1 mod p. Similarly, have mq = p.p'. Given (ap,aq), take x = mp.ap + aq.mq in Zm; it is easy to verify that x mod p =ap and x mod q = aq. Here is an important application of CRT. THEOREM: A qr in Zm is the square of exactly four numbers.. PROOF. Let a be a square, say a=x^2, and consider the equation a=y^2. Now y^2 = x^2 mod m <==> y^2 = x^2 mod p AND y^2 = x^2 mod q <==> [ y=x mod p OR y = -x mod p] AND [ y=x mod q OR y = -x mod q] <==> (1) y = x mod p AND y = x mod q (2) OR y = -x mod p AND y = -x mod q (3) OR y = x mod p AND y = -x mod q (4) OR y = -x mod p AND y = x mod q Each of the four cases determines one number uniquely by CRT. [] Looking closer, we see that (1) describes y=x and (2) describes y=-x. Call the solution to (3) z, and see that then (4) describes -z. 2.1.5 Factoring large numbers. Given p and q it is easy to compute their product m. However, given a large m, product of two large primes, it is VERY DIFFICULT to find the two factors, even though they are uniquely defined by their product. CONJECTURE: Factoring is infeasible. We shall derive a few consequences. LEMMA: Given m, it is infeasible to compute phi(m). PROOF. As phi(m) = (p-1)(q-1) = m - p - q + 1, we derive m - phi(m) + 1 = p+q, and knowing their SUM and PRODUCT would reveal the factors. So a computation producing phi(m) would find the factors as well, which cannot happen in a feasible computation by the conjecture. [] LEMMA: Given m, it is not feasible to compute two numbers x and z with the same square but not z = -x. PROOF. z^2 = x^2 implies (x-z)(x+z) = 0 in Zm; as neither x-z or x+z is 0 in Zm, it follows that one is 0 in Zq and the other in Zp. So (x+z) is a multiple of either p or q, and then gcd(x+z,m) is a factor. So this will not show up in a feasible computation. [] The best attempts in factoring all try to get hold of two numbers with the same square. LEMMA: Given m and a q.r. a, it is not feasible to find a square root x of a. PROOF. If you can find square roots mod m, try the following. Take a random z and compute a root ix of z^2. Because z^2 has four roots and z is a random one, with probability 1/2 x and z form a pair as in the previous lemma. [] EXERCISE: Alice offers a Discrete Log (for some alpha, p) service over the internet: Send her an Euro and a number b, she will return you ind_alpha(b). Bob is interested in knowing the index of his number b, but b is secret. How will Bob use Alice's service without revealing b? 2.2 Public Key Cryptography: Complexity Overview. ----------------------------------------------------- Is there life after DES and what does it look like? We have already seen in Class 1 that symmetric cryptosystems do not offer perfect secrecy for messages beyond the unicity distance; for long messages the content is only protected computationally. Can one ever prove that some computation is expensive? That is precisely what we have complexity theory for. Can complexity theory prove something like Given a ciphertext Y encoded with a b-bit key and longer than unicity distance, it requires exponential time (c^b for some c>1) to find the plaintext X. 2.2.1 Complexity Theory about symmetric cryptography. However we observe that if Oscar could `guess' the key it takes him only polynomial time to run Ek and verify the correctness of his guess; that is, an attack against a symmetric cryptosystem falls in the class of POLYNOMIALLY VERIFYABLE problems, a class called NP in complexity theory; see SHEET 5-3. On the ground level of the picture we find polynomial problems such as sorting, and in the sky we find exponential problems like Chess variants. In `outer space' we find doubly exponential ones like Presburger Arithmetic decision. The `NP-building' consists of all problems that have polynomially verifiable solutions, and the problems in its highest floor are called NP-Complete (NPC) problems. We shall now place some cryptographic and number-theoretic problems in this picture. 1. Breaking a symmetric system is in NP. 2. Factoring is in NP because checking factors is simply done by multiplication. 3. The Discrete Log is in NP, because given alpha, i, and b, it is polynomial to VERIFY if log_alpha(b) =i , namley check if alpha^i = b. 4. If f is ANY polynomially computable function, its inverse is in NP because one can check the `guess' f_1(y) = x by seeing if f(x) = y. Intuitively, FINDING a solution is much more difficult than CHECKING one; despite all efford, complexity theory has never been able to prove this as a fact. WE DO NOT KNOW IF THE NPC BUILDING IS HIGHER THAN ITS GROUND FLOOR. This is an inherent danger of cryptography: if it would ever turn out that NP = P, all cryptographic attacks on systems using polynomial encryption/decryption are themselves polynomial! So in cryptography we say that only PESSIMISTS believe P=NP. 2.2.2 Public Key Cryptography. In 1976 a young student, Martin Hellman, was assigned by his professor Whitfield Diffie the task of designing a cryptosystem. Like most brilliant students, Hellman was very bad in reading the problem statement very carefully and he misinterpreted Kerckhoff's Law. Hellman thought the ENCRYPTION ALGORITHM was known to Oscar (instead of just the CLASS of algorithms). The encryption function Ek being known to Oscar, with the outcome Y, implies that encryption is a ONE-WAY function, i.e., one like a data-valve (sheet 5-1). To decrypt, Bob should possess the key to a trapdoor in the valve, i.e., Bob possesses some secret information related to the encryption algorithm Ek. Hellman envisioned that what we need is: 1. Keys come in PAIRS: k = (kp,ks), where kp is a PUBLIC part and ks is a SECRET part. 2. For each k, Ek uses the public key and is a ONE-WAY function. 3. For each k, Dk is efficiently computable, and uses ks. 4. (2) implies that it not feasible to compute ks from kp. Some attacks are now imaginable, and the algorithm should be secure against them. 5. The algorithm must be secure against a Chosen Plaintext Attack (CPA) because Oscar kan produce (X,Y) pairs with chosen X abundantly. 6. The plaintext alphabet P must be large, because otherwise Oscar may try a Brute Force attack over the Plaintext, ie, try each X to see if Ek(X) = Y. Understandably, Hellman's ideas were met with skepticism. Critics didn't believe that it was possible to have safe encryption while giving away the plaintext IMPLICITYLY (by giving the algorithm that computes the ciphertext). However as we have just seen, using symmetric encryption beyond unity distance ALSO gives the information away implicitly. In other words, PERFECT SECRECY does not exist (with small keys, that is), so public key encryption may be as safe as symmetric encryption. 2.2.3 The Knapsack system(s). Knowing complexity theory, we see that one-way (polynomial) functions ONLY exist if P != NP. Assuming this, a one-way function is readily constructed using an NPC problem, such as Subset Sum. PROBLEM Subset Sum: Given a1 .... an and a number S, find a subset of the ai with sum S. If P != NP, there exists no polynomial algorithm to solve this for all ai. Consequently, the function f(x1,...xn) = SUM xi.ai hides the 1-es in the bitstring x in a one-way fashion in its output. But a one-way function ALONE does not make a public-key cryptosystem: we need a trapdoor (inverse) in addition! Now though Subset Sum is NPC in the general case, for some special cases efficient solutions exist. DEFINITION: Sequence s is superincreasing if sj > SUM i= 946 so x9 = 1 and Y <-- 1643 - 946 = 697 Y >= 450 so x8 = 1 and Y <-- 697 - 450 = 247 Y >= 215 so x7 = 1 and Y <-- 247 - 215 = 32 Y < 103 so x6 = 0 Y < 45 so x5 = 0 Y >= 21 so x4 = 1 and Y <-- 32 - 21 = 11 Y >= 9 so x3 = 1 and Y <-- 11 - 9 = 2 Y < 5 so x2 = 0 Y >= 2 so, and so forth. The idea of one of the first public key systems was that Bob makes up a superincreasing sequence, but disguises it for Alice (and Oscar!) to look like an arbitrary one. Bob disguises his sequence with a SECRET prime exceeding the sum of all si, for example 2003, and a random multiplier m in Zp*, like 1289. Bob's PUBLIC sequence a is formed with ai = si.m mod p: ( 575, 436, 1586, 1030, 1921, 569, 721, 1183, 1570). Encrypting 101100111, Alice finds f(101100111) = 6665. To decrypt, Bob reduces modulo his secret 2003, divides by his secret 1289 and finds 1643, which he solves using the above method. This system, by Merkle and Hellman, impressed everybody at the time, but the system contains a very serious flaw: breaking the system is not as difficult as the subset sum problem, because the Subset Sum instances generated by the algorithm are only a SUBSET of all possible instances. The assumption P != NP implies that the WORST CASE for the Subset sum problem is super-polynomial, but that does not imply that the hidden superincreasing instances are super-polynomial. Indeed, the system was broken in 1980. 2.3 Number-Theoretic Public Key systems. ------------------------------------------- The assumption P != NP only implies that the WORST CASE for Subset Sum is super-exponential. The complexity landscape for the number-theoretic (conjectured) one-way functions is almost flat: The factoring and discrete log assumptions state, that finding p and q from their product, or indices modulo p, are infeasible for (almost) EVERY p (and q) and this explains the success of number-theoretical cryptosystems: take p and q and multiply them, and (barring a few restrictions I'll mention in a minute) there you have an almost certainly intractible instance of FACTORIZATION. A small warning is in place: whenever choosing a prime number for cryptographic use, make sure that p-1 has at least one large prime factor. This is done as follows. In stead of testing a range of 150 digit integers x, x+1, x+2 , ... for primes, start with a (say) 80-digit prime p' and search the numbers kp'+1, (k+1)p'+1, (k+2)p' + 1 , ... for primes (with 70 digit k!). 2.3.1 ElGamal's Cryptosystem. Recall the use of a blinding factor in the small protocol for computing quadratic residuity. In the ElGamal Cryptosystem, Alice sends data to Bob after multiplication by a blinding factor, chosen in such a way that Bob can find this factor and reveal the message, but Oscar cannot. Part 1: Key generation. Bob uses a large prime p (for which Discrete Logs are intractible) and a generator alpha of Zp*. He then chooses a RANDOM exponent a and computes beta = alpha^a. Bob makes p, alpha, beta PUBLIC but keeps a SECRET. Part 2. Encryption. Alice has a message x for Bob. She chooses a blinder as beta^k, where k is a random number; so her secret message x becomes y2=x.beta^k. To inform Bob about the blinder, Alice also sends y1=alpha^k. Alice sends the pair (y1,y2) to Bob. Part 3. Decryption. Bob knows (as the only one!) the relation between alpha and beta, and this allows him to find the blinder beta^k from Alice' hint alpha^k: indeed, beta=alpha^a implies beta^k= (alpha^k)^a, so Bob computes x = y2.(y1^a)^-1 (i,e., divides by the blinder y1^a). ATTACKS ON ELGAMAL. What about Oscar? For him, finding x is equivalent to finding the Blinder B because he knows their product, y2. Call a' and k' the inverses of a and k mod (p-1). Then, Oscar knows alpha = B^{a'.k'} beta = B^k' y1 = B^a' that is, he knows three powers of his target (not the exponents!!) where the third exponent is the product of the other two exponents. If Oscar were able to compute Logarithms in Zp*, he was done; assume he computes indices with basis gamma. 1. Compute a' as ind(alpha)/ind(beta). 2. Compute ind(B) as i = ind(y1) / a' 3. Compute B as gamma^i. Fortunately, we know (Hm, formally I should say `assume') that oscar cannot compute logarithms. It has not been shown, but is conjectured that solving B from B^{a'.k'}, B^k', and B^a' is computationally as hard as computing logarithms. REMARKS AND IMPLEMENTATION The system wastes communication bandwidth because for each message Alice has to send 2 numbers. Encryption is nondeterministic because of the random selection of k. As a consequence, one plaintext has many different ciphertexts. Bob can use a private p, but there is no reduction in security if a large community of users share the same p and even the same generator alpha; then p and alpha are system parameters, the secret key of Bob is a random number a, and his public key is beta = alpha^a. 2.3.2 RSA The RSA system, invented in 1978 by Ron Rivest, Adi Shamir, and Leonard Adleman, employs the fact (conjecture) the exponentiation modulo a composite is one-way for everybody not knowing the factors. Part 1. Key generation. Bob chooses large primes p and q and computes n=pq, and phi(n). Bob chooses a random number e in Zphi(n)* and computes d = e^-1 in Zphi(n)*. Bob publicizes n and e, and keeps d secret. Now both encryption and decryption are an exponentiation, where the secret exponent d undoes exponentiation by the public e. Part 2. Encryption is the function x --> x^e. Part 3. Decryption is the function y --> y^d. The reason why the d-th power is the same as the e-th root is that e.d = 1 in Zphi(n). It is basically the same rule as `taking the cube root is raising to the power 1/3'. ATTACKS ON RSA. What can Oscar do? 1. Try to factor n? No way, we have assumed this to be unfeasible. But you may argue that we have given Oscar more information, namely the exponent e. However, revealing e is like firing a blank cartridge, because e is essentially a random number; if knowledge of e helps, Oscar could also take just n and select a random number himself to factor n. 2. Try to find phi(n) in some way? No way, we have seen this to be impossible as a consequence of the intractibility of factoring. 3. Try to find d without knowing phi(n)? No way. There is a little calculation with INPUT n, d, e s.t. e.d =1 (mod Zphi(n) ) OUTPUT the factors of n (DeLaurentis, 1984). The existence of this trick implies that it is impossible to find d. It has two practical consequences, too: C1. The same n cannot be shared by several users: knowing `his' d and e, Bob can factor n and read all messages encrypted with the same n (but different e). C2. Would Bob ever leak his secret exponent d, it does not suffice to choose fresh e and d, because his modulus is now compromised. 4. Oscar finds any way to extract e-th roots in Zn. It is not known if such a method exists, but if a procedure to extract e-th roots can be used to find d, phi, or p and q, no such procedure exists. IMPLEMENTATION. Current RSA hardware handles about 600kbs, so it is 1500 times slower than DES! There is no loss of security if a fixed, small e is used: 3, 17, and 65537 are popular public exponents. Encryption is then quadratic in the key length, while arbitrary exponentiations (decryption) are cubic. Decryption can be quadrupled in speed by using the Chinese Remainder Theory: instead of one big exponentiation Bob performs two exponentiations on numbers of half the length. Chips using this have been shown to be vulnerable to a differential fault attack; Sheet 6-1. The setup here is that a smartcard proves its authenticity by showing that it knows the secret key matching some public (n,e). The verifyer gives the card an x, and the card outputs y = x^d, after which the verifier checks if the output is indeed the e-th root of the x. If the card demonstrates to be able to perform this computational tour-de-force, V will believe that the card is real. If the chip internally exploits the CRT to speed up the computation, Oscar is going to have his lucky day. He gives x = z^e to the chip, and should receive z back. During the computation, Oscar applies a modest amount of stress (cosmic rays, heat, undervoltage) to the chip in the hope that a single bit-error occurs during the computation. The most likely thing to happen is that one of the exponentiations fails, and y will equal z only modulo one of the factors! Oscar now finds this factor, and factors the modulus. The best attacks know are factoring attempts of the modulus. Currently the best algorithms executed on networks of the most powerful computers are almost able to factor 155 digit numbers (129 is the current record). So best is to have a 310 digit one (1024 bits). 2.3.3 Rabin's Square system. The bottom line is that for RSA a lot of attacks have been shown to be impossible (or rather: equivalent to factoring) but we have not shown that reading a message is impossible. Wouldn't it be nicer to have a system for which reading a single message is provably impossible? The answer is, surprisingly, NO! We have shown that computing square roots is infeasible (modulo a composit) and this is what Shamir's Square system is based on: Part 1: Key generation. Bob chooses two large primes, p and q, SUCH THAT P+1 AND Q+1 ARE A MULTIPLE OF 4. Bob announces n=pq, but keeps p and q secret. Part 2: Encryption. Alice encrypts with x --> x^2 mod n. Part 3: Decryption. Bob has to extract square roots mod n; to compute root(y), he uses CRT to split the compuation over Zp and Zq. Because p+1 is a multiple of 4, the roots of y in Zp are found as: C1 = y^{(p+1)/4} and C2 = -C1. (You can verify C1^2 = y using that y is a q.r., so y^{(p-1)/2} = 1!) Bob may combine the roots mod p and mod q in four different ways, but usually only one message will make sense. ATTACKS ON RABIN'S SYSTEM Lemma: If Oscar can find the decryption of a single message he has encrypted himself, he finds p and q with 50% probability. Proof. With 50% chance, Osar finds a root that is different from the number he started with. [] So the good news is that there is provably no way to read encrypted messages (unless factoring is easy), but there is some bad news also. What happens if Oscar manages to have unwatched access over Bob's decryption engine, and what happens if some cryptographic protocol requires Bob to decrypt a message for Oscar (as in the identification procedure)? Lemma: The algorithm is vulnerable against a Chosen Ciphertext Attack of O(1) X,Y pairs. Of course, Bob will give a number of which he knows a square root, and retrieves the secret key with fair probability of success! Actually, this practical vulnerability against a Chosen Ciphertext Attack is THE SAME PROPERTY as the theoretical strength against a Ciphertext Only Attack. The conclusion is eminent: IT IS VERY NICE IF YOUR SYSTEM IS STRONG; IT IS NOT NICE IF IT IS PROVEN TO BE STRONG. 2.3.4 Shamir's Key-less system. A symmetric system (Sheet 5-2) is like putting your letter in a box for which Alice and Bob share a key. A public Key system is like Bob handing padlocks to his friends, enabling them to lock a box (and still know what is inside!), but he holds the key and is the only one to open the box. But it is possible to communicate without shared information at all; Alice locks the box with her lock/key and sends it to Bob. Bob puts his lock on the box and sends it to Alice, Alice removes her lock and send it back, Bob removes his lock and reads the message. Here is the math behind it; there is a large prime p (publicly known) and Alice randomly picks an exponent a with inverse a' (over Z(p-1) ). Bob chooses b and b' similarly. Then Alice computes y = x^a and sends y to Bob; Bob computes z = y^b and sends z to Alice; Alice computes t = z^a' and sends t to Bob; Bob computes u = t^b' and finds x back!! Oscar sees x^a, x^b, x^ab, so three powers of his target where the third exponent is the product of the other two; this is similar as for ElGamal. 2.4 Information Services in the Electronic Market Place. ------------------------------------------------------------ An information economy is based on information services like storing, selling, and computations on data. The economical activitites are offered and performed over networks. Cryptographic techniques are used to achieve a mutual protection of client and server during the transaction. We shall now study two cases where the privacy of costumer Bob is protected cryptographically while requesting service from Alice. 2.4.1 RSA Decryptions. Alice has published an RSA key pair (n,e) and offers decryptions with the matching secret key: send her an Euro and a number b and she will compute b^d for you. (This setup may sound a bit strange to you now but we shall see later that this is an extremely basic service!!) Bob has a number y that he would like to see decrypted, but he doesn't want Alice to know y. As we have seen (in the case of the small q.r.-test), he can BLIND y and here blinding works as follows. 1. Bob chooses a random number c and computes b = c^e. 2. Bob sends y' = y.b to Alice (with the money!). 3. Alice takes the money and computes x' = y'^d. 4. Bob divides x' by c and finds x. Now observe that x = x'/c = y'^d / c = (y.c^e)^d / c = y^d.c / c = y because c^{e.d} = 1!! Alice gets no clue with regard to the actual value y of Bob because blinding conceals y PERFECTLY: LEMMA: Given y', for EVERY y there exists an c s.t. y' = y.c^e. Proof: Take c = (y'/y)^d. [] This appears as a bit strange, because the value computed by Alice could in fact be the disguise of anything! Saying that y' can be the blinded version of EVERY y, implies that the returned value x' can be UNBLINDED to every value. How can Alice be sure that Bob is not getting two results for the price of one? The reason is that, even though given y' a blinder EXISTS for every y, Bob only KNOWS it for one particular value, namely the value he started with. As you can see, finding the blinder for arbitrary values requires computing d-th powers, which Bob cannot do. 2.4.2 Oblivious Transfer Alice is selling secrets. In the context of the electronic marketplace, a secret is not always something you are supposed to keep for yourself: any valuable (=tradeble!) piece of information is called a secret. Alice has k secrets in her catalogue, and each one is the same price. For example: one Euro pays you the weather forecast for any of the next seven days. (Searching the internet you will find out quickly that the secrets are mostly pictures of nude persons or very explicit sexual activities.) Bob wants to buy a secret, but he doesn't want to reveal which one it is; Alice doesn't care, as long as he is paying for the secret. This is how Alice and Bob proceed: 0. Alice will usually put the data on her WWW server but each secret is encrypted with a different key (presumably DES). This allows us to reduce the `size' of the secrets to the size of a DES/3DES key. 1. Alice encrypts her k secrets with RSA; her (n,e) are public. 2. Bob picks the secret he wants to know and asks Alice to decrypt this one; only he sends it to Alice in blinded form!! The blinding will perfectly conceal for Alice which of her own secrets she is opening for Bob. 2.5 Exercises. ----------------- 1. Alice is a novice in the Electronic Market place. She spotted enormous public interest in Square Roots modulu a certain composit n. She invested millions of Euros in a large computation, revealing her the factors of n, and started offering a square root service mod n. She had expected to make a lot of money, but all what happened was that she received two numbers and then found out a cheaper service was opened. What had happened? 2. Alice is a wizard of the Electronic Market place. She spotted enormous public interest in Discrete Logs modulo a certain prime p. She invested millions of Euros in a large machine that can compute them, and started offering a Discrete Log service mod p. a. Bob wants to use her service but doesn't want to reveal his number. Describe how Bob can blind his input. b. Bob doesn't really trust the reliability of Alice. How can he verify the results returned by Alice?