El Gamal Encryption

 

El Gamal Encryption

El Gamal Encryption was designed by Taher Elgamal as an alternative to the RSA encryption algorithm. It is a public key (asymmetric) encryption algorithm that is based on the Diffie-Hellman Key Exchange.

Structural Differences Between El Gamal and RSA

While RSA depends on the difficulty of factorising the product of two large prime integers (also known as the prime factorisation problem), El Gamal relies on the difficulty of the Discrete Log Problem (computing discrete logarithms) in a large prime modulus.

Advantages and Disadvantages of El Gamal

Advantage: The same plaintext input gives you different ciphertext outputs each time the algorithm is run, which means that no matter how many times you give it the same input, the output will always be different.

Disadvantage: The ciphertext is $Length(2M)$ where $M$ is the message being encrypted. This means that the ciphertext produced will be twice as long as the original input.

The following explanation of El Gamal encryption assumes a brief understanding of set theory, Mathematics, Algebra, and cryptography

Mathematical Notation

  • $\mathbb{Z}$: The set of integers
  • $\mathbb{R}$: Set of real numbers
  • *: Kleene star: Zero or more of whatever it is superscripted to
  • $\mathbb{Z}_n^*$: The set of integers up to $n$ Zero or more times.
  • {$0,1$}$^*$: 0 or 1, in any combination, 0 or more times. (This includes the empty string)
  • {$0,1$}$^n$: 0 or 1, in any combination, $n$ times.
    • For example: {$0,1$}$^3$ =>$\lambda$, 000,001,010,100,110,101,011,111
    • Hash function notation: {$0,1$}$^*$=> {$0,1$}$^n$ meaning it takes a string of 1s and 0s of any length, and converts it to a string of fixed length $n$.

Definitions

Primitive roots: A primitive root (aka primitive element, or generating number) is a number modulo n, which finds that every integer relatively prime to n (ie $gcd(\alpha,n)=1$) is congruent to a power of that number $\alpha$ modulo n

In other words: \(\mathbb{Z}_n^* = {a \in \mathbb{N}: 1 \leq a \lt n, gcd(a,n)=1}\)

This means that for any set of integers up to and including n, a generating number ($\alpha$) is one where every power of $\alpha$ is congruent to a number within the set of integers up to n, that is relatively prime to n (meaning they have no common factors.)

Order of: when a number is said to be “to the order of”, a number is an order of another when it is raised to the power of another, which makes it congruent to 1 mod p.

An order is the smallest integer that will be equal to 1 mod p.

The Algorithm

  1. Find a suitable prime number p
  2. Find an element g of order p-1 (ie $g^{p-1} =1 (mod. p)$)
  3. Alice chooses a secret exponent, $n_1$ and computes $h_1=g^{n_1}$ then makes h1 public
  4. Bob does the same thing with another exponent $n_2$, computes $h_2$ and makes it public.
  5. With these, Alice and bob can arrive at the same secret without any further communication:
  6. Alice takes $h_2$ and raises it to the power $n_1$ (ie: $h_2^{n_1}$) (takes bobs public output and raises it to her secret exponent
  7. Bob does the same thing with Alice’s output and his secret exponent
  8. This means that both end up with ${g^{{n_1}^{n_2}}}$

The difficulty of this problem is the discrete logarithm problem.

Encryption

  1. Alice wants to send a message M (an integer mod P) to Bob
  2. To do this, Alice computes $C = M* h_2^{n_1} (mod. p)$ where M is the message, C is the cipher text and H2N1 is the key they both computed.
  3. To decrypt, bob will compute: $D= h_1^{n_2} (mod. p)$
  4. Now we find that D is also equal to $D= h_2^{n_1} (mod. p)$
  5. Bob then computes the inverse of D modulo P and multiplies it by C to recover M
  6. Ie: $M = D^{-1}(mod. p) * C$

One major flaw here is that the entity $h_2^{n_1}$ is used every time. What if someone already knows a message that Alice is going to send? And once A encrypts it gets the cipher text? Charlie can take the message M and compute the inverse of it to get H2N1, and can then subsequently read all the messages Alice sends from there.

Randomised encryption

To introduce randomness to the encryption, the following steps are taken:

  1. Alice takes a random integer k.
  2. She then computes $g^k =r$
  3. To encrypt the message, take Bob’s public key that was sent, and compute $C=M*h_2^k$ then send (r,C) to bob
  4. K is then discarded, meaning that the ciphertext will be different for every different value of k

The deception is as follows:

  1. Bob Computes $r^{n_2}$ using his private key, then compute $C/r^{n_2}$
  2. This works because $r^{n_2} = g^{kn_2} = h_2^k$

This is difficult to solve as given $h = g^n$ and $g$ the problem to retrieve n is called the discrete logarithm problem, and for well chosen sets this is extremely hard to solve.

Attacks on DLP

Given an element g of order p-1 with $h = g^n$

Bruteforce: Trying all the possible exponents n (this on average needs p/2 attempts) This means that if P is really big, then it can be extremely difficult.

Silver-Pohlig-Hellman reduction: this shows that the difficulty of the DLP is dependent on the size of the largest prime factor of P-1

More sophisticated approaches:

Pollards $\rho$ method : this principle is the same as for factorisation and relies on the birthday paradox: create a sequence random enough for the birthday paradox to apply but with enough structure that we can detect a collision easily. The birthday paradox has a O($\sqrt{n}$) where n is the number of samples generated.

$\mathbb{Z}_{10} = {0,1,2,3,4,5,6,7,8,9}$

\[{0,0,0}{0,1,0}{0,1,4},{3,6,9}\]