Cryptography lecture

Cryptography

Cryptography is an essential feature of modern life. The field of cryptography aims to ensure the confidentiality, integrity, and authenticity of information by employing robust encryption techniques. Cryptography is used in almost all modern communication systems.

Our goal in this lecture is to be able to use mathematics to communicate securely over an insecure network.

Along the way we will learn about some key mathematical concepts, including:

  1. Modular arithmetic
  2. Classical cryptography
  3. Public Key cryptography

Mathematics and Problem Solving

2. Cryptography

David Gundry

Learning Outcomes

    Understand and apply the mathematics behind the Caesar Cypher
    1. Solve equations involving the modulus operation
    2. Evaluate whether a congruence holds under modulus
    3. Perform addition under modulus
    4. Perform subtraction under modulus
    5. Calculate the additive inverse of a number under modulus
    6. Evaluate simple encryption and decryption cypher functions, such as those for caesar and atbash
    Understand and apply the mathematics behind the Affine Cypher
    1. Define prime numbers
    2. Find prime factors of a number
    3. Find the greatest common denominator of two numbers
    4. Perform multiplication under modulus
    5. Apply division in modular arithmetic and calculate the multiplicative inverse of a number under modulus
    6. Perform division under modulus
    7. Encrypt and decrypt strings using affine cypher
    Understand the basics of some modern encryption techniques
    1. Encrypt and decrypt strings by creating a one time pad
    2. Encrypt and decrypt using a simple stream cypher by evaluating provided random number generator
    3. Describe the basic ideas of public key cryptography
    4. Evaluate both sides of a Diffie-Hellman Key Exchange

Learning Outcomes

  1. Understand and apply the mathematics behind the Caesar Cypher
  2. Understand and apply the mathematics behind the Affine Cypher
  3. Understand the basics of some modern encryption techniques

Introduction

A key tool in mathematical cryptography is modular arithmetic. Modular arithmetic is also widely used for computer science.

Modular arithmetic is useful for cyclic systems, such as calendars. It is useful for cryptographic systems where you are rotating values within a fixed alphabet. The computational properties of exponentiation and logarithms in modular arithmetic (i.e. it is easy to compute the exponential but hard to compute the logarithm) make it useful for public-key cryptography. Public-key cryptography is essential in securing the modern Internet.

Modular arithmetic is also useful when we are dealing with systems that can only represent a finite set of numbers and 'overflow' if the numbers get too big, such as storing numbers as a fixed number of bits in a computer.

Modular Arithmetic

Overview

Our goal is to be able to understand the necessary mathematics to communicate securely over an insecure channel. We will work towards our goal step by step.

First we will look at classical cryptography and understand how to mathematically represent a basic cypher. We will learn modular arithmetic in order to encrypt and decrypt using the Caesar and Affine cyphers.

We will then turn to modern cryptography. We will learn how to use one-time pads and stream cyphers and pseudorandom number generators

Finally, we will address the problem of securely agreeing on the keys for our encryption when an eavesdropper is listening to everything we say. To do this, we will see the example of Diffie Hellman key exchange.

Overview

  1. Classical Cryptography
  2. Modular Arithmetic
  3. Prime Numbers
  4. Modern Cryptography

Classical Cryptography

Classical cryptography refers to the traditional methods of encryption and decryption that were practised before modern computer-based methods.

Classical cryptography has a long and rich history, with roots dating back thousands of years. Early civilizations, such as the Egyptians, Greeks, and Romans, developed rudimentary methods of encryption to protect messages from being intercepted. These methods often involved simple substitution ciphers, where each letter or symbol in the plaintext was replaced with a different letter or symbol in the ciphertext according to a predefined rule or key.

During World War II, the Enigma machine was used by the German military to encrypt their communication. The Enigma machine employed rotor-based encryption, creating a complex and constantly changing substitution cipher. Although the Enigma machine was initially considered unbreakable, it was eventually deciphered by the efforts of cryptanalysts, most notably the British codebreakers at Bletchley Park, led by Alan Turing.

Classical cryptography laid the foundation for modern cryptographic techniques. While classical methods are no longer considered secure for protecting sensitive information against sophisticated attacks, they hold historical significance and provide insights into the evolution of cryptographic principles. The study of classical cryptography helps in understanding the challenges faced by early cryptographers and the subsequent advancements that led to the development of stronger encryption algorithms and protocols in the modern era.

Classical Cryptography

Atbash

Let's start by considering a simple cypher. Indeed, one of the simplest.

Atbash is a simple cypher dating from biblical times. It was originally used to encrypt the Hebrew alphabet. Its function is simple. Each letter is replaced the letter found by reversing the alphabet. a is replaced by z, b is replaced by y, and so on, as shown in the table below:

PlaintextABCDEFGHIJKLMNOPQRSTUVWXYZ
CiphertextZYXWVUTSRQPONMLKJIHGFEDCBA

Atbash is one of a class of cyphers called monoalphabetic substitution cyphers, because each letter is always substituted for the same letter, wherever it occurs in the plaintext.

Atbash, as you can work out for yourself, is reversible. By applying the same process to an encrypted word, the plaintext is recovered. Thus if I was to give you the word:

Cyphertext = shibboleth

Plaintexthsryylovgs
\( X \)718172424111421618
\( 25-X \)18781114114197
Cyphertextshibboleth

To study such cyphers mathematically, it can be helpful to express them as a mathematical function. To do so, we represent each letter of the alphabet is a number. a is 0, b is 1, c is 3, etc. For the present cypher, we will call the value of a given letter $X$. We will express the encrypted form as the result of the function $E(X)$.

\[ E(X) = 25-X \]

Thus the encrypted form of a, which is represented by the value 0, is $E(0) = 25 - 0$. 25 is the value that represents z, the last letter in the alphabet.

We could write any number of functions to encrypt text. Once we have decide on a way to convert plaintext symbols into numbers (e.g. by representing a with 0, etc.) we can then perform any computations on these numbers that we like. However, not all computations are going to be equally useful as cyphers. For instance, while the cypher $E(X) = 0$ (replace all letters with 0 or a) is unbreakable, the plaintext is completely unrecoverable!

Atbash

Cyphertext = shibboleth

Plaintexthsryylovgs
\( X \)718172424111421618
\( 25-X \)18781114114197
Cyphertextshibboleth

Caesar Cypher

One problem that we quickly encounter when playing with cyphers is that we only have 26 letters in the alphabet (and any alternative alphabet we might choose is finite). The cypher text must fit within the range of values that can be represented with our alphabet.

The Caesar cypher is a simple shift cypher used by Julius Caesar in his private correspondence. To use it, find the value for each letter in the plain text and add a number, such as 3. This is the cypher's key. In order to prevent the letters at the end of the alphabet falling off the end, we will make the alphabet 'wrap around' through the use of the modulus operation $mod$. Thus the Caesar cypher can be written as the function:

\[ E(X) = (X + k) \mod{m} \]

With a key of $k=3$, the letter z becomes c. The 'wrapping' is achieved by the modulus operation.

Play with the tool below to see how changing the value of $k$ affects the cyphertext. Observe how the values 'wrap around' at the end of the alphabet due to the mod operation.

k =

Plaintextabcdefghijklmnopqrstuvwxyz
\( X \)012345678910111213141516171819202122232425
\( X + 3 \)345678910111213141516171819202122232425262728
\( (X + 3) \mod 26 \)345678910111213141516171819202122232425012
Cyphertextdefghijklmnopqrstuvwxyzabc

To understand the mathematics behind these sorts of cyphers we need to understand the mod operation and modular arithmetic.

Caesar Cipher

Cyphertext = kvuoryjv

k =

Plaintexthsryylovgs
\( X \)718172424111421618
\( X + 3 \)1021202727141724921
\( (X + 3) \mod 26 \)10212011141724921
Cyphertextkvubboryjv

Caesar Cipher Function

\[ E(X) = (X + k) \mod{26} \]

Modulus Basics

We saw that to understand cryptography we need to understand modular arithmetic.

You will be familiar with performing arithmetic on a number line. The numbers run from negative infinity, past 0, to positive infinity. Now imagine we cut the number line so it only runs between 0 and a finite number, say 11. Now bend the number line into a circle. Now what happens if you start with the last number (here, 11), and add 1 more? You wrap around back to 0!

If this is hard to imagine, realise that the same idea applies when performing arithmetic on a 12-hour clock. For example, ten o'clock plus three hours is one o'clock, not thirteen o'clock.

When working under modulus, we only work with integer values from 0 (inclusive) to our modulus (exclusive). That is, we are performing arithmetic on a finite set of integers. For example, when we add numbers together, we do not always get a larger value because the values 'wrap around', as illustrated in the figure below:

+ () = 4

01234567891011
\[ \mathbb{N}_{12} = \{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 \} \]
\[ A + B \pmod{12} \]\[ 2 + 2 \pmod{12} \equiv 4 \]

If we are performing modular arithmetic, we write the modulus we are using at the end in brackets. For example, a simple addition under modulus $5$ might be written:

\[ 2 + 4 \pmod{5} \equiv 1 \]

The symbol $\equiv$ will be explained below.

Modulus Basics

+ () = 4

01234567891011

What is a finite set of integers?

Under modulus $n$ we use the numbers $0 \leq x \leq (n-1)$

\[ \mathbb{Z}_6 = \{\overline{0}, \overline{1}, \overline{2}, \overline{3}, \overline{4}, \overline{5}\} \]

Modulus Operation

You might have noticed there is a relationship between modular arithmetic and the remainder of a division. For example, if I were to perform the integer division of 12 and 4, I would find that the remainder is the same as the value of 12 under the modulus 4:

\[ 12 \div 4 = 3 \text{ remainder } 0 \]
\[ 12 \mod 4 = 0 \]

As I have done here, we can apply modulus as an operation when working in otherwise normal arithmetic. We do this using the $\\mod$ operation.

The modulus operation, written mod (and written in most programming languages as %) takes two arguments $a$ and $b$, and returns the remainder from dividing $a$ by $b$. Note that mathematically, mod returns a positive integer, however most programming languages treat % as the remainder, which can be negative and in some situations a decimal.

To find $x \mod m$ of any x, you can follow the following process:

  1. If \( x < 0 \), repeatedly add $m$ until \( 0 \leq x < m \)
  2. If \( x \geq m \), repeatedly subtract $m$ until \( 0 \leq x < m \)

Remember, your result will always be in the range \( 0 \leq x < m \)

Modulo Operation

The modulo operation finds the remainder of a division

\[ ( 5 + 3 ) \mod{6} = 2 \]

Modulo Operator

Many programming languages we use the symbol % for mod

15 % 6 = 3

Programming languages will return fractional values for %

3.5 % 2 = 1.5

Modulus Algorithm

To calculate the value of $n \mod m$,

  • If \( n > 0 \), subtract $m$
  • If \( n < 0 \), add $m$

Until $0 \leq n \leq (m-1)$

When finding $n \mod m$, remember your answer must be in the range $0 \rightarrow m-1$

Modular Addition

Calculate the following additions under modulus (without using the tool above).

    Modular Addition

    Calculate the following additions under modulus.

      Congruence

      A key property of modular arithmetic is the equivalence of different (non-equal) numbers. You might have noticed the following when performing addition under modulus (if you didn't take a look at the questions above again):

      • Under modulus $m$, the result of adding $x$ to a number is the same as adding $(x+m)$

      In other words, if we start at a number, and count on the value of our modulus, we get back to the place we started! We can count on as many multiples of our modulus as we like, but every whole multiple of our modulus will get us back to the same place. This indicates a fundamental equivalence between different numbers in modular arithmetic.

      Two numbers in our modular arithmetic $a$ and $b$ are equivalent (called congruent) if they differ by a multiple of the modulus (i.e. if $a-b = kx$). Congruence is written using the symbol $\\equiv$. We can express this 'getting back where you started' property with the formula:

      \[ x \equiv x+km \pmod{m} \]

      It may be helpful to visualise this using the circle below. The modulus 12 is the set of natural numbers smaller than 12, including 0: \( \{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 \} \). The central circle below contains all these numbers. Outside are shown the first three different numbers that are congruent to each middle number under this modulus (though in principle this could be continued indefinitely). For example, 12, 24, and 36 are all congruent to 0 modulo 12.

      01224361132537214263831527394162840517294161830427193143820324492133451022344611233547

      This gives us sets of numbers called congruence classes. In the above there are $12 congruence classes, which we might label \( {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11} \).

      The congruence class $0$ is the set of numbers \( \{ 0, 12, 24, 36, 48, 60, 72, \dots \} \)

      The congruence class $1$ is the set of number \( \{ 1, 13, 25, 37, 49, 61, 73, \dots \} \)

      The congruence class $2$ is the set of number \( \{ 2, 14, 26, 38, 50, 62, 74, \dots \} \)

      The congruence class $3$ is the set of number \( \{ 3, 15, 27, 39, 51, 63, 75, \dots \} \)

      ...and so on.

      When we are working under modulus, the different elements of a congruence class are completely equivalent.

      You could also think about modular arithmetic being as arithmetic on these classes, For example, under a modulus of 12, if you take something in the class \( 11 \) (\( \{11, 23, 35, \dots \} \)) and add something from the class $1$ (\( \{1, 13, 25, \dots \} \)) and you will get something from the class $0$ (\( \{0, 12, 24, \dots \} \)).

      Definition of Congruence

      \[ a \equiv b \pmod{n} \]
      \[ (a \mod n) = (b \mod n) \]

      Congruent numbers 'wrap around' to the same value

      Congruence

      If $a \equiv b$, the difference between $a$ and $b$ will be a multiple of $n$.

      \[ a - b = kn \]

      Congruences

      All integers are congruent to a number in our modulus, e.g. for modulus 6

      \( 0 \equiv 6 \equiv 12 \equiv 18 \equiv \dots \)\( 1 \equiv 7 \equiv 13 \equiv 19 \equiv \dots \)\( 2 \equiv 8 \equiv 14 \equiv 20 \equiv \dots \)\( \dots \)\( 5 \equiv 11 \equiv 17 \equiv 23 \equiv \dots \)

      Congruence is Reflexive

      Every number (that exists under the modulus) is congruent to itself, mod n:

      \[ a ≡ a \pmod n \]

      Congruence is Symmetric

      If $a$ is congruent to $b$, then $b$ must be congruent to $a$

      $a ≡ b \pmod n$ if $ b ≡ a \pmod n$ for all $a, b$, and $n$.

      Congruence is Transitive

      If $a$ is congruent to $b$, and $b$ is congruent to $c$, then $a$ must be congruent to $c$

      If \( a \equiv b \pmod{n} \) and \( b \equiv c \pmod{n} \), then \( a \equiv c \pmod{n} \)

      Modular Subtraction

      We have already seen modular addition. Modular subtraction is similar.

      To calculate \( a - b \pmod{m} \) we can first calculate $a-b$ and then use the mod operation. Remember the result will be non-negative.

      (Note: The arrow below is drawn backwards. SVGs are hard to work with.)

      - () = 0

      01234567891011
      \[ A - B \pmod{12} \]\[ 2 - 2 \pmod{12} \equiv 0 \]

      Modular Subtraction

      Modular Subtraction

      - () = 0

      01234567891011

      Additive Inverse

      Subtraction can also be seen as the addition of the additive inverse of a number.

      The additive inverse of a number is the number you need to add to $a$ to get 0. Usually this is straightforward. The additive inverse of 5 is -5 in normal arithmetic because 5 + (-5) = 0.

      However when working under modulus, we do not have negative numbers! Still, because of the 'wrapping' effect, there exists an additive inverse for every number under modulus.

      For all \( w \in \mathbb{Z}_n \) there exists a $z$ such that \( w + z \equiv 0 \pmod{n} \).

      For example, the additive inverse of \( 5 \pmod{7} \) is \( 2 \) because \( 5 + 2 \equiv 0 \pmod 7 \).

      Additive Inverse

      The additive inverse of a number is what you add to get 0

      \[ w + (- w) = 0 \]

      The additive inverse under modulus $n$ is a positive number in \( \mathbb{Z}_n \)

      • Relies on the "wrapping" effect to get to 0
      • Whatever you need to add to get to the modulus

      Additive Inverse

      For each \( w \in \mathbb{z}_n \), there exists a z such that

      \[ w + z \equiv 0 \pmod n \]

      We can rearrange to get

      \[ z \equiv -w \pmod n \quad \text{and} \qquad z = n - w \]

      Modular Subtraction Questions

      Answer the following questions using additive inverses (without using the tool above).

        Modular Subtraction

        Answer the following questions using additive inverses

          Cracking the Caesar Cypher

          A brute force attack is when you try every possible key.

          Imagine we receive a password encrypted using the Caesar cypher that we want to decode. We do not know the key, but we might guess it is an English word.

          vzrugilvk

          As we saw above, the caesar cypher encryption function is

          \[ E(X) = X + k \mod {m} \]

          We can rearrange this formula to identify what the original letter must have been:

          \[ X = E(X) - k \mod {m} \]

          We can express this as a decryption function. This is purely a relabelling of the equation above. By comparing it to the encryption function you can see the subtraction operation is the inverse of the addition operation (and vice versa).

          \[ D(X) = X - k \mod {m} \]

          As we do not know the key, we can try all of the possibilities. As the alphabet contains 26 letters, our modulus is 26, thus there are only 26 possible keys (all other keys are congruent with one of these 26).

          keypossible plaintextkeypossible plaintext
          0vzrugilvk13imehtvyix
          1uyqtfhkuj14hldgsuxhw
          2txpsegjti15gkcfrtwgv
          3swordfish16fjbeqsvfu
          4rvnqcehrg17eiadpruet
          5qumpbdgqf18dhzcoqtds
          6ptloacfpe19cgybnpscr
          7osknzbeod20bfxamorbq
          8nrjmyadnc21aewzlnqap
          9mqilxzcmb22zdvykmpzo
          10lphkwybla23ycuxjloyn
          11kogjvxakz24xbtwiknxm
          12jnfiuwzjy25wasvhjmwl

          We find that the plaintext is an English word when $k=3$, so this is likely the original password.

          CypherTextvzrugilvk
          $X$2125172068112110
          $(X - 3) \mod 26$18221417358187
          Plaintextswordfish

          A known plaintext attack is when you know (or guess) part of the original message (the plaintext).

          We receive a message encrypted using the Caesar cypher that we want to decode.

          l oryh brx

          We might guess that this is an English sentence. We might further guess that the first word l is an English word, such as "I" or "a". We will try "I" first.

          Guessing that i was encoded as l, and knowing the caesar cypher encryption function is

          \[ E(X) = X + k \mod {m} \]

          We can substitute values of $X$ and $E(X)$ to solve for $k$. We know that i = 8, which is what we assume $X$ to be, and l = 11, which is what we assume $E(X)$ to be.

          \[ 11 = 8 + k \mod {m} \]

          We can rearrange to find $k$:

          \[ k = 11 - 8 \mod {m} \]

          Thus, if our guess that l is i is correct, the key would be $ k = 3$.

          By decrypting using this key, we find what looks like the original message: i love you

          CypherTextloryhbrx
          $X$11141724711723
          $(X - 3) \mod 26$81114214241420
          Plaintextiloveyou

          Decoding Caesar

          \[ E(X) = (X + k) \mod{26} \]
          \[ D(X) = (X - k) \mod{26} \]

          Affine Cypher

          Let us see a cypher that is a little more interesting than the Caesar cypher. The table below illustrates the Affine Cypher. This is another monoalphabetic substitution cypher, which is defined by the encryption function:

          \[ E(X) = aX + b \]

          This cypher has two keys, $a$ and $b$.

          Plaintextabcdefghijklmnopqrstuvwxyz
          \( X \)012345678910111213141516171819202122232425
          \( 3X + 3 \)3691215182124273033363942454851545760636669727578
          \( (3X + 3) \mod 26 \)369121518212414710131619222525811141720230
          Cyphertextdgjmpsvybehknqtwzcfiloruxa

          To encode this cypher we need to perform multiplication, which is straightforward, as you will see. However, decoding the cypher is a little more challenging. The inverse of multiplication is division, and modular division is rather counter-intuitive. First we will understand multiplication and then we will consider the inverse of multiplication, division.

          Affine Cypher

          Affine Cipher

          The Affine Cipher is defined by the function

          \[ E(X) = (aX + b) \mod{26} \]

          Cyphertext = yfcxxktovf

          Plaintexthsryylovgs
          \( X \)718172424111421618
          \( 3X + 3 \)24575475753645662157
          \( (3X + 3) \mod 26 \)24522323101914215
          Cyphertextyfcxxktovf

          Modular Multiplication

          Multiplication is repeated addition, so other than the wrapping property we have already described, multiplication under modulus works as normal.

          We can calculate \( a \times b \pmod{m} \) by first calculating $a \times b$, and then using the modulus operation.

          x () = 4

          01234567891011

          Modular Multiplication

          x () = 4

          01234567891011

          Multiplicative Inverse

          To decode the Affine cypher we will need to perform the inverse of multiplication, which is division.

          The multiplicative inverse of a number $a$ is the number you need to multiply by $a$ to get 1. This can be notated $a^-1$.

          \[ a \times a^{-1} = 1 \]

          Like additive inverse, this is straightforward for arithmetic on the real numbers (i.e. where we can have fractional answers). For instance, the multiplicative inverse of $ 5 $ is \( \frac{1}{5} \), because \( 5 \times \frac{1}{5} = 1 \).

          However, when we are working under modulus $m$, we can only use those whole numbers that are permitted under our modulus (i.e. between 0 and $m-1$). This should be clear if you think about the Affine cypher. If we tried to decrypt text encoded with the affine cypher, and the inverse of multiplication gave us a decimal value, it would not correspond to a letter in our alphabet!

          Recall that the multiplicative inverse is the number that you need to multiply by to get to 1. Under modulus, this number must be a whole number. It happens that this may in fact be larger than dividend because the wrapping effect can still lead to a result of 1.

          For example, the multiplicative inverse of \( 5 \pmod7 \) is $3$, because $5 \times 3 \equiv 1 \pmod{7}$. This is defined because 5 and 7 are co-prime. Their greatest common denominator is 1.

          \[ 5^{-1} = 3 \]

          Note that, under a modulus of $n$, the multiplicative inverse of a number $w$ is only defined if $n$ and $w$ are co-prime (i.e. the greatest common divisor of $w$ and $n$ is $1$). Otherwise it is undefined. We will come back to primeness and co-primeness of numbers later in the lecture.

          In summary, for all \( w \in \mathbb{Z}_n \), where $w$ and $n$ are co-prime, the multiplicative inverse $a$ is the value such that:

          \[ a \times w \equiv 1 \pmod{n} \]

          Multiplicative Inverse

          The multiplicative inverse of a number \( a \) is notated \( a^{-1} \)

          \[ a \times a^{-1} = 1 \]

          The multiplicative inverse under modulus cannot be \( \frac{1}{a} \) as we do not have fractional values

          Modular Multiplicative Inverse

          Sometimes there is a number that we can multiply by to get 1, e.g.

          \[ 7 \times 3 ≡ 1 \pmod{20} \]

          For each \( w \in \mathbb{Z}_n \),

          \[ aw \equiv 1 \pmod{n} \]

          The multiplicative inverse $a$ is defined only if $gcd(w, n) = 1$

          Modular Division

          Now we have worked out the modular multiplicative inverse, we can perform modular division, which will let us decrypt the affine cypher.

          Observe that division by a number $w$ is the same as multiplication by the multiplicative inverse of $w$. For instance:

          \[ a \div w = \frac{a}{w} \]
          \[ = a \times \frac{1}{w} = a \times w^{-1} \]

          Thus to divide by a number $w$ under modulus $n$, we multiply by the multiplicative inverse of $w$, if it is defined. If the multiplicative inverse of $w$ is not defined (because $w$ and $n$ are not co-prime) then the division is not possible (its result is undefined).

          \[ a \div w \pmod{n} = a \times w^{-1} \]

          When we want to perform the inverse of multiplication, which is division, we can instead multiply by the multiplicative inverse. We can apply this to decrypting the Affine cypher.

          Division

          Division is multiplication by the inverse, e.g.

          \[ 4 \div 2 = 4 \times 2^{-1} \]
          \[ 10 \div 3 = 10 \times 3^{-1} \]

          Question

          What is \( 5 \div 3 \pmod{11} \)?

          Modular Division

          What is \( 5 \div 3 \pmod{11} \)?

          Finding inverse

          The inverse of \( 3 \pmod{11} \) is 4, because

          \[ 3 \times 4 \equiv 1 \pmod{11} \]

          Performing division

          Thus

          \[ 5 \div 3 \equiv 5 \times 4 ≡ 9 \pmod{11} \]

          Test for Co-primeness

          We must check that divisor ($w$) is co-prime with the modulus ($n$)

          \[ a \div w \pmod n \]

          \( w^{-1} \pmod{n} \) is only defined when \( gcd(w, n) = 1 \)

          Affine Cypher Decryption

          Imagine we receive a password encrypted using the Affine cypher that we want to decode.

          crdwjb

          The affine cypher encryption function is

          \[ E(X) = (aX + b) \mod m \]

          We can rearrange this formula to identify what the original letter must have been.

          \[ X = a^{-1}(E(X) - b) \mod m \]

          We can express this as a decryption function. As you can see, the inverse of the multiplication operation is the multiplicative inverse.

          \[ D(X) = a^{-1}(X - b) \mod m \]

          Assuming we know the keys are $a=3$ and $b=1$, we can work out that $a^1 = 9$ (because $3 \\times 9 \\equiv 1 \\pmod26$) and evaluate the decryption function for each letter of the cyphertext.

          Cyphertextcrdwjb
          $X$21732291
          $3^-1(X - 1) \mod 26$914187200
          Plaintextjoshua

          Note that the key $a$ must be co-prime with the modulus $m$ for the decryption function to work. Thus it is essential when picking keys for the affine cypher to be able to check whether two numbers are co-prime.

          Decoding Affine

          \[ D(X) = a^{-1}(X - b) \pmod{26} \]

          The Affine cypher has two keys, $a$ and $b$

          To calculate the modular inverse (\( a^{-1} \)), the value of $a$ must be co-prime with the modulus 26

          Prime Numbers

          As we have seen in calculating the multiplicative inverse, primeness and co-primeness are particularly important in modular arithmetic. Prime numbers are incredibly important for cryptography, so it is worth taking some time to understand them.

          Number theory is a branch of mathematics that is interested in the patterns that we can observe in the numbers. Number theory most often deal with just the natural numbers. Considering the natural numbers, we might ask: are all numbers essentially the same, or do they have some interesting structure?

          One way in which the natural numbers have structure is in their divisibility. Some numbers can be divided by other numbers and give a result that is itself a natural number. For instance, $6 \div 3 = 2$. Thus we might say that $2$ divides $6$ (written $2 \mid 6$). Note that not all numbers fit this relationship. For instance, $4$ does not divide $6$ and give a result in the natural numbers. On the other hand, $1$ divides every number. When considering divisibility, certain patterns start to emerge.

          A prime number, or a prime, is a number that is only divisible by itself and $1$. For instance, $2$, $3$, $5$, and $7$ are prime numbers. There is no natural number $x$ such that $x \mid 7$, or any other prime for that matter. A prime number has no factors (other than 1 and itself), thus it is not possible to divide a prime number by a whole number to get a whole-numbered result. It's important to note that 1 is not considered to be a prime number.

          Prime Numbers

          Prime Numbers

          A prime number is a whole number greater than 1 whose only factors are 1 and itself.

          Large prime numbers are essential to many important cryptographic algorithms

          • Current largest is \( 2^{82,589,933}-1 \), which is 24,862,048 digits long
          Graph of the number of digits in the largest known prime for the years 1950-2020 showing an increase over time

          Composite Numbers and Factor Trees

          All other numbers are composite numbers, because they can be expressed as the product of some set of other numbers. Composite numbers can be visualised as a prime factor tree.


          The number 56 can be expressed as 2 x 2 x 2 x 7, all prime numbers.

          To draw a prime factor tree, check whether the value can be divided by any prime numbers, starting with 2 and working upwards. If you find that it can be divided by a prime, make this the first branch of your factor tree. One branch is the prime factor, the other is the result of dividing by this prime factor. Repeat the process until your final result is prime.

          Composite Numbers

          Numbers that have more than two factors are composite numbers

          \[ 12 = 2 \times 6 \]
          \[ 6 = 2 \times 3 \]

          Every number has a unique set of prime factors

          = 2 x 2 x 2 x 7

          Sieve of Erathosthenes

          The Sieve of Erathosthenes dates from the 3rd Century BC and is a method for finding primes.

          1. Write down some integers starting at 2.
          2. Start from the leftmost number, which is 2. Strike out all other numbers that are divisible by this number (i.e. 2)
          3. Go to the next number, skipping any that have been removed. Repeat the above step, striking out all other numbers that are divisible by it. Continue to apply rule for each number in turn.
          4. All numbers that survive to the end are primes.
          234567891011121314151617181920

          (Note: this is still a bit buggy, I hope it gives the idea)

          The full set of primes up to 20 is \( \{2, 3, 5, 7, 11, 13\} \)

          Euclidean Algorithm

          When using the Affine cypher, it is necessary to know whether two integers are co-prime. If you pick a key for the Affine cypher that is not co-prime with the modulus, the decryption function will not work. Two numbers are co-prime when their greatest common divisor is $1$.

          The greatest common divisor (GCD) of two integers is the largest integer that divides both numbers.

          The Euclidean Algorithm is a way of finding the GCD of two numbers $a$ and $b$. This algorithm is given as pseudocode below.

          1 2 3 4 5 6 7 function Euclid(a: integer, b: integer): integer { if (b == 0) return a; else return Euclid(b, a mod b); }
          abb = 0a mod b
          false4
          144false2
          42false0
          20true
          GCD2

          Greatest Common Divisor

          Greatest Common Divisor (GCD) is the largest integer that divides a pair of integers, e.g.

          \[ gcd(12, 18) = 6 \]

          If \( gcd(A, B) = 1 \) then $A$ and $B$ are co-prime

          GCD can be found using the Euclidean Algorithm

          Euclidean Algorithm

          1. Take two numbers, $a$ and $b$
          2. Find $b \mod a$
          3. Repeat with
            • $a =: b$
            • $b =: a \mod b$
          4. Until $b = 0$
          5. The value of $a$ is the GCD
          abb = 0a mod b
          false6
          156false3
          63false0
          30true
          GCD3

          Euclidean Algorithm

          Pseudocode

          1 2 3 4 5 6 7 function Euclid(a: int, b: int): int { if (b == 0) return a; else return Euclid(b, a mod b); }
          abb = 0a mod b
          false6
          156false3
          63false0
          30true
          GCD3

          Modern Cryptography

          Classical cryptography relies on manual encryption methods, or - as in the case of the Enigma machine - mechanical devices. Modern cryptography leverages the power of computers and sophisticated mathematical algorithms to provide robust security. However, while this increases the security of modern cryptography, modern cryptography must also be resistant to attacks from adversaries with significant computational resources.

          Modern Cryptography

          One-Time Pads

          The concept behind one-time pads dates back to the early 20th century. The one-time pad is a form of encryption that provides theoretically perfect security when a perfectly random key is used. However, its strength and its weakness is the size of its key, which must be at least the same length as the message.

          Say we had a message such as

          the quick brown fox jumps over the lazy dog

          To avoid revealing the lengths of words, it is common practice to write the message in 5-character blocks. Of course, we can also encode spaces as just another character. For how, however, we will continue to only think about encrypting letters of the alphabet

          thequ ickbr ownfo xjump sover thela zydog

          Now let us use a key that is the same length as the cypher. Here I have picked the first line of the William Gibson novel Neuromancer.

          thesk yabov ethep ortwa sthec oloro ftele

          The process of using a one-time pad is simple. As before, we replace each letter with a number, which we do to the plaintext and the key. $a=0$, $b=1$, etc. We then apply a simple function. To encrypt the $i$th letter of the plaintext, $P_i$, we simply add the $i$th letter of the key, $K_i$, and take the modulus.

          \[ E(P_i,K_i) = (P_i + K_i) \mod 26 \]

          Plaintextthequ ickbr ownfo xjump sover thela zydog
          197416208210117142213514239201215181421417197411025243146
          Keythesk yabov ethep ortwa sthec oloro ftele
          19741810240114214197415141719220181974214111417145194114
          Cyphertextmoiie gclpm spujd lanip khcit hssco erhzk
          121488462111512181520931101381510728197181821441772510

          To perform this on bits instead of letters, the XOR operation can be used. Any operation could be used so long as you can perform the inverse to recover the plaintext if you know the key. As we saw with the Caesar cypher, the inverse of addition is subtraction.

          Note that for a one-time pad to be secure it is essential that the key is random (unlike here, where an English sentence was used). Also, the key must not used more than once. If either of these requirements are broken, an attacker could potentially break the encryption.

          One Time Pad

          A One Time Pad provides perfect encryption (in theory) when a random key of the same length as the plaintext is used

          Example

          Let $P_i$ be the numerical value of the $i$th letter in the plaintext

          Let $K_i$ be the $i$th value of the key

          \[ E(P_i,K_i) = (P_i + K_i) \mod 26 \]

          Plaintextthequ ickbr ownfo xjump sover thela zydog
          197416208210117142213514239201215181421417197411025243146
          Keythesk yabov ethep ortwa sthec oloro ftele
          19741810240114214197415141719220181974214111417145194114
          Cyphertextmoiie gclpm spujd lanip khcit hssco erhzk
          121488462111512181520931101381510728197181821441772510

          Stream Cypher

          It is inconvenient to have to use a key the same length as the message. Both parties to the encryption - the sender and the receiver - need to have the same key. Thus if you want to send a gigabyte of data, you must already have shared a gigabyte-long key. This key must have been securely communicated. This is rarely practical.

          Imagine you want to encrypt a mobile phone conversation. The amount of data encrypted could be very large, however you do not want to have to store gigabytes of key on the phone storage. For this purpose, a related approach called a stream cypher is used.

          By generating the key using a pseudorandom number generator (PNRG), a smaller key can be used. This key is used to generate a pseudorandom keystream. This keystream is combined with the plaintext message in the same way as a one time pad. PRNGs produce deterministic output based on the seed, ensuring that the same seed will always produce the same sequence of random numbers. The steps are as follows:

          1. Seed a PNRG with the key
          2. Generate the first number
          3. Add this number to the first letter (byte, etc.) of the plaintext
          4. Generate the second number and repeat

          What has been achieved here is to compress the key that we used in our one time pad. For example, instead of needing a gigabyte-long key to encrypt a gigabyte of data, we might use a 128-bit number to seed our pseudo random number generator. However, the very fact that the key can be compressed means that it cannot be truly random, and thus cannot be perfectly secure. Secure random number generation is an essential feature of modern cryptography.

          Stream Cypher

          Problem

          Key size is very large for one time pad

          Solution

          Use a pseudorandom number generator (PRNG) to generate a keystream

          The value of $K_i$ is the $i$th generated value

          The PRNG must be deterministic

          Random Number Generation

          Secure random number generation is an essential component of cryptographic systems. Randomness is crucial to prevent adversaries from predicting or guessing cryptographic secrets, as any pattern or predictability could be exploited to break the encryption.

          To achieve secure random number generation, cryptographic algorithms rely on various sources of entropy, such as physical processes or unpredictable events, to create an initial seed. This seed is then processed through a cryptographic algorithm known as a pseudo-random number generator (PRNG) to generate a stream of random numbers. However, to achieve true security, PRNGs must be periodically reseeded with fresh entropy to maintain randomness.

          Random number generation

          Randomness is critical in the strength of encryption

          • Poor randomness introduces vulnerabilities to statistical analysis, for example frequency analysis

          Approach

          Use a cryptographically secure PRNG algorithm

          Periodically re-seed with sources of entropy

          Key Exchange

          All of the schemes we have seen so far rely on two parties being able to share a key. The key might be long (as with a one-time pad), or short (as with a stream cypher or Caesar cypher). Unfortunately, it is rarely possible to agree this key in person, safe from potential eavesdroppers. Usually the key needs to be agreed over an insecure communications channel. However, what is to stop an attacker listening for your key?

          The problem is as follows:

          * How do you communicate over an insecure channel with someone with whom you've never met, when you haven't already exchanged keys?

          There are secure key exchange protocols, such as Diffie Hellman Key Exchange, which works using modular arithmetic.

          Key Exchange

          Conundrum

          If you meet a person in advance of encoding messages you can

          • Agree on a secure key
          • Or establish a One Time Pad, etc.

          How do you communicate over an insecure channel with someone with whom you've never met, when you haven't already exchanged keys?

          Diffie Hellman Key Exchange

          Now that you understand modular arithmetic, the mathematics behind Diffie Hellman key exchange are relatively simple.

          Alice and Bob publicly agree on two numbers, $p$ (which must be a large prime number to be secure, usually at least 2048 bits) and $g$. They each secretly choose a number, which is called their private key. Alice picks a number $a$ and Bob picks a number $b$.




          Public informationvalueSecret informationvalue
          modulus ($p$)19Alice's private key ($a$)4
          base ($g$)4Bob's private key ($b$)4

          From this, Alice and Bob each generate a new number called a public key. They do this by finding the modular exponent of the agreed base $b$ with their own private key. The modulus used is $p$, a large prime number they have publicly agreed upon. They share their public keys with each other, making them public and visible to potential eavesdroppers.

          \[ A = g^a \mod p \]\[ A = 4^{4} \mod 19 = 9 \]\[ B = g^b \mod p \]\[ B = 4^{4} \mod 19 = 9 \]
          Public informationvalue
          Alice's public key ($A$)9
          Bob's public key ($B$)9

          Alice calculates the shared secret using her private key.

          \[ s = B^a \mod p \]\[ s = 9^{4} \mod 19 = 6 \]

          Bob calculates the shared secret using his private key.

          \[ s = A^b \mod p \]\[ s = 9^{4} \mod 19 = 6 \]

          Now Alice and Bob have agreed on the same number $s$ without this number ever being made public. This can be used as the key for another encryption algorithm, such as a stream cypher.

          Modular Exponentiation

          Modular exponentiation is a key operation used in Diffie Hellman Key Exchange. This is because it is computationally quick to calculate the exponent of a number but hard to reverse this process by calculating the logarithm.

          Exponentiation under modulus can be calculated as normal, i.e. to perform $a^b \\mod p$ we can first calculate $a^b$, and then find the modulus under $p$.

          However, if $a$ and $b$ are very big, as is often case in public-key cryptography, this is computationally inefficient as the numbers involved in the calculation are enormous.

          Diffie Hellman Key Exchange

          • Method of exchanging keys over an insecure channel
          • Commonly used
            • TLS
            • IPsec
            • SSH
            • PHP
          • Works using modular arithmetic

          Diffie Hellman Key Exchange

          Alice and Bob agree upon

          • modulus ($p$) (prime usually at least 2048 bits)
          • base ($g$)

          Alice and Bob choose private keys

          • Alice ($a$)
          • Bob ($b$)

          4153687576287365984259382475698437658276348 7912837582736592873684273684728938572983759 2834759348759384759284759287395872495872987 3958729835792875982795837529876348273685729 8435793487958279385792873954877239759283759 2478593867045986792384737826735267354762356 8734869386945673456827659498063849024875809 6039479027945982730187439759284620950293759 2870495029380589209839458720948602984912837 5029480193710924801935810379958109375019385 0791395710937597019385089103951073058710393 7019347019380918039840918049810938019850139 8401983509183501983091079180395810395190395 1809358109385019840193580193840198340918093851098309180019

          ▲ what 2048 bits looks like as a decimal!

          Diffie Hellman Key Exchange

          Alice generates a public key and sends to Bob

          \[ A = g^a \mod p \]

          Bob generates a public key and sends to Alice

          \[ B = g^b \mod p \]

          Public

          • modulus ($p$) (large prime)
          • base ($g$)

          Private

          • Alice's private key ($a$)
          • Bob's private key ($b$)

          Diffie Hellman Key Exchange

          Alice calculates the shared secret

          \[ s = B^a \mod p \]

          Bob calculates the shared secret

          \[ s = A^b \mod p \]

          Public

          • modulus ($p$) (large prime)
          • base ($g$)
          • Alice's public key ($A$)
          • Bob's public key ($B$)

          Private

          • Alice's private key ($a$)
          • Bob's private key ($b$)

          Diffie Hellman Key Exchange

          Alice and Bob now both know $s$ without it ever being transmitted

          $s$ can now be used as the key for a symmetrical encryption algorithm

          Public

          • modulus ($p$) (large prime)
          • base ($g$)
          • Alice's public key ($A$)
          • Bob's public key ($B$)

          Private

          • Alice's private key ($a$)
          • Bob's private key ($b$)
          • Shared secret ($s$)

          Summary

          Further Reading

          Algorithms for finding prime numbers: https://www.baeldung.com/cs/prime-number-algorithms

          The Neal Stephenson novel Cryptonomicon. Stephenson's novel follows two parallel groups of codebreakers, one in World War II, another in the present day. Cryptonomicon explores cryptography, mathematics, and computer science. It explores themes of cryptography, data security, and the development of a decentralized digital currency.

          Summary

          Classical Cryptography

          • Atbash
          • Caesar Cypher
          • Affine Cypher

          Modular Arithmetic

          • Modulus
          • Arithmetic
          • Multiplicative Inverse

          Prime Numbers

          • Prime and composite numbers
          • Factor trees
          • Euclidean Algorithm

          Modern Cryptography

          • One Time Pads
          • Stream Cyphers
          • Key Exchange
          LecturePracticalAssessment