Friday, December 6, 2013

Encryption 101: Back to basics

Jack Hammond
Junior Developer
Egress Software Technologies Ltd.
Having introduced the concept of encryption in my last blog post through the Playfair cipher, I’m now going to look at a few of the oldest known ciphers, which demonstrate the fundamentals of encryption.

At the most basic level, encryption is primarily about providing two properties:
  • Confusion – The relationship between the plaintext (input) and the ciphertext (output) should be as difficult as possible to figure out, thus making the key difficult to crack
  • Diffusion – Any changes to the plaintext, even just a single letter, should produce wide, sweeping changes to the resulting ciphertext

A brief history

Encryption has been used for thousands of years – for example, there are reported cases dating back to 2,500BC of hieroglyphs being altered in Ancient Egypt to conceal information. In 500BC, meanwhile, Rabbis hid information using the Atbash cipher – a very simple cipher that just reverses the alphabet, so ‘A’ becomes ‘Z’, ‘B’ becomes ‘Y’, and so on.

Plaintext: ‘This message has been encrypted using the Atbash cipher’

Ciphertext: ‘Gsrh nvhhztv szh yvvm vmxibkgvw fhrmt gsv Zgyzhs xrksvi’
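For anyone who would like to try this at a keyboard, here is a minimal Python sketch of the Atbash substitution described above. The function name atbash is my own choice for illustration, and the sketch assumes only the 26 unaccented English letters need handling; everything else is passed through untouched.

```python
import string


def atbash(text: str) -> str:
    """Mirror each letter across the alphabet (A<->Z, B<->Y, ...), keeping its case."""
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    # Map every letter to the one at the same position in the reversed alphabet.
    table = str.maketrans(upper + lower, upper[::-1] + lower[::-1])
    return text.translate(table)


plaintext = "This message has been encrypted using the Atbash cipher"
ciphertext = atbash(plaintext)
print(ciphertext)          # Gsrh nvhhztv szh yvvm vmxibkgvw fhrmt gsv Zgyzhs xrksvi
print(atbash(ciphertext))  # running it again gives back the plaintext
```

Because the alphabet is simply mirrored, the same function both encrypts and decrypts.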

While these ciphers wouldn’t take long to crack with today’s technology, they demonstrate encryption in a very basic form, providing a solid foundation for a better understanding of the subject.

The cipher of a Roman leader

Julius Caesar is perhaps the most famous Roman leader of all. What people may not know, however, is that he has a cipher named after him: the Caesar cipher.

Like the Atbash cipher, the Caesar cipher has a very simple method of encryption. However, the Atbash cipher only allows the use of a single ‘key’ (‘A’ will always encrypt to ‘Z’, ‘B’ will always encrypt to ‘Y’, etc.), so anyone who knows a document has been encrypted using the cipher can decrypt it fairly trivially. The Caesar cipher, by contrast, provides 25 different ‘keys’. How? Well, it’s what’s known as a substitution cipher or, more specifically, a simple substitution cipher.

The Caesar cipher uses the alphabet and a numerical offset to encrypt data, hence the 25 possible ‘keys’ (you can’t have 26: if you moved a letter 26 places, it would end up back in its starting position, leaving the message unencrypted).

With this in mind, let’s take a look at a simple example that will use the key of ‘3’ to encrypt the letter ‘E’ to its corresponding ciphertext letter.

Letter shifting

It is generally accepted that encryption is done by shifting all the letters to the right (positive shift) and decryption is done by shifting all the letters to the left (negative shift), with ‘Z’ looping back round to ‘A’. Of course, there is no hard and fast rule, and you’re free to encrypt and decrypt in any way you like!

Plaintext:  A B C D E F G H I
Ciphertext: D E F G H I J K L

Looking at the example above, we can see that if the letter ‘E’ is encrypted by the Caesar cipher with a shift of 3, then it will encrypt to the letter ‘H’. To decrypt this, we simply use a shift of -3.
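The shift-and-wrap idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name caesar is hypothetical, only the letters A to Z (in either case) are shifted, and spaces and punctuation are left alone.

```python
def caesar(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping from Z back round to A."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # Work with positions 0-25 so the modulo handles the wrap-around.
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)


print(caesar("E", 3))   # H
print(caesar("H", -3))  # E
print(caesar("Z", 1))   # A
```

A positive shift encrypts and the matching negative shift decrypts, so one function covers both directions.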

Some more examples

So, now that we know how the Caesar cipher works, let’s look at encrypting a whole sentence:

Plaintext message: I CAME, I SAW, I CONQUERED.

Plaintext alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ

Example 1

Shift: 1

Ciphertext alphabet: BCDEFGHIJKLMNOPQRSTUVWXYZA

Ciphertext message: J DBNF, J TBX, J DPORVFSFE.

Example 2

Shift: 10

Ciphertext alphabet: KLMNOPQRSTUVWXYZABCDEFGHIJ

Ciphertext message: S MKWO, S CKG, S MYXAEOBON.
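Both examples can be reproduced by building the ciphertext alphabet first and then swapping letters one for one. The sketch below is one possible way to do it: the names cipher_alphabet and encrypt are my own, and the plaintext used is the sentence that the two ciphertext messages above decrypt to.

```python
import string

PLAIN = string.ascii_uppercase  # ABCDEFGHIJKLMNOPQRSTUVWXYZ


def cipher_alphabet(shift: int) -> str:
    """Rotate the plaintext alphabet left by `shift` letters."""
    shift %= 26
    return PLAIN[shift:] + PLAIN[:shift]


def encrypt(message: str, shift: int) -> str:
    """Replace each letter with the one in the same position of the cipher alphabet."""
    table = str.maketrans(PLAIN, cipher_alphabet(shift))
    # Upper-case first so the output matches the style of the examples.
    return message.upper().translate(table)


for shift in (1, 10):
    print(shift, cipher_alphabet(shift), encrypt("I came, I saw, I conquered.", shift))
```

Running this prints the same ciphertext alphabets and ciphertext messages as Example 1 and Example 2.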

Weaknesses in the cipher

Look at the two examples above. Do you notice anything about them? Specifically, the letter ‘I’ is quite revealing.

As you’ve probably noticed, the letter ‘I’ is always encrypted to the same letter: on the 1 shift, ‘I’ always becomes ‘J’; on the 10 shift, ‘I’ always becomes ‘S’. This repetition means the Caesar cipher is vulnerable to a cryptanalysis method known as letter frequency analysis – working out what a letter could be by how often it appears.

Every sentence tells a story

Within the English language, certain letters appear in sentences more often than others do. Using this knowledge, we can ‘brute force’ a Caesar cipher and get the original message, even if we don’t know the key – although this involves some patience, of course!




The above graph shows the usual distribution of letters in English, otherwise known as letter frequency. Comparing it with the frequency of letters in a piece of ciphertext gives clues to the key used to encrypt it. We can then try that key to see what the ciphertext decrypts to, adjusting the number of letters shifted up or down accordingly, until we find a solution that reads like a normal English sentence.
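Here is a rough Python sketch of that first guess. It makes one big assumption, labelled in the code: that the most common letter in the ciphertext stands for ‘E’, the most common letter in English. The function names are my own, and the ciphertext is the one from the third exercise at the end of this post. On short messages the guess can easily be wrong, so treat it as a starting point rather than a guaranteed crack.

```python
from collections import Counter


def shift_letters(text: str, shift: int) -> str:
    """Caesar-shift the letters A-Z and a-z in `text`; leave everything else alone."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def guess_shift(ciphertext: str, assumed_plain: str = "e") -> int:
    """Guess the key, ASSUMING the most common ciphertext letter is `assumed_plain`."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    if not letters:
        raise ValueError("ciphertext contains no letters to count")
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord(assumed_plain)) % 26


ciphertext = "Cz sio gomn vlyue nby fuq, xi cn ni mycty jiqyl: ch uff inbyl wumym ivmylpy cn."
key = guess_shift(ciphertext)
print(key)                              # 20
print(shift_letters(ciphertext, -key))  # the decrypted sentence
```

If the result is gibberish, the next thing to try is passing a different assumed_plain, such as ‘t’ or ‘a’, which are the next most common letters in English.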

These early examples show just how far back in history encryption has its roots, and while they’re clearly very simple to crack by today’s standards, they demonstrate the principles of encryption well. Moreover, with every new algorithm invented, complexity and difficulty have increased dramatically – up to the point where ciphers are now being designed that use the fundamental laws of physics to provide a near-perfect form of encryption… In theory at least!

Your turn to crack the code (try these at your desk!)

  1. Using a key of ‘7’, encrypt the following phrase:
    • Rome was not built in a day
    • Answer: Yvtl dhz uva ibpsa pu h khf
  2. Using a key of ‘15’, encrypt the following phrase:
    • Experience is the teacher of all things
    • Answer: Tmetgxtcrt xh iwt itprwtg du paa iwxcvh
  3. The following phrase has been encrypted using a key of ‘20’. Decrypt it:
    • Cz sio gomn vlyue nby fuq, xi cn ni mycty jiqyl: ch uff inbyl wumym ivmylpy cn.
    • Answer: If you must break the law, do it to seize power: in all other cases observe it.

Further Reading

If this post has piqued your interest in cryptography, it may be worth looking at some of the topics below:
  • Avalanche Principle – An advancement on the fundamental Diffusion property of encryption
  • ROT-13 – A special case of the Caesar cipher (a shift of 13) that acts as its own mathematical inverse. The idea of a mathematical inverse will become more prominent with modern-day ciphers such as RSA
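To see the self-inverse property of ROT-13 in action, here is a short Python sketch. It relies on the "rot_13" text transform that ships with Python’s standard codecs module; the sample message is just this post’s title.

```python
import codecs

message = "Encryption 101: Back to basics"

# Shifting by 13 twice moves every letter 26 places, i.e. back to where it started.
once = codecs.encode(message, "rot_13")
twice = codecs.encode(once, "rot_13")

print(once)   # Rapelcgvba 101: Onpx gb onfvpf
print(twice)  # Encryption 101: Back to basics
```

No separate decryption step is needed: applying the same operation a second time undoes the first.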