The Story of Encryption

10 min readDec 7, 2019

Recently I read a book titled “The Code Book” by Simon Singh. I am so fascinated by the description of the events (leading to where we stand today) that I am motivated to pen down the key learning. It will serve two purposes, first, a quick future reference for me and other readers, second, it will help me in internalizing the concepts better. The book is a lucid description of the tussle between the code-makers (aiming to communicate securely) and the breakers (snooping into everything) — proving that story of encryption is not new, but as old as the civilization itself. It has evolved constantly to the needs of time and was driven by various factors, both economic and political.

Early Days

It all started with tactics which are as simple as hiding information in wooden tablets covered by a layer of wax to others using transposition and/or substitution. In transposition, each letter of a word retains its identity but changes its position, whereas, in substitution, each letter changes its identity but retain its position. In other words, while transposing a word for secrecy, you simply rearrange the letters based on a predetermined understanding between the sender and the receiver (“cow” can become “owc”), and in the case of substitution, you replace the letters of the word by a different letter (“cow” can become “ihp”, “c”, “o”, “w” is replaced by “i”, “h” and “p” respectively). The method used to encrypt the message is called the “algorithm”, which in turn uses the “key” to create the cipher text. Hence, for the recipient to be able to decrypt the message, he must be aware of both — the “algorithm” and the “key”.

Teething Troubles

While using “transposition” the simplicity of the “key” used is very important. For example, if “caesar shift key” is used (shifting the letter by three places), then it will be possible to generate only 25 distinct ciphers (the English language has 26 alphabets) — making it easy to break, but allowing all permutations for transposition will make the “key” complex — making is difficult to remember/handle. An easy solution to this problem is to use substitution algorithm to encrypt the message. Hence, the “key” used just overlaps on top of the 26 English alphabets, mapping each letter to the “key”. Then the letters of the “plain text” are just replaced by the letters mapped to the “key” to generate the “cipher text”. But, the problem with this method is that the algorithm is susceptible to “frequency attack” — Each letter of the English alphabet repeats itself at a predictable rate in a sufficiently large text. This fact can be used to easily identify the original letters in the “cipher text”. However, “frequency attack” does not work for shorter text as they are too less to provide an opportunity for measuring the frequency of repeating letters. To make “frequency attack” difficult, various techniques were used, i.e mixing letters with “Nulls”, “Numbers” and/or using “Codes” which maps specific “letters” to “words” to confuse the attackers. However, these techniques are not very useful as it makes the “key” very complex and difficult to handle.

Improved Algorithm

The shortcomings of the simple monoalphabetic substitution cipher of resisting “frequency attack” led to the development of “Vigenere cipher” which used not one, but 26 distinct cipher alphabets to encrypt a message. The idea is to map the 26 plain alphabets of the English language with a row of the same set of 26 alphabets each shifted by a one. This leads to a 26x26 matrix of rows which has a one to one mapping with the “plain text”. In other words, the letter “a” can have 26 different substitutions depending upon which row is being used for encrypting the “plain text” at any point time. But how do we identify a particular row to be used out of the 26 rows? This is decided based on the chosen “key” which is a secret between the sender and the recipient. This “key” is overlapped on the message in a repetitive manner, and the row starting with the individual letter of the keyword is used for mapping the message to create the cipher text. The recipient has to reverse the process in order to extract the message. In absence of one on one mapping between letters of “plaintext” with the “ciphertext”, this method of encryption is immune towards “frequency attack”.

Practical Difficulties

But, the mapping of a single letter to multiple letters makes “Vigenere cipher” was too complicated to use. That may be the reason that this flawless system largely remained dormant. To overcome the shortcomings of the “Vigenere cipher”, cryptographers searched for intermediate cipher — one that is relatively simple to use but reasonably harder to crack through “frequency attacks”. One such example is “homophonic substitution cipher”, where each of the 26 letters is mapped to symbols and the number of these symbols depends upon its (the letter’s) frequency of occurrence in the normal English language, but once a symbol is identified it remains constant throughout the encryption process — making it simpler than the “Vigenere cipher”. However, all these ciphers including the “great cipher” (mapping letter to syllables) got cracked, forcing the cryptographers to adopt the more complex but secure “Vigenere cipher”.

Confidence Shattered

This so-called secure cipher was finally broken by none other than Charles Babbage who is credited for laying the architecture of the modern computer. Now the cryptographers could no longer guarantee secrecy. Though they fought back to regain control in the communication war, but nothing of great significance emerged during the latter half of the nineteenth century. Most of the new ciphers including ADFGVX introduced by germans on 5th March 1918 were broken. But, with the invention of the radio, the need for an effective cipher became very important than ever before, as unlike conventional telegraph, the radio signals can be easily tapped by the enemy.

“Key” is the Key

Then towards the end of the first world war, the scientist in America discovered that “Vigenere cipher” can be used as the basis for a new, more formidable form of encryption which can provide perfect security. The idea was to make the “key” as long as the message. By doing so the technique developed by Babbage for breaking “Vigenere cipher” will not work. As the foundation of Babbage’s technique was the repeated nature of the “key” used to encrypt the message in the “Vigenere cipher”. The only condition was that they “key” used for encryption should not contain any meaningful words but has to devoid of structure and random. Also, the “key” cannot be used more than once. In fact, it was proven mathematically that it is impossible to crack messages send through “Vigenere cipher” encrypted by “keys” generated randomly and not used more than once.

Birth of Enigma

But, the system so developed suffered a huge setback — that is is impossible to generate a large number of random keys which will get used only once. Also, supplying these keys was a big logistic problem. This led to the development of cipher machines the most famous one called the “Enigma” by Arthur Scherbius. It was nothing but an improved version of “Vigenere cipher” with a capability to deal with long “keys” generated randomly. It consists of multiple scramblers connected in series working together to generate the ciphertext, such that there is no correlation between what got generated earlier. The operator wanting to send message must rotate the scramblers to a particular starting position which should be mirrored by the recipient. This was dictated by the code book which is available to everyone using the machine wanting to communicate to each other. Since only one key per day is required, managing the distribution of keys is a much simpler solution than hundreds required to make the “Vigenere cipher” secure. After initial reluctance, the German military adopted it and ended by buying 30K machines — making them virtually invincible during the 2nd world war.

Cracking the Enigma

The start of breaking the enigma code began by activities of french secret service to gain access to the accurate replica of the wiring of the German military version. However, this was not enough, as the strength of the cipher system does not depend upon keeping the machine secret, but the “keys” it uses to encrypt the messages. The “key” was distributed to the operators of the machine to be used for the day for sending all the messages. This created a “risk”, as the enemy snooper could use the large volume of data sent over the day to deduce the “key” and thereby breaking the system. So, as an added precaution the operators were instructed to transmit a “new key” at the start of each message using the “day key”. This would have prevented the enemy snooper enough information to able to deduce the “day key”. However, there was a fatal flaw. The operators were instructed to send the “message key” twice at the start of every message in order to prevent mistakes caused by interference in the radio link. This created the necessary patterns and supported “cillies” (predictable patterns by due to human typing), laid the foundation for cracking the Enigma. The Germans continued to evolve the machine, by adding additional scramblers and making changes in the wiring of the machine. But, the mathematicians working on the problem designed electromechanical bombe to able to automatically tune the machine to recognize the “key” setting to decipher the message.

Solving Key Distribution

All the encryption system that we have discussed till now has to deal with the need to physically distribute keys between the sender and the recipient. This makes the system inflexible and prone to attacks, as the keys can get compromised during the process of transit. No matter how secure a cipher is in theory, in practice, it can be undermined by the problem of “key distribution”. Diffie and Hellman were first to tackle it by focusing on “one-way” functions — where the output cannot be used to deduce the inputs used to generate it. They used the “Mod” function in mathematics to discover a system where there is no need to physically distribute keys between communicating parties. But, for the process to work computers of both the communicating parties should remain on, as the secure key generation process requires the softwares to exchange information in real time. This made the system inconvenient to use, as for one to send an email to the other he has to call him to switch on his computer.

Public Key Cryptography

The “keys” used in the Diffie — Hellman system are symmetric in nature. In other words, both the sender and the receiver are using the same key to encrypt and decrypt the message. However, Rivest, Shamir and Adleman (RSA) came out with a more innovative way of communicating securely. They used “keys” which are asymmetric in nature. This means that the sender is using a different key to encrypt the message from the key used by the receiver to decrypt it. This is a direct gift of mathematics, which makes it impossible to deduce the factors of a number (N) generated by multiplying two prime numbers (p, q) which are sufficiently large. The recipient chooses the numbers (p, q) called “private key” and then shares the number (N) to the whole world as his “public key”. Anyone wanting to send the message to him encrypt it with (N) which only the sender alone can decrypt (as he alone knows the values of p and q). This was the most significant discovery in the history of encryption, as it made the “problem of distributing keys”, a history. This system of asymmetric cryptography is called as RSA and is a form of public-key cryptography.

Pretty Good Privacy

The founders of RSA patented their product and hence it was not available to the general public easily. Also, the NSA had created some barrier to its distribution. To enable this to be used by the public at large who had no knowledge of software and cryptography, Phil Zimmermann (an activist) wrote a software, which can be easily downloaded from the internet. This software also solved the basic problem of the complexity of the RSA code to be dealt by conventional commercial computers. To overcome this problem he used the RSA only to exchange the “key”. Once that is accomplished, all further message was transferred using symmetric cipher system, which is far more efficient than RSA. This brought the system in the hands of the common man. In this process, Phil had to deal with charges levied by NSA for export violation of defense equipment barred legally by law. Also, there was a patent infringement issue associated with Phil’s decision to distribute it freely on the internet. Finally, after a lot battle, Phil was able to overcome both these problem and the world got benefited by his actions — making the PGP software the most downloadable software from the web.

The Future

The RSA system is protected by the inability of the modern computers to compute the factors of the product of two sufficiently large prime numbers. Even after 2000 years of research, no one has been able to break this riddle. So, the only way to crack the RSA code is to design a machine which can compute at a rate which is billion times faster than the present day computers. This is only possible by designing a whole new method of computing called “quantum computers”. The difference between them and the conventional is that the former processes bits in parallel, whereas the latter in series. But, designing a machine capable of handling quantum bits is still a challenge. The world will become totally transparent to nations able to solved this problem first, as they will be able to break all codes currently running on the internet. This will bring us back to the challenge that our forefathers have faced earlier. The same “quantum computer” which will be used to break the system will finally come to the rescue. The encryption systems generated by these computers will be invincible not because no further technological development is possible, but due to the manner in which these quantum systems work (which is based on Heisenberg’s uncertainty principle). However, before we finally stabilize to the new system the current system will be totally shaken up — causing a lot of anxiety and disturbance. Once this happens, how smooth we will be able to manage this transition, only the time will tell.