Enigma Replica – Substitution encryption

One of the earliest encryption algorithms links each letter of the message to another letter through a correspondence table. It is known as monoalphabetic substitution (A ⇔ B, B ⇔ D, etc.). The scheme allows 26! different alphabets (about 4 × 10^26), so an attacker who tries one alphabet at random has roughly a 1 in 4 × 10^26 chance of picking the right one, which makes a naive brute-force search impractical.
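
As a minimal sketch of the scheme described above (in Python; the names `make_key`, `encrypt`, `decrypt` and `KEY_COUNT` are illustrative and do not come from the original article), a key is simply a random permutation of the alphabet, and the number of possible keys is 26!:

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase


def make_key(seed=None):
    """Return a random substitution key: a dict mapping each letter to its substitute."""
    rng = random.Random(seed)
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))


def encrypt(plaintext, key):
    """Substitute every letter through the key; other characters are left alone."""
    return "".join(key.get(ch, ch) for ch in plaintext.upper())


def decrypt(ciphertext, key):
    """Invert the key and apply it to the ciphertext."""
    inverse = {v: k for k, v in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext.upper())


# Number of possible keys: permutations of 26 letters, about 4 x 10^26
KEY_COUNT = math.factorial(26)

key = make_key(seed=1)
message = "HELLO WORLD"
print(encrypt(message, key))
print(decrypt(encrypt(message, key), key))  # round trip gives back the message
print(f"{KEY_COUNT:.2e} possible keys")
```

The seed is only there to make the sketch reproducible; a real key would of course be chosen secretly.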

The first description of this kind of encryption appears in the Kamasutra, which Vatsyayana wrote in the 5th century, drawing on texts from the 4th century B.C. One chapter of the book is dedicated to mlecchita-vikalpà, the art of secret writing (meant to let women conceal their relationships).

Encryption cryptanalysis

Cryptanalysis is the art of recovering the plaintext from a ciphertext without knowing the secret key. A common technique is frequency analysis: in every language, some letters occur more often than others. If the ciphertext was produced by a monoalphabetic substitution, each plaintext letter keeps its frequency under its substitute, so frequency analysis lets us recover the plaintext without knowing the alphabet that was used.

Disclaimer: from this point on, all plaintexts and ciphertexts are in French. The same technique can be used for other languages.

Here is a sample ciphertext:

JTVMNKKTVLDEVVTLWTWITKTXUTLWJ ERUTVTWTHDXATLIUNEWV. JTVIEVWELOWENLVVNOEDJTVLTPTXYT LWTWUTSNLITTVQXTVXUJXWEJEWTON KKXLT.

Let’s look at the letter frequencies for this example:
[Figure: letter frequencies in the ciphertext]
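
The counts behind such a histogram can be reproduced with a few lines of Python. This is only a sketch: the function name `letter_frequencies` is illustrative, and the sample ciphertext is copied verbatim, stray spaces included, since non-letters are ignored when counting.

```python
from collections import Counter

CIPHERTEXT = (
    "JTVMNKKTVLDEVVTLWTWITKTXUTLWJ ERUTVTWTHDXATLIUNEWV. "
    "JTVIEVWELOWENLVVNOEDJTVLTPTXYT LWTWUTSNLITTVQXTVXUJXWEJEWTON KKXLT."
)


def letter_frequencies(text):
    """Return (letter, count, share) tuples, most frequent letter first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return [(ch, n, n / total) for ch, n in counts.most_common()]


# Show the five most frequent cipher letters
for letter, count, share in letter_frequencies(CIPHERTEXT)[:5]:
    print(f"{letter}: {count:2d} ({share:.1%})")
```

On this sample, T comes out well ahead, followed by V, W and L.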

The second histogram shows the general letter frequencies of a text written in French.
[Figure: general letter frequencies in French]

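We can observe that T is the most frequent letter in the ciphertext, while e is the most frequent letter in French. Let’s replace T with e (guessed plaintext letters are written in lowercase, unsolved cipher letters stay in uppercase):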

JeVMNKKeVLDEVVeLWeWIeKeXUeLWJ ERUeVeWeHDXAeLIUNEWV. JeVIEVWELOWENLVVNOEDJeVLePeXYe LWeWUeSNLIeeVQXeVXUJXWEJEWeON KKXLe.
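
Each trial in this walkthrough is a partial substitution: only the letters guessed so far are replaced. A minimal sketch of that step (the helper name `apply_guesses` is illustrative, not part of the original article):

```python
def apply_guesses(ciphertext, guesses):
    """Replace guessed cipher letters (uppercase) with plaintext letters (lowercase).

    Letters without a guess are left untouched, so solved and unsolved
    parts of the text remain visually distinct.
    """
    return "".join(guesses.get(ch, ch) for ch in ciphertext)


sample = "JTVMNKKTV"  # first nine letters of the sample ciphertext
print(apply_guesses(sample, {"T": "e"}))            # JeVMNKKeV
print(apply_guesses(sample, {"T": "e", "V": "s"}))  # JesMNKKes
```

Keeping the guesses in a dictionary makes it easy to add a hypothesis, look at the result, and drop it again if no French word emerges.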

V and W are the next most frequent letters. We try both cases with the letter s:

#1. With v:

JesMNKKesLDEsseLWeWIeKeXUeLWJ ERUeseWeHDXAeLIUNEWs. JesIEsWELOWENLssNOEDJesLePeXYe LWeWUeSNLIeesQXesXUJXWEJEWeON KKXLe.

#2. With w:

JeVMNKKeVLDEVVeLsesIeKeXUeLsJ ERUeVeseHDXAeLIUNEsV. JeVIEVsELOsENLVVNOEDJeVLePeXYe LsesUeSNLIeeVQXeVXUJXsEJEseON KKXLe.

Option #1 seems to give the best result: V ⇔ s and T ⇔ e. W is also quite frequent, so we’ll try replacing it with a, i or t.

#3. With a:

JesMNKKesLDEsseLaeaIeKeXUeLaJ ERUeseaeHDXAeLIUNEas. JesIEsaELOaENLssNOEDJesLePeXYe LaeaUeSNLIeesQXesXUJXaEJEaeON KKXLe.

#4. With i:

JesMNKKesLDEsseLieiIeKeXUeLiJ ERUeseieHDXAeLIUNEis. JesIEsiELOiENLssNOEDJesLePeXYe LieiUeSNLIeesQXesXUJXiEJEieON KKXLe.

#5. With t:

Jes MNKKesLDEsseLt et IeKeXUeLt JERUes et eHDXAeLIUNEts. Jes IEstELOtENLssNOEDJesLePeXYe LtetUeSNLIeesQXesXUJXtEJEteON KKXLe.

Let’s continue with t (option #5), where the French word et appears. The next most frequent letter is L; we’ll try replacing it with a, i or n:

#6. With n:

JesMNKKes nDEssent et IeKeXUentJ ERUeseteHDXAenIUNEts. JesIEstEnOtENns sNOEDJesnePeXYe ntetUeSNnIeesQXesXUJXtEJEteON KKXne.

#7. With i:

JesMNKKesiDEsseit etIeKeXUeitJ ERUeseteHDXAeiIUNEts. JesIEstEiOtENissNOEDJesiePeXYe itetUeSNiIeesQXesXUJXtEJEteON KKXie.

#8. With a:

JesMNKKesaDEsseatetIeKeXUeatJ ERUeseteHDXAeaIUNEts. JesIEstEaOtENassNOEDJesaePeXYe atetUeSNaIeesQXesXUJXtEJEteON KKXae.

n seems to be the best choice: we can now identify classic French letter groups such as the ending ent. In the word nDEssent, we recognize the conjugated verb naissent, so let’s try the substitutions D ⇔ a and E ⇔ i:

JesMNKKes naissent et IeKeXUentJ iRUeseteHaXAenIUNits. JesIistinOtiNns sNOiaJesnePeXYe ntetUeSNnIeesQXesXUJXtiJiteON KKXne.

In Jes IistinOtiNns, we recognize Jes distinctions, so let’s try I ⇔ d, O ⇔ c and N ⇔ o:

Jes MoKKes naissent et deKeXUentJiRUeseteHaXAendUoits. Jes distinctions sociaJes ne PeXYentetUeSondeesQXesXUJXtiJitecoKKXne.

MoKKes looks like hommes, which gives the substitutions M ⇔ h and K ⇔ m:

Jeshommes naissent et demeXUentJ iRUeseteHaXAendUoits. Jesdistinctions sociaJesnePeXYe ntetUeSondeesQXesXUJXtiJite commXne.

In the word commXne, we recognize commune: X ⇔ u.

Jes hommes naissent et demeuUent JiRUeseteHauAendUoits. Jesdistinctions sociaJesnePeuYe ntetUeSondeesQuesuUJutiJite commune.

Then, in demeuUent, we recognize demeurent: U ⇔ r.

Jes hommes naissent et demeurent JiRres et eHauA en droits. Jes distinctions sociaJes ne PeuYent etre Sondees Que sur JutiJite commune.

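The remaining letters follow from context (J ⇔ l, R ⇔ b, H ⇔ g, A ⇔ x, P ⇔ p, Y ⇔ v, S ⇔ f, Q ⇔ q). After restoring the accents and the apostrophe, we finally obtain: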

Les hommes naissent et demeurent libres et égaux en droits. Les distinctions sociales ne peuvent être fondées que sur l’utilité commune.

(This text is Article 1 of the French Declaration of the Rights of Man and of the Citizen of 1789.)
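
The substitutions found along the way can be collected into a single table and checked against the sample ciphertext. The sketch below is only a verification aid: the names `RECOVERED_KEY` and `decrypt` are illustrative, the letters J, R, H, A, P, Y, S and Q are read off from the final plaintext, and the stray spaces of the sample are dropped. Accents and the apostrophe cannot be recovered by the substitution itself.

```python
RECOVERED_KEY = {
    # Found step by step in the walkthrough
    "T": "e", "V": "s", "W": "t", "L": "n", "D": "a", "E": "i",
    "I": "d", "O": "c", "N": "o", "M": "h", "K": "m", "X": "u", "U": "r",
    # Deduced from context once the text is almost readable
    "J": "l", "R": "b", "H": "g", "A": "x",
    "P": "p", "Y": "v", "S": "f", "Q": "q",
}

CIPHERTEXT = (
    "JTVMNKKTVLDEVVTLWTWITKTXUTLWJ ERUTVTWTHDXATLIUNEWV. "
    "JTVIEVWELOWENLVVNOEDJTVLTPTXYT LWTWUTSNLITTVQXTVXUJXWEJEWTON KKXLT."
)


def decrypt(ciphertext, key):
    """Apply the recovered key, skipping the spaces present in the sample."""
    return "".join(key.get(ch, ch) for ch in ciphertext if ch != " ")


plaintext = decrypt(CIPHERTEXT, RECOVERED_KEY)
print(plaintext)
```

Only 21 cipher letters occur in this sample, so the remaining five letters of the alphabet stay undetermined.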

However, this method has a limit: the text to be decoded must be long enough to yield reliable letter frequencies. From a cryptanalysis point of view, a text of 150 to 200 words is long enough to reach a correct solution. With a shorter text, it is more difficult. Try to decipher this message:

Y’EGGEJG EHI UJZMVAE P DEGHEKEGEG EHI SVMOBYVFJE.

To help you, here is its letter frequency analysis:
[Figure: letter frequencies in the test message]