Name(s)                                                                            :
Math 390, Cryptology Group Activity 3
  1. Select a short message (about 20 characters) and a short key (about 5 characters). Record them below.

  2. message:
    key:
  3. Using your Vigenere square, encrypt your message (by hand). Record it below and on the board, under your group number. On the board, also record your key.

  4.  

     
     
     
     
     
     
     
     
     
     
     

  5. Use the Vigenere applet to check your hand encryption of the short message.

  6.  

     
     
     
     
     
     
     
     
     

  7. There are 8 groups doing this activity. If you are group n, decrypt (by hand) the message from group 2*n mod 9. To figure out how to decrypt by hand, look at the first character of your own message, of your key, and of your cipher. Using the first characters of the cipher and key, locate the first character of the message on the Vigenere square. This should suggest an algorithm for decrypting (similar to, but different from, the encryption method). Be sure that everyone in your group understands this process. Record the decrypted message below.

  8.  

     
     
     
     
     
     
     
     
     
     
     

  9. Check your work by using the Vigenere applet to re-decrypt the message you just decrypted by hand.
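    If the applet is not at hand, the same check can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual convention A = 0, ..., Z = 25 with the key repeated along the message; the function name `vigenere` and the sample message and key are hypothetical, and the applet may treat spaces and case differently.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    """Encrypt (or decrypt) with the Vigenere cipher; non-letters are dropped."""
    letters = [c for c in text.upper() if c in ALPHABET]
    shifts = [ALPHABET.index(c) for c in key.upper() if c in ALPHABET]
    sign = -1 if decrypt else 1  # decrypting subtracts the key letter instead of adding it
    out = []
    for i, c in enumerate(letters):
        shift = sign * shifts[i % len(shifts)]  # the key repeats along the message
        out.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
    return "".join(out)

# Hypothetical sample message and key, not taken from the activity.
cipher = vigenere("ATTACKATDAWN", "LEMON")
print(cipher)                                   # LXFOPVEFRNHR
print(vigenere(cipher, "LEMON", decrypt=True))  # ATTACKATDAWN
```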
  10. Write down a fairly long (say 30 words) plain text message. Use regular old English for this; you want a lot of trigrams to be repeated in your message. You might want to add lines, or hash marks, every five or ten characters in this message.

  11.  

     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

  12. Using the Kasiski applet, record all the trigrams that appear at least twice in your message (print out the output of running the Kasiski attack on your plain text message). How many of these repeated trigrams are there? If a trigram appears twice, it is repeated once; if it appears three times, it is repeated twice; in general, a trigram that appears n times is repeated n - 1 times.
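    The counting rule above (n appearances means n - 1 repeats) can be sketched as follows. This is only an illustration, not the Kasiski applet's code: the function names and the sample sentence are hypothetical, positions are counted from 0 after spaces and punctuation are removed, and the applet may number positions differently.

```python
from collections import defaultdict

def repeated_trigrams(text):
    """Map each trigram that appears at least twice to its starting positions."""
    letters = "".join(c for c in text.upper() if c.isalpha())
    positions = defaultdict(list)
    for i in range(len(letters) - 2):
        positions[letters[i:i + 3]].append(i)
    return {tri: pos for tri, pos in positions.items() if len(pos) >= 2}

def count_repeats(trigrams):
    """A trigram that appears n times counts as n - 1 repeats."""
    return sum(len(pos) - 1 for pos in trigrams.values())

# Hypothetical sample sentence.
sample = "the cat and the dog and the bird"
found = repeated_trigrams(sample)
print(found)                 # THE appears three times; AND, NDT, DTH twice each
print(count_repeats(found))  # 5
```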

  13.  

     
     
     
     
     
     
     
     
     
     
     
     

  14. Select a keyword (or random characters) whose length is given by your group number below:
    Group         1, 2, 3    4, 5, 6    7, 8
    Key length    3          5          8
    Use the Vigenere applet to encrypt your long message. Record it below.
     
     
     
     
     
     
     
     
     
  1. Some of the repeated trigrams in your plain text message should have survived in your cipher (they are still repeated trigrams, but different trigrams than they were in the plain text message). Other trigrams from your plain text should have disappeared (the second occurrence of the repeated plain text trigram was encrypted to a different trigram than the first occurrence). How many repeated trigrams are there in your cipher? Where are they located? (Print out the output from the Kasiski applet again.)

  2.  

     
     
     
     
     
     
     
     
     
     
     
     
     
     

  3. Use the Kasiski applet to locate all the repeated trigrams in your cipher. Break these into the two groups indicated below and count the number in each group.

    Those that came from plain text repeated trigrams:
     

    Those that are present in the cipher, but not in the plain text (so they came from the polyalphabetic nature of the Vigenere encryption):
     

  4. How many repeated trigrams disappeared?

    Is the percentage that survived roughly 1 / key length?
     
     
     
     

  5. Which group, the first (from plain text) or the second (from polyalphabetic bad luck), has the majority of the repeated trigrams?

  6.  

     
     
     
     
     
     
     
     
     
     
     
     
     

  7. Count the number of characters between each pair of repeated cipher trigrams of the first variety. Are these distances all divisible by the length of your key? You can do this fast with the Kasiski applet.
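    The distance check can also be sketched in Python. This sketch assumes that "distance" means the difference between the starting positions of consecutive occurrences of a trigram; the trigrams, positions, and key length below are made up for illustration and do not come from any real cipher.

```python
def pair_distances(positions):
    """Distances between consecutive occurrences of one trigram."""
    return [b - a for a, b in zip(positions, positions[1:])]

def fraction_divisible(distances, key_length):
    """Fraction of the distances that are multiples of the key length."""
    hits = sum(1 for d in distances if d % key_length == 0)
    return hits / len(distances)

# Hypothetical starting positions of repeated cipher trigrams.
positions_by_trigram = {"XQJ": [4, 19, 49], "MWB": [10, 35], "TZK": [7, 18]}
key_length = 5  # hypothetical key length

distances = [d for pos in positions_by_trigram.values() for d in pair_distances(pos)]
print(distances)                                  # [15, 30, 25, 11]
print(fraction_divisible(distances, key_length))  # 0.75
```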

  8.  

     
     
     
     
     
     
     
     
     
     
     
     
     

  9. Count the number of characters between each pair of repeated cipher trigrams of the second variety. Are these distances all divisible by the length of your key? If not, what percentage are divisible by your key's length?

  10.  

     
     
     
     
     
     
     
     
     
     
     
     
     

    We will see a way to exploit the information above to help decide the key length of a Vigenere encrypted cipher later. Now, we'll explore another new notion that plays a key role in cipher text only Vigenere attacks: the index of coincidence. Given a source of text (like plain old English, or affine encrypted English, etc.), the index of coincidence is defined to be the probability of reaching into the source text, selecting two characters at random, and having those two characters be the same. If a source consists of random characters, where each character, a to z, appears about 1/26 of the time, then the index of coincidence (IC) is 1 / 26 = 0.03846 or so (the first character can be anything, and the second is the same character 1 time in 26). On the other hand, everyday English is known to have an IC of about 0.065. That difference is large enough to exploit in a significant way.
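    For a finite piece of text, one common way to turn this definition into a number is to count, for each letter, the ways of picking two copies of it, and divide by the number of ways of picking any two characters. The sketch below assumes that "without replacement" reading of the definition; the function name `index_of_coincidence` is hypothetical and this is not necessarily how the IC applet computes its value.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two distinct positions chosen at random hold the same letter."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    counts = Counter(letters)
    # Each letter seen k times contributes k * (k - 1) ordered matching pairs
    # out of n * (n - 1) ordered pairs overall.
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

print(round(1 / 26, 5))              # 0.03846, the value for uniformly random letters
print(index_of_coincidence("AABB"))  # 4 / 12, about 0.333
```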

    Here is a quick and dirty way to estimate the IC for a source of text (assuming, that is, you have a fair bit of text). Take two independent strings of text from the source, say s1 and s2; independent means that the characters in s1 have no influence on the characters in s2. For English, independent means that the strings are located "not too close together" in the source text, say at least 10 characters apart. We've seen that certain digrams and trigrams appear much more often than others, so if you know one character of English, you can tell a fair bit about the two or three characters that follow it, but you'd be hard pressed to know much about which character will appear 10 characters from this one (other than that it has about a 0.13 chance of being an "e", etc.). The two strings should have the same length, say N characters. Line them up one above the other, and count the number of positions, say x, where the two characters above one another are the same. The IC should be about x / N. Here are the first several characters from a few sentences back:

    WewillseeawaytoexploittheinformationabovetohelpdecidethekeylengthofaVigenereencrypted
    cipherlaterNowwellexploreanothernewnotionthatplaysakeyroleinciphertextonlyVigenereatt

    We got 7 matches in 85 characters, giving an estimate of 7 / 85 = 0.08235 (which is high, but noticeably different from 0.038). In theory, I didn't need to take different sentences to estimate this IC; I could just as well have taken the top string, deleted the first 10 or 15 characters, and lined it up with itself. Let's see how ICs vary with different sources. Use the IC applet to get these estimates.
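    The line-up-and-count estimate is easy to sketch in Python, using the two 85-character strings from the example. The function names are hypothetical and case is ignored when comparing; this illustrates the x / N estimate and is not the IC applet's code.

```python
def aligned_matches(s1, s2):
    """Return (x, N): matching positions and the number of positions compared."""
    n = min(len(s1), len(s2))
    x = sum(1 for a, b in zip(s1.upper(), s2.upper()) if a == b)
    return x, n

def estimate_ic(s1, s2):
    """Estimate the IC as x / N."""
    x, n = aligned_matches(s1, s2)
    return x / n

s1 = "WewillseeawaytoexploittheinformationabovetohelpdecidethekeylengthofaVigenereencrypted"
s2 = "cipherlaterNowwellexploreanothernewnotionthatplaysakeyroleinciphertextonlyVigenereatt"

print(aligned_matches(s1, s2))        # (7, 85)
print(round(estimate_ic(s1, s2), 5))  # 0.08235

# Lining a string up with itself, minus its first 10 characters:
print(estimate_ic(s1 + s2, (s1 + s2)[10:]))
```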
     

  11. Take your plain text message above and calculate its IC by running it against the same characters (itself), minus the first n characters, where n = 1, 2, 5, 10 and 20. Record the results below. Remember, English should have an IC near 0.065.
n 1 2 5 10 20
IC          
  1. Based on your results above, what value of n seems like a safe "shift", or number of characters to discard (where did the ICs become relatively close to 0.065)?

  2.  

     
     
     
     
     
     
     
     
     
     
     

  3. Now take the same characters and shift (like Caesar) encrypt them (use the shift applet). Check the IC of the plain text versus the shift encrypted text. Is it low (close to 0.038) or high (close to 0.065)?

  4.  

     
     
     
     
     
     
     
     
     
     
     

  5. Now let X = your plain text shift encrypted with some key, Y = all but the first 20 characters of your plain text shift encrypted with the same key, and Z = all but the first 20 characters of your plain text shift encrypted with a different key. Calculate IC(X,Y), IC(X,Z), and IC(Y,Z). Which were close to 0.065? Which were low?

  6.  

     
     
     
     
     
     
     
     
     
     
     
     
     

  7. Let X be defined as above (plain text with some shift encryption). Take your plain text from above, and delete the first 15 characters. Use the shift applet to "decrypt with all keys". This will print out all 26 shift decryptions of your plain text (minus the first 15 characters). Calculate the IC of X (make sure the first 15 characters are included) with each of the decryptions from the shift applet (these decryptions are missing the first 15 characters). Record the results below.
k = 0 1 2 3 4 5 6 7 8 9 10 11 12
IC                          

 
k = 13 14 15 16 17 18 19 20 21 22 23 24 25
IC                          
    While it's easy to read off the correct decryption, this depends on the fact that what you're decrypting is sensible English, and all of the characters are present. If you were decrypting every fifth letter, or an anagrammed version of English, reading would be challenging! Do the results above suggest another way to tell if a shift cipher was encrypted with a certain key? Could a machine perform this check?
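    The table of 26 ICs can be produced mechanically, which is one way to explore the last question. The sketch below follows the setup described above (X is the shift-encrypted plain text; each candidate is the plain text minus its first 15 characters, decrypted with key k). All function names, the sample text, and the key 3 are hypothetical, and the sketch makes no claim about what the table will show for your message.

```python
import string

ALPHABET = string.ascii_uppercase

def clean(text):
    """Keep only the letters A-Z, upper-cased."""
    return "".join(c for c in text.upper() if c in ALPHABET)

def shift(text, k):
    """Shift every letter k places forward (a negative k shifts backward)."""
    return "".join(ALPHABET[(ALPHABET.index(c) + k) % 26] for c in clean(text))

def estimate_ic(s1, s2):
    """Fraction of aligned positions where the two strings agree."""
    n = min(len(s1), len(s2))
    return sum(a == b for a, b in zip(s1, s2)) / n

def ic_table(plain, key, drop=15):
    """IC of X against each of the 26 shift decryptions of the truncated plain text."""
    x = shift(plain, key)          # X keeps its first `drop` characters
    tail = clean(plain)[drop:]     # the candidates are missing them
    return [estimate_ic(x, shift(tail, -k)) for k in range(26)]

# Hypothetical sample text and key; the text must have more than `drop` letters.
plain = ("We will see a way to exploit the information above to help decide "
         "the key length of a Vigenere encrypted cipher later")
table = ic_table(plain, key=3)
for k, ic in enumerate(table):
    print(k, round(ic, 3))
```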
     
     
     
     
     
     
  1. One last batch of calculations. Take your Vigenere encrypted long message and call it X. Let Xk be your Vigenere cipher text without the first 3*keyLength + k characters (that is, delete 3 times the number of characters in your key, plus k = 1, 2, 3, ..., 25 more characters, from the beginning of your cipher). Calculate IC(X, Xk) for k = 1 to 25 (use the Index of Coincidence applet). Record the results below.
k = 1 2 3 4 5 6 7 8 9 10 11 12
IC                        

 
k = 13 14 15 16 17 18 19 20 21 22 23 24 25
IC                          
    Do you notice anything about the large values in the table above?
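    As with the shift cipher, this table can be filled in by a short program instead of the applet. The sketch below assumes the offset 3*keyLength + k given in the formula for Xk and the x / N estimate of the IC; the function names, the sample message, and the key "LEMON" are hypothetical, and the cipher must be longer than 3*keyLength + 25 characters for every entry to be defined.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere_encrypt(text, key):
    """Vigenere encryption with A = 0, ..., Z = 25; non-letters are dropped."""
    letters = [c for c in text.upper() if c in ALPHABET]
    shifts = [ALPHABET.index(c) for c in key.upper() if c in ALPHABET]
    return "".join(
        ALPHABET[(ALPHABET.index(c) + shifts[i % len(shifts)]) % 26]
        for i, c in enumerate(letters)
    )

def estimate_ic(s1, s2):
    """Fraction of aligned positions where the two strings agree."""
    n = min(len(s1), len(s2))
    return sum(a == b for a, b in zip(s1, s2)) / n

def ic_versus_offsets(cipher, key_length, skip_keys=3, max_k=25):
    """IC(X, Xk) for k = 1..max_k, where Xk drops skip_keys*key_length + k characters."""
    base = skip_keys * key_length
    return {k: estimate_ic(cipher, cipher[base + k:]) for k in range(1, max_k + 1)}

# Hypothetical message and key.
message = ("We will see a way to exploit the information above to help decide "
           "the key length of a Vigenere encrypted cipher later")
x = vigenere_encrypt(message, "LEMON")
for k, ic in ic_versus_offsets(x, key_length=5).items():
    print(k, round(ic, 3))
```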