Thursday, October 2, 2008

DATA ENCRYPTION TECHNIQUES



Introduction

Often there has been a need to protect information from 'prying eyes'. In the electronic age, information that could otherwise benefit or educate a group or individual can also be used against such groups or individuals. Industrial espionage among highly competitive businesses often requires that extensive security measures be put into place. And, those who wish to exercise their personal freedom, outside of the oppressive nature of governments, may also wish to encrypt certain information to avoid suffering the penalties of going against the wishes of those who attempt to control.
Still, the methods of data encryption and decryption are relatively straightforward, and easily mastered. I have been doing data encryption since my college days, when I used an encryption algorithm to store game programs and system information files on the university mini-computer, safe from 'prying eyes'. These were files that raised eyebrows amongst those who did not approve of such things, but were harmless [we were always careful NOT to run our games while people were trying to get work done on the machine]. I was occasionally asked what this "rather large file" contained, and I once demonstrated the program that accessed it, but you needed a password to get to 'certain files' nonetheless. And, some files needed a separate encryption program to decipher them.
Fortunately, my efforts back then have paid off now, in the business world. I became rather good at keeping information away from 'prying eyes', by making it very difficult to decipher. Our S.F.T. Setup Gizmo application has an encrypted 'authorization code' scheme, allowing companies to evaluate a fully-featured version of the actual software before purchasing it, and (when licensed) a similar scheme authorizes a maximum number of users based on the license purchased by the customer. Each of these features requires some type of encrypted data to ensure that the 'lockout' works correctly, and cannot be bypassed, protecting both our interests and preserving the 'full-featured demo' capability for our customers.
Methods of Encrypting Data

Traditionally, several methods can be used to encrypt data streams, all of which can easily be implemented through software, but not so easily decrypted when either the original or its encrypted data stream are unavailable. (When both source and encrypted data are available, code-breaking becomes much simpler, though it is not necessarily easy). The best encryption methods have little effect on system performance, and may contain other benefits (such as data compression) built in. The well-known 'PKZIP®' utility offers both compression AND data encryption in this manner. Also DBMS packages have often included some kind of encryption scheme so that a standard 'file copy' cannot be used to read sensitive information that might otherwise require some kind of password to access. They also need 'high performance' methods to encode and decode the data.
Fortunately, the simplest of all of the methods, the 'translation table', meets this need very well. Each 'chunk' of data (usually 1 byte) is used as an offset within a 'translation table', and the resulting 'translated' value from within the table is then written into the output stream. The encryption and decryption programs would each use a table that translates to and from the encrypted data. In fact, the 80x86 CPU's even have an instruction 'XLAT' that lends itself to this purpose at the hardware level. While this method is very simple and fast, the down side is that once the translation table is known, the code is broken. Further, such a method is relatively straightforward for code breakers to decipher - such code methods have been used for years, even before the advent of the computer. Still, for general "unreadability" of encoded data, without adverse effects on performance, the 'translation table' method lends itself well.
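As a rough illustration, a translation-table pass over a buffer might look like this in C; the table contents themselves are whatever secret permutation of the values 0..255 you choose, and the matching decryption table is simply the inverse permutation.

#include <stddef.h>

/* 'translation table' sketch: one table lookup per byte, like XLAT */
void xlat_encrypt(unsigned char *buf, size_t len, const unsigned char table[256])
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
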
A modification to the 'translation table' uses 2 or more tables, based on the position of the bytes within the data stream, or on the data stream itself. Decoding becomes more complex, since you have to reverse the same process reliably. But, by the use of more than one translation table, especially when implemented in a 'pseudo-random' order, this adaptation makes code breaking relatively difficult. An example of this method might use translation table 'A' on all of the 'even' bytes, and translation table 'B' on all of the 'odd' bytes. Unless a potential code breaker knows that there are exactly 2 tables, even with both source and encrypted data available the deciphering process is relatively difficult.
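A sketch of the two-table variation, alternating on even and odd byte positions (decryption applies the two inverse tables in the same even/odd pattern):

#include <stddef.h>

/* table 'a' handles even offsets, table 'b' handles odd offsets */
void xlat2_encrypt(unsigned char *buf, size_t len,
                   const unsigned char a[256], const unsigned char b[256])
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = (i & 1) ? b[buf[i]] : a[buf[i]];
}
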
Similar to using a translation table, 'data repositioning' lends itself to use by a computer, but takes considerably more time to accomplish. A buffer of data is read from the input, then the order of the bytes (or other 'chunk' size) is rearranged, and written 'out of order'. The decryption program then reads this back in, and puts them back 'in order'. Often such a method is best used in combination with one or more of the other encryption methods mentioned here, making it even more difficult for code breakers to determine how to decipher your encrypted data. As an example, consider an anagram. The letters are all there, but the order has been changed. Some anagrams are easier than others to decipher, but a well written anagram is a brain teaser nonetheless, especially if it's intentionally misleading.
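A minimal sketch of data repositioning on fixed-size blocks, assuming an 8-byte block and a secret permutation of the positions 0..7 (a real program would also have to handle a short final block):

#include <stddef.h>
#include <string.h>

/* rearrange each 8-byte block according to 'order', a permutation of 0..7;
   the decryption program applies the inverse permutation */
void reposition_blocks(unsigned char *buf, size_t len, const int order[8])
{
    unsigned char tmp[8];
    size_t i;
    int j;
    for (i = 0; i + 8 <= len; i += 8)
    {
        for (j = 0; j < 8; j++)
            tmp[j] = buf[i + order[j]];   /* write the bytes 'out of order' */
        memcpy(buf + i, tmp, 8);
    }
}
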
My favorite methods, however, involve something that only computers can do: word/byte rotation and XOR bit masking. If you rotate the words or bytes within a data stream, using multiple and variable direction and duration of rotation, in an easily reproducible pattern, you can quickly encode a stream of data with a method that is nearly impossible to break. Further, if you use an 'XOR mask' in combination with this ('flipping' the bits in certain positions from 1 to 0, or 0 to 1) you end up making the code breaking process even more difficult. The best combination would also use 'pseudo random' effects, the easiest of which would involve a simple sequence like Fibonacci numbers. The sequence '1,1,2,3,5,...' is easily generated by adding the previous 2 numbers in the sequence to get the next. Doing modular arithmetic on the result (i.e. Fib. sequence mod 3 to get rotation factor) and operating on multiple byte sequences (using a prime number of bytes for rotation is usually a good guideline) will make the code breaker's job even more difficult, adding the 'pseudo-random' effect that is easily reproduced by your decryption program.
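As one possible sketch of the idea (not any particular product's code), here is a byte-rotation pass whose rotation amounts come from a Fibonacci-style sequence seeded by two numbers, combined with an XOR mask; decryption regenerates the same sequence, XORs first, then rotates right by the same amounts.

#include <stddef.h>

void rot_xor_encrypt(unsigned char *buf, size_t len,
                     unsigned int f1, unsigned int f2, unsigned char mask)
{
    size_t i;
    for (i = 0; i < len; i++)
    {
        unsigned int f3 = f1 + f2;          /* next Fibonacci-style value  */
        int r = (int)(f3 % 7) + 1;          /* rotation amount, 1..7 bits  */
        unsigned char b = buf[i];
        b = (unsigned char)((b << r) | (b >> (8 - r)));   /* rotate left   */
        buf[i] = b ^ mask;                  /* flip the masked bits        */
        f1 = f2;
        f2 = f3;
    }
}
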
In some cases, you may want to detect whether data has been tampered with, and encrypt some kind of 'checksum' into the data stream itself. This is useful not only for authorization codes but for programs themselves. A virus that infects such a 'protected' program would no doubt neglect the encryption algorithm and authorization/checksum signature. The program could then check itself each time it loads, and thus detect the presence of file corruption. Naturally, such a method would have to be kept VERY secret, as virus programmers represent the worst of the code breakers: those who willfully use information to do damage to others. As such, the use of encryption is mandatory for any decent anti-virus protection scheme.
A cyclic redundancy check is one typically used checksum method. It uses bit rotation and an XOR mask to generate a 16-bit or 32-bit value for a data stream, such that one missing bit or 2 interchanged bits are more or less guaranteed to cause a 'checksum error'. This method has been used for file transfers for a long time, such as with XMODEM-CRC. The method is somewhat well documented, and standard. But, a deviation from the standard CRC method might be useful for the purpose of detecting a problem in an encrypted data stream, or within a program file that checks itself for viruses.
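For reference, a bitwise CRC-16 of the kind used by XMODEM-CRC (polynomial 0x1021, initial value 0, data processed most-significant-bit first) only takes a few lines; a private variation of the polynomial or initial value would give the 'non-standard' check described above.

#include <stddef.h>

unsigned short crc16_xmodem(const unsigned char *data, size_t len)
{
    unsigned short crc = 0;
    size_t i;
    int bit;
    for (i = 0; i < len; i++)
    {
        crc ^= (unsigned short)(data[i] << 8);
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (unsigned short)((crc << 1) ^ 0x1021)
                                 : (unsigned short)(crc << 1);
    }
    return crc;
}
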
Key-Based Encryption Algorithms

One very important feature of a good encryption scheme is the ability to specify a 'key' or 'password' of some kind, and have the encryption method alter itself such that each 'key' or 'password' produces a different encrypted output, which requires a unique 'key' or 'password' to decrypt. This can either be a 'symmetrical' key (both encrypt and decrypt use the same key) or 'asymmetrical' (encrypt and decrypt keys are different). The popular 'PGP' public key encryption, and the 'RSA' encryption that it's based on, uses an 'asymmetrical' key. The encryption key, the 'public key', is significantly different from the decryption key, the 'private key', such that attempting to derive the private key from the public key involves many many hours of computing time, making it impractical at best.

There are few operations in mathematics that are truly 'irreversible'. In nearly all cases, if an operation is performed on 'a', resulting in 'b', you can perform an equivalent operation on 'b' to get 'a'. In some cases you may get the absolute value (such as a square root), or the operation may be undefined (such as dividing by zero). However, in the case of 'undefined' operations, it may be possible to base a key on an algorithm such that an operation like division by zero would PREVENT a public key from being translated into a private key. As such, only 'trial and error' would remain, which would require a significant amount of processing time to create the private key from the public key.

In the case of the RSA encryption algorithm, it uses very large prime numbers to generate the public key and the private key. Although it would be possible to factor out the public key to get the private key (a trivial matter once the 2 prime factors are known), the numbers are so large as to make it very impractical to do so. The encryption algorithm itself is ALSO very slow, which makes it impractical to use RSA to encrypt large data sets. What PGP does (and most other RSA-based encryption schemes do) is encrypt a symmetrical key using the public key, then the remainder of the data is encrypted with a faster algorithm using the symmetrical key. The symmetrical key itself is randomly generated, so that the only way to get it would be by using the private key to decrypt the RSA-encrypted symmetrical key.

Example: Suppose you want to encrypt data (let's say this web page) with a key of 12345. Using your public key, you RSA-encrypt the 12345, and put that at the front of the data stream (possibly followed by a marker or preceded by a data length to distinguish it from the rest of the data). THEN, you follow the 'encrypted key' data with the encrypted web page text, encrypted using your favorite method and the key '12345'. Upon receipt, the decrypt program looks for (and finds) the encrypted key, uses the 'private key' to decrypt it, and gets back the '12345'. It then locates the beginning of the encrypted data stream, and applies the key '12345' to decrypt the data. The result: a very well protected data stream that is reliably and efficiently encrypted, transmitted, and decrypted.
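A hypothetical sketch of that framing is shown below; the two cipher routines are placeholder stubs (NOT real RSA or a real symmetric cipher) included only to show where the wrapped key and the encrypted payload sit in the output stream.

#include <string.h>

static size_t rsa_wrap_key_stub(const unsigned char *key, size_t keylen,
                                unsigned char *out)
{
    memcpy(out, key, keylen);        /* stand-in for RSA with the public key */
    return keylen;
}

static void fast_cipher_stub(unsigned char *buf, size_t len,
                             const unsigned char *key, size_t keylen)
{
    size_t i;
    for (i = 0; i < len; i++)        /* stand-in for the fast symmetric step */
        buf[i] ^= key[i % keylen];
}

size_t build_message(unsigned char *out, const unsigned char *session_key,
                     size_t keylen, unsigned char *payload, size_t paylen)
{
    size_t wrapped = rsa_wrap_key_stub(session_key, keylen, out + 1);
    out[0] = (unsigned char)wrapped;              /* length marker for the key */
    fast_cipher_stub(payload, paylen, session_key, keylen);
    memcpy(out + 1 + wrapped, payload, paylen);
    return 1 + wrapped + paylen;                  /* total bytes in the stream */
}
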

Source files for a simple RSA-based encryption algorithm can be found HERE:
ftp://ftp.funet.fi/pub/crypt/cryptography/asymmetric/rsa

It is somewhat difficult to write a front-end to get this code to work (I have done so myself), but for the sake of illustration, the method actually DOES work and by studying the code you can understand the processes involved in RSA encryption. RSA, incidentally, is reportedly patented through the year 2000, and may be extended beyond that, so commercial use of RSA requires royalty payments to the patent holder (www.rsa.com). But studying the methods and experimenting with it is free, and with source code being published in print (PGP) and outside the U.S., it's a good way to learn how it works, and maybe to help you write a better algorithm yourself.
A brand new 'multi-phase' method (invented by ME)

I have (somewhat) recently developed and tested an encryption method that is (in my opinion) nearly uncrackable. The reasons why will be pretty obvious when you take a look at the method itself. I shall explain it in prose, primarily to avoid any chance of prosecution by those 'GUMMINT' authorities who think that they oughta be able to snoop on anyone they wish, having a 'back door' to any encryption scheme, etc. etc. etc.. Well, if I make the METHOD public, they should have the same chance as ANYONE ELSE for decrypting things that use this method.
(Subsequent note: according to THIS web site, it could be described as an asynchronous stream cipher with a symmetrical key)
So, here goes (description of method first made public on June 1, 1998):

Using a set of numbers (let's say a 128-bit key, or 256-bit key if you use 64-bit integers), generate a repeatable but highly randomized pseudo-random number sequence (see below for an example of a pseudo-random number generator).
256 entries at a time, use the random number sequence to generate arrays of "cipher translation tables" as follows:
fill an array of integers with 256 random numbers (see below)
sort the numbers using a method (like pointers) that lets you know the original position of the corresponding number
using the original positions of the now-sorted integers, generate a table of randomly sorted numbers between 0 and 255. If you can't figure out how to make this work, you could give up now... but on a kinder note, I've supplied some source below to show how this might be done - generically, of course.
Now, generate a specific number of 256-byte tables. Let the random number generator continue "in sequence" for all of these tables, so that each table is different.
Next, use a "shotgun technique" to generate "de-crypt" cipher tables. Basically, if a maps to b, then b must map to a. So, b[a[n]] = n. get it? ('n' is a value between 0 and 255). Assign these values in a loop, with a set of 256-byte 'decrypt' tables that correspond to the 256-byte 'encrypt' tables you generated in the preceding step.
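For one pair of tables, that 'shotgun' step is just a loop (run once per table):

/* if the encrypt table maps n to b, the decrypt table must map b back to n */
void build_decrypt_table(const unsigned char encrypt_tbl[256],
                         unsigned char decrypt_tbl[256])
{
    int n;
    for (n = 0; n < 256; n++)
        decrypt_tbl[encrypt_tbl[n]] = (unsigned char)n;   /* b[a[n]] = n */
}
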

NOTE: I first tried this on a P5 133MHz machine, and it took 1 second to generate the 2 256x256 tables (128kb total). With this method, I inserted additional randomized 'table order', so that the order in which I created the 256-byte tables was part of a 2nd pseudo-random sequence, fed by 2 additional 16-bit keys.

Now that you have the translation tables, the basic cipher works like this: the previous byte's encrypted value is the index that selects which 256-byte translation table to use. Alternately, for improved encryption, you can use more than one preceding byte, and either use a 'checksum' or a CRC algorithm to generate the index byte. You can then 'mod' it with the # of tables if you use fewer than 256 256-byte tables. Assuming the table is a 256x256 array, it would look like this:

crypto1 = a[crypto0][value]

where 'crypto1' is the encrypted byte, and 'crypto0' is the previous byte's encrypted value (or a function of several previous values). Naturally, the 1st byte will need a "seed", which must be known. This may increase the total cipher size by an additional 8 bits if you use 256x256 tables. Or, you can use the key you generated the random list with, perhaps taking the CRC of it, or using it as a "lead in" encrypted byte stream. Incidentally, I have tested this method using 16 'preceding' bytes to generate the table index, starting with the 128-bit key as the initial seed of '16 previous bytes'. I was then able to encrypt about 100kbytes per second with this algorithm, after the initial time delay in creating the table.
On the decrypt, you do the same thing. Just make sure you use 'encrypted' values as your table index both times. Or, use 'decrypted' values if you'd rather. They must, of course, match.
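A minimal sketch of the single-previous-byte case, chaining on the ENCRYPTED value as described; 'enc_tbl' is the 256x256 encrypt array, 'dec_tbl' holds the per-row inverse tables, and 'seed' is the known starting value for the first byte.

#include <stddef.h>

void table_encrypt(unsigned char *buf, size_t len,
                   const unsigned char enc_tbl[256][256], unsigned char seed)
{
    unsigned char prev = seed;           /* 'crypto0' for the first byte */
    size_t i;
    for (i = 0; i < len; i++)
    {
        buf[i] = enc_tbl[prev][buf[i]];  /* crypto1 = a[crypto0][value]  */
        prev = buf[i];                   /* chain on the ENCRYPTED value */
    }
}

void table_decrypt(unsigned char *buf, size_t len,
                   const unsigned char dec_tbl[256][256], unsigned char seed)
{
    unsigned char prev = seed;
    size_t i;
    for (i = 0; i < len; i++)
    {
        unsigned char c = buf[i];        /* keep the encrypted value...     */
        buf[i] = dec_tbl[prev][c];       /* ...recover the plaintext byte   */
        prev = c;                        /* ...and chain on it, as above    */
    }
}
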

The pseudo-random sequence can be designed by YOU to be ANYTHING that YOU want. Without details on the sequence, the cipher key itself is worthless. PLUS, a block of 'e' ASCII characters will translate into random garbage, each byte depending upon the encrypted value of the preceding byte (which is why I use the ENCRYPTED value, not the actual value, as the table index). You'll get a random set of permutations for any single character, permutations that are of random length, that effectively hide the true size of the cipher.

However, if you're at a loss for a random sequence, consider a FIBONACCI sequence, using 2 DWORD's (like from your encryption key) as "seed" numbers, and possibly a 3rd DWORD as an 'XOR' mask. An algorithm for generating a random sequence of numbers, not necessarily connected with encrypting data, might look as follows:
unsigned long dw1, dw2, dw3, dwMask;
int i1;
unsigned long aRandom[256];

dw1 = {seed #1};
dw2 = {seed #2};
dwMask = {seed #3};
// this gives you 3 32-bit "seeds", or 96 bits total

// Fibonacci-style generator: add the previous 2 values, XOR with the mask
for(i1 = 0; i1 < 256; i1++)
{
  dw3 = (dw1 + dw2) ^ dwMask;
  aRandom[i1] = dw3;
  dw1 = dw2;
  dw2 = dw3;
}

...

// comparison callback for 'qsort' - compares the values the pointers refer to
int MySortProc(const void *p1, const void *p2)
{
  unsigned long * const *pp1 = (unsigned long * const *)p1;
  unsigned long * const *pp2 = (unsigned long * const *)p2;

  if(**pp1 < **pp2)
    return(-1);
  else if(**pp1 > **pp2)
    return(1);

  return(0);
}

...

int i1;
unsigned long *apRandom[256];
unsigned long aRandom[256]; // same array as before, in this case
int aResult[256]; // results go here

// point each entry at the corresponding element of 'aRandom'
for(i1 = 0; i1 < 256; i1++)
{
  apRandom[i1] = aRandom + i1;
}

// now sort it (qsort is declared in <stdlib.h>)
qsort(apRandom, 256, sizeof(*apRandom), MySortProc);

// final step - offsets for pointers are placed into output array
for(i1 = 0; i1 < 256; i1++)
{
  aResult[i1] = (int)(apRandom[i1] - aRandom);
}

...
The result in 'aResult' should be a randomly sorted (but unique) array of integers with values between 0 and 255, inclusive. Such an array could be useful, for example, as a byte for byte translation table, one that could easily and reliably be reproduced based solely upon a short length key (in this case, the random number generator seed); however, in the spirit of the 'GUTLESS DISCLAIMER' (below), such a table could also have other uses, perhaps as a random character or object positioner for a game program, or as a letter scrambler for an anagram generator.
GUTLESS DISCLAIMER: The sample code above does not in and of itself constitute an encryption algorithm, or necessarily represent a component of one. It is provided solely for the purpose of explaining some of the more obscure concepts discussed in prose within this document. Any other use is neither proscribed nor encouraged by the author of this document, S.F.T. Inc., or any individual or organization that is even remotely connected with this web site.

As a test, I created an application that I can't export outside of the U.S., to test the concept of the encryption algorithm described in prose, above. It has been optimized and modified in a few ways to improve "randomness" when keys are all 1's or all 0's, and to help prevent 'short cycle' repeating sequences from revealing where common character sequences (like white space) might be. I then encrypted the source file for the program itself (written in C++, by the way, as a command line filter app), and put the resulting file HERE (UUENCODE'd version HERE).
(NOTE: if you can actually decrypt this algorithm, you'll have the source file! That is, if you've got a few zillion lifetimes to waste in the effort of cracking it. G'head - I *dare* you to try! )

So, guys and gals, HAVE AT IT! Copy this HTML file, and send to your buds.

Conclusion

Because of the need to ensure that only those eyes intended to view sensitive information can ever see this information, and to ensure that the information arrives un-altered, security systems have often been employed in computer systems for governments, corporations, and even individuals. Encryption schemes can be broken, but making them as hard as possible to break is the job of a good cipher designer. All you can really do is make it very very difficult for the code breaker to decipher your cipher. Still, as long as both source and encrypted data are available, it will always be possible to break your code. It just won't necessarily be easy.

RELATED SITES (many with sample code, most outside U.S. to avoid that idiotic U.S. law)
Bletchley Park encryption info: http://www.bletchleypark.net/
Site is named for the famous 'Bletchley Park' group that cracked Enigma in World War 2
More relevant information may be found here: http://www.bletchleypark.net/crypt/
PGP! http://www.pgpi.com/
GnuPG (open source PGP encryption) http://www.gnupg.org/
Blowfish (no royalty, free source) http://www.schneier.com/blowfish.html
Cyber Knights (new link) http://members.tripod.com/cyberkt/ (old link: http://netnet.net/~merlin/knights/)
Crypto Chamber http://www.jyu.fi/~paasivir/crypt/
SSH Cryptograph A-Z (includes info on SSL and https) http://www.ssh.fi/tech/crypto/
'funet' cryptology FTP (yet another Finland resource) ftp://ftp.funet.fi/pub/crypt/
A GREAT Enigma article, how the code was broken by Polish scientists: http://members.aol.com/nbrass/1enigma.htm
FTP site in UK ftp://sable.ox.ac.uk/pub/crypto/
Australian FTP site ftp://ftp.psy.uq.oz.au/pub/
Replay Associates FTP Archive ftp://utopia.hacktic.nl/pub/replay/pub/crypto/
RSA Data Security (why not include them too!) http://www.rsa.com/
Netscape's whitepaper on SSL http://developer1.netscape.com/docs/manuals/security/sslin/contents.htm
U.S. 'Gummint' Regulations on Export of 'Strong Crypto'... click BOOOOOOOOO! to go there!
NOTE: New U.S. regulations for 'open source' encryption 'export' (via internet) can be found HERE
*NEW ADDITION* - Windows Crypto API 'back door' for N.S.A.? go HERE to find out more...

Back to MY web page
©1998-2004 by R. E. Frazier - all rights reserved
last updated: 1/26/04

The Algorithm

The following block cipher is in the public domain and is/will be 
published in: B.Preneel, ed., Proceedings of the 1994 K.U. Leuven 
Workshop on Cryptographic Algorithms, Lecture Notes in Computer 
Science, Springer-Verlag, 1995.  This posting has been agreed by 
David Wheeler. 

                            * * * * * 


                TEA, A TINY ENCRYPTION ALGORITHM 


By David Wheeler and Roger Needham, Computer Laboratory, 
Cambridge University, England - November 1994. 


Introduction 


We design a short program which will run on most machines and 
encypher safely.  It uses a large number of iterations rather 
than a complicated program.  It is hoped that it can easily be 
translated into most languages in a compatible way.  The first 
program is given below. 


It uses little set up time and does a weak non-linear iteration 
of enough rounds to make it secure.  There are no preset tables 
or long set up times.  It assumes 32 bit words. 


Encode Routine 


The routine, written in the C language, is used for encoding with 
a key, k[0] - k[3].  Data is in v[0] and v[1]. 


void code(long* v, long* k)
{   unsigned long y = v[0],
                  z = v[1],
                  sum = 0,                           /* set up */
                  delta = 0x9e3779b9,
                  n = 32;           /* a key schedule constant */

    while(n-- > 0)                        /* basic cycle start */
    {   sum += delta;
        y += ((z << 4) + k[0]) ^ (z + sum) ^ ((z >> 5) + k[1]);
        z += ((y << 4) + k[2]) ^ (y + sum) ^ ((y >> 5) + k[3]);
    }                                             /* end cycle */

    v[0] = y;
    v[1] = z;
}
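
For illustration only, a minimal test drive of the routine above might look like the following; the block and key values are arbitrary, and since the routine assumes 32-bit words the output will not match published TEA test vectors on machines where 'long' is 64 bits.

#include <stdio.h>

int main(void)
{
    long v[2] = { 0x01234567L, 0x76543210L };   /* one 64-bit block          */
    long k[4] = { 0x1A2B3C4DL, 0x55AA55AAL,     /* 128-bit key, k[0]..k[3]   */
                  0x0F0F0F0FL, 0x12345678L };

    code(v, k);
    printf("ciphertext: %08lx %08lx\n", (unsigned long)v[0], (unsigned long)v[1]);
    return 0;
}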





Basics of the routine 

It is a Feistel type routine, although addition and subtraction 
are used as the reversible operators rather than XOR.  The 
routine relies on the alternate use of XOR and ADD to provide 
nonlinearity.  A dual shift causes all bits of the key and data 
to be mixed repeatedly. 


The number of rounds before a single bit change of the data or 
key has spread very close to 32 is at most six, so that sixteen 
cycles may suffice and we suggest 32. 


The key is set at 128 bits as this is enough to prevent simple 
search techniques being effective. 


The top 5 and bottom four bits are probably slightly weaker than 
the middle bits.  These bits are generated from only two versions 
of z (or y) instead of three, plus the other y or z.  Thus the 
convergence rate to even diffusion is slower.  However the 
shifting evens this out with perhaps a delay of one or two extra 
cycles. 


The key scheduling uses addition, and is applied to the unshifted 
z rather than the other uses of the key.  In some tests k[0] etc. 
were changed by addition, but this version is simpler and seems 
as effective.  The number delta, derived from the golden number 
is used where 


                    delta = (sqrt(5) - 1) * 2^31


A different multiple of delta is used in each round so that no 
bit of the multiple will not change frequently.  We suspect the 
algorithm is not very sensitive to the value of delta and we 
merely need to avoid a bad value.  It will be noted that delta 
turns out to be odd with truncation or nearest rounding, so no 
extra precautions are needed to ensure that all the digits of sum 
change. 


The use of multiplication is an effective mixer, but it needs 
shifts anyway.  It was about twice as slow per cycle on our 
implementation and more complicated. 


The use of a table look up in the cycle was investigated.  There 
is the possibility of a delay ere one entry of the table is used.   
For example if k[z & 3] is used instead of k[0], there is a 
chance one element may not be used of (3/4)^32, and a much higher 
chance that the use is delayed appreciably.  The table also 
needed preparation from the key.  Large tables were thought to be 
undesirable due to the set up time and complication. 


The algorithm will easily translate into assembly code as long as 
the exclusive or is an available operation.  The hardware 
implementation is not difficult, and is of the same order of 
complexity as DES, taking into account the double length key. 


Tests 


A few tests were run to detect when a single change had 
propagated to 32 changes within a small margin.  Also some loop 
tests including a differential loop test to determine loop 
closures. 


A considerable number of small algorithms were tried and the 
selected one is neither the fastest, nor the shortest but is 
thought to be the best compromise for safety, ease of 
implementation, lack of specialised tables, and reasonable 
performance.  On languages which lack shifts and XOR it will be 
difficult to code.  Standard C does make an arithmetic right 
shift and overflows implementation dependent so that the right 
shift is logical and y and z are unsigned. 


Usage 


This type of algorithm can replace DES in software, and is short 
enough to write into almost any program on any computer.   
Although speed is not a strong objective with 32 cycles (64 
rounds) on one implementation it is three times as fast as a good 
software implementation of DES which has 16 rounds. 


The modes of use of DES are all applicable.  The cycle count can 
readily be varied, or even made part of the key.  It is expected 
that security can be enhanced by increasing the number of 
iterations. 


Analysis 


The shifts and XOR cause changes to be propagated left and right, 
and a single change will have propagated the full word in about 4 
iterations.  Measurements showed the diffusion was complete at 
about six iterations. 


There was also a cycle test using up to 34 of the bits to find 
the lengths of the cycles.  A more powerful version found the 
cycle length of the differential function. 


                   d(x) = f(x XOR 2^p) XOR f(x) 


which may test the resistance to some forms of differential 
crypto-analysis. 


Conclusions 


We present a simple algorithm which can be translated into a 
number of different languages and assembly languages very easily.   
It is short enough to be programmed from memory or a copy.  It is 
hoped it is safe because of the number of cycles in the encoding 
and length of key.  It uses a sequence of word operations rather 
than wasting the power of a computer by doing byte or 4 bit 
operations.   


Acknowledgements 


Thanks are due to Mike Roe and other colleagues who helped in 
discussion and tests. 


References 


E. Biham and A. Shamir, Differential Analysis of the Data 
Encryption Standard, Springer-Verlag, 1993 


National Institute of Standards, Data Encryption Standard, 
Federal Information Processing Standards Publication 46.  January 
1977 


B. Schneier, Applied Cryptology, John Wiley & sons, New York 
1994. 


Appendix 


Decode Routine 


void decode(long* v, long* k)
{   unsigned long n = 32,
                  sum,
                  y = v[0],
                  z = v[1],
                  delta = 0x9e3779b9;

    sum = delta << 5;
                                                /* start cycle */
    while(n-- > 0)
    {   z -= ((y << 4) + k[2]) ^ (y + sum) ^ ((y >> 5) + k[3]);
        y -= ((z << 4) + k[0]) ^ (z + sum) ^ ((z >> 5) + k[1]);
        sum -= delta;
    }
                                                  /* end cycle */
    v[0] = y;
    v[1] = z;
}





It can be shortened, or made faster, but we hope this version is 
the simplest to implement or remember. 

A simple improvement is to copy k[0-3] into a,b,c,d before the 
iteration so that the indexing is taken out of the loop.  In one 
implementation it reduced the time by about 1/6th. 
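
A sketch of that variant (the encode routine with the key words copied into locals, otherwise identical to the routine given earlier; the name 'code2' is only for illustration):

void code2(long* v, long* k)
{   unsigned long y = v[0], z = v[1], sum = 0, delta = 0x9e3779b9, n = 32;
    unsigned long a = k[0], b = k[1], c = k[2], d = k[3];   /* indexing lifted out of the loop */

    while(n-- > 0)
    {   sum += delta;
        y += ((z << 4) + a) ^ (z + sum) ^ ((z >> 5) + b);
        z += ((y << 4) + c) ^ (y + sum) ^ ((y >> 5) + d);
    }

    v[0] = y;
    v[1] = z;
}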


It can be implemented as a couple of macros, which would remove 
the calling overheads. 


                            * * * * * 


Please excuse any typos as this has been lifted from a .tex file. 




A QUICK TOUR OF CRYPTOLOGY

Contents

 

Introduction
Useful Terminology 

A Brief History of Cryptography & Cryptanalysis
Early Cryptographic Systems
Cryptography During The Two World Wars
Cryptography In The Modern Age

Cryptography
Single Key Cryptography
Substitution Ciphers
Transposition Ciphers
Product Ciphers
Block Ciphers
Stream Ciphers
Data Encryption Standard (DES)
Public Key Cryptography
The Key Distribution Problem
A Solution To The Key Distribution Problem
Properties For A Two-Key Cryptosystem
Knapsack Problem
Rivest-Shamir-Adleman (RSA) Algorithm
"Real World" Usage Of The RSA Algorithm
Cryptography in the "Real World"  
Applications Of Cryptography
Politics Of Cryptography

Cryptanalysis
Types Of Cryptanalysis
Types Of Cryptanalytic Attacks
Frequency Tables
Cryptanalysis Of Public Key Ciphers
A Triumph of Cryptanalysis - Enigma
What Was Enigma?
Initial Work On Cryptanalysing Enigma
What Made It Possible?
The Turing Bombe

The Future

Endnotes
Sources
Authors Notes


--------------------------------------------------------------------------------

Introduction

The goal of this essay is to introduce the reader, with as little jargon as possible, to the basic theories of cryptography and (to a lesser extent) cryptanalysis. 

The science of cryptology is the science of secure communications, formed from the Greek words kryptós, "hidden", and lógos, "word". 

Useful Terminology

Much of the terminology of cryptography can be linked back to the time when only written messages were being encrypted and the same terminology is still used regardless of whether it is being applied to a written message or a stream of binary code between two computers.

CIPHERTEXT
 The encrypted form of the PLAINTEXT.
 
CODE
 An unvarying rule for replacing a piece of information with another object, not necessarily of the same sort e.g. ASCII.
 
CRYPTANALYSIS
 The science (and art) of recovering information from ciphers without knowledge of the key. 
 
CRYPTOGRAPHY
 The science of the enciphering and deciphering of messages in secret code or cipher. 
 
CRYPTOSYSTEM
 A system for encrypting information.
 
DECRYPTION
 The process of converting the CIPHER back into PLAINTEXT.
 
ENCRYPTION
 The process of converting the PLAINTEXT into a CIPHER.
 
KEY
 The secret information known only to the transmitter and the receiver which is used to secure the PLAINTEXT.
 
MONOALPHABETIC SUBSTITUTION
 A method of encryption where a letter in the plaintext is always replaced by the same letter in the ciphertext.
 
PLAINTEXT
 The source information to be secured.
 
POLYALPHABETIC SUBSTITUTION
 A method of encryption where a letter in the plaintext is not always replaced by the same letter in the ciphertext.
 

 

N.B. Ciphers, as in the case of codes, also replace a piece of information (an element of the plaintext that may consist of a letter or word or string of symbols) with another object. The difference is that the replacement is made according to a rule defined by a secret key known only to the transmitter and legitimate receiver(s) in the expectation that an outsider, ignorant of the key, will not be able to undo the replacement and retrieve the original plaintext.


--------------------------------------------------------------------------------

A Brief History of Cryptography & Cryptanalysis

Early Cryptographic Systems

It seems reasonable to assume that people have tried to conceal information in written form since writing was developed and examples survive in stone inscriptions and papyruses showing that many ancient civilisations including the Egyptians, Hebrews and Assyrians all developed cryptographic systems. The first recorded use of cryptography for correspondence was by the Spartans who (as early as 400 BC) employed a cipher device called a "scytale" to send secret communications between military commanders. The scytale consisted of a tapered baton around which was wrapped a piece of parchment inscribed with the message. Once unwrapped the parchment appeared to contain an incomprehensible set of letters, however when wrapped around another baton of identical size the original text appears.

The Greeks were therefore the inventors of the first transposition cipher and in the fourth century BC the earliest treatise on the subject was written by a Greek, Aeneas Tacticus, as part of a work entitled On the Defence of Fortifications. Another Greek, Polybius later devised a means of encoding letters into pairs of symbols using a device known as the Polybius checkerboard which contains many elements common to later encryption systems. In addition to the Greeks there are similar examples of primitive substitution or transposition ciphers in use by other civilisations including the Romans. 

The Polybius checkerboard consists of a five by five grid containing all the letters of the alphabet. Each letter is converted into two numbers, the first is the row in which the letter can be found and the second is the column. Hence the letter A becomes 11, the letter B 12 and so forth.
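
A small sketch of the checkerboard in C, using one common layout in which I and J share a cell:

#include <stdio.h>
#include <ctype.h>

static const char grid[5][5] = {
    { 'A', 'B', 'C', 'D', 'E' },
    { 'F', 'G', 'H', 'I', 'K' },   /* J is folded into I */
    { 'L', 'M', 'N', 'O', 'P' },
    { 'Q', 'R', 'S', 'T', 'U' },
    { 'V', 'W', 'X', 'Y', 'Z' }
};

void polybius_encode(const char *plaintext)
{
    int row, col;
    for (; *plaintext; plaintext++)
    {
        char ch = (char)toupper((unsigned char)*plaintext);
        if (ch == 'J') ch = 'I';
        for (row = 0; row < 5; row++)
            for (col = 0; col < 5; col++)
                if (grid[row][col] == ch)
                    printf("%d%d ", row + 1, col + 1);   /* A prints as 11, B as 12 */
    }
    printf("\n");
}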

The Arabs were the first people to clearly understand the principles of cryptography and to elucidate the beginning of cryptanalysis. They devised and used both substitution and transposition ciphers and discovered the use of letter frequency distributions in cryptanalysis. As a result of this by approximately 1412 al-Kalka-shandi could include in his encyclopaedia Subh al-a’sha a respectable if elementary treatment of several cryptographic systems. He also gave explicit instructions on how to cryptanalyze ciphertext using letter frequency counts including examples illustrating the technique.

European cryptography dates from the Middle Ages during which it was developed by the Papal and Italian city states. The earliest ciphers involved only vowel substitution (leaving the consonants unchanged). Circa 1379 the first European manual on cryptography, consisting of a compilation of ciphers, was produced by Gabriele de Lavinde of Parma, who served Pope Clement VII. This manual contains a set of keys for correspondents and uses symbols for letters and nulls with several two character code equivalents for words and names. The first brief code vocabularies, called nomenclators, were expanded gradually and for several centuries were the mainstay of diplomatic communications for nearly all European governments. In 1470 Leon Battista Alberti described the first cipher disk in Trattati in cifra, and the Traicté de chiffres, published in 1586 by Blaise de Vigenère, contained a square table commonly attributed to him as well as descriptions of the first plaintext and ciphertext autokey systems.

By 1860 large codes were in common use for diplomatic communications and cipher systems had become a rarity for this application; however, cipher systems prevailed for military communications (except for high-command communications because of the difficulty of protecting codebooks from capture or compromise). During the US Civil War the Federal Army extensively used transposition ciphers. The Confederate Army primarily used the Vigenère cipher and, on occasion, monoalphabetic substitution. While the Union cryptanalysts solved most of the intercepted Confederate ciphers, the Confederacy, in desperation, sometimes published Union ciphers in newspapers, appealing for help from readers in cryptanalysing them.

Cryptography During The Two World Wars

During the first world war both sides employed cipher systems almost exclusively for tactical communications while code systems were still used mainly for high-command and diplomatic communications. Although field cipher systems such as the U.S. Signal Corps cipher disk lacked sophistication some complicated cipher systems were used for high-level communications by the end of the war. The most famous of these was the German ADFGVX fractionation cipher.

In the 1920s the maturing of mechanical and electromechanical technology came together with the needs of telegraphy and radio to bring about a revolution in cryptodevices - the development of rotor cipher machines. The concept of the rotor had been anticipated in the older mechanical cipher disks however it was an American, Edward Hebern, who recognised that by hardwiring a monoalphabetic substitution in the connections from the contacts on one side of an electrical rotor to those on the other side and cascading a collection of such rotors, polyalphabetic substitutions of almost any complexity could be produced. From 1921 and continuing through the next decade, Hebern constructed a series of steadily improving rotor machines that were evaluated by the U.S. Navy. It was undoubtedly this work which led to the United States’ superior position in cryptology during the second world war. At almost the same time as Hebern was inventing the rotor cipher machine in the United States, European engineers such as Hugo Koch (Netherlands) and Arthur Scherbius (Germany) independently discovered the rotor concept and designed the precursors to the most famous cipher machine in history - the German Enigma machine which was used during World War 2. These machines were also the stimulus for the TYPEX, the cipher machine employed by the British during World War 2.

The United States introduced the M-134-C (SIGABA) cipher machine during World War 2. The Japanese cipher machines of World War 2 have an interesting history linking them to both the Hebern and the Enigma machines. After Herbert Yardley, an American cryptographer who organised and directed the U.S. government’s first formal code-breaking efforts during and after the first world war, published The American Black Chamber in which he outlined details of the American successes in cryptanalysing the Japanese ciphers, the Japanese government set out to develop the best cryptomachines possible. With this in mind, it purchased the rotor machines of Hebern and the commercial Enigmas, as well as several other contemporary machines, for study. In 1930 Japan's first rotor machine, code-named RED by U.S. cryptanalysts, was put into service by the Japanese Foreign Office. However, drawing on experience gained from cryptanalysing the ciphers produced by the Hebern rotor machines, the U.S. Army Signal Intelligence Service team of cryptanalysts succeeded in cryptanalysing the RED ciphers. In 1939 the Japanese introduced a new cipher machine, code-named PURPLE by U.S. cryptanalysts, in which the rotors were replaced by telephone stepping switches. 

The greatest triumphs of cryptanalysis occurred during the second world war - the Polish and British cracking of the Enigma ciphers and the American cryptanalysis of the Japanese RED, ORANGE and PURPLE ciphers. These developments played a major role in the Allies’ conduct of World War 2. 

Cryptography In The Modern Age

After World War 2 the electronics that had been developed in support of radar were adapted to cryptomachines. The first electrical cryptomachines were little more than rotor machines where the rotors had been replaced by electronic substitutions. The only advantage of these electronic rotor machines was their speed of operation and they inherited the inherent weaknesses of the mechanical rotor machines. 

There is little information available regarding the secret cipher machines of the 1960s and it is likely that this subject will remain shrouded in rumour until the relevant information is de-classified.

The era of computers and electronics has meant an unprecedented freedom for cipher designers to use elaborate designs which would be far too prone to error if handled by pencil and paper, or far too expensive to implement in the form of an electromechanical cipher machine. The main thrust of development has been in block ciphers, beginning with the LUCIFER project at IBM, a direct ancestor of DES (Data Encryption Standard). 


--------------------------------------------------------------------------------

Cryptography

Cryptographic systems are generally grouped according to three facts about them:

The mathematical operation that changes the plaintext into the ciphertext using the encryption key. 
Whether a block or a stream cipher is produced. 
The type of key system used - single or two key. 
Single Key Cryptography

Substitution Ciphers

A substitution cipher is one in which the units of the plaintext (usually letters or numbers) are replaced with other symbols or groups of symbols. The actual order of the units of the plaintext is not changed. 

The simplest substitution cipher is one where the alphabet of the cipher is merely a shift of the plaintext alphabet, for example, A might be encrypted as B, C as D and so forth. Of this type of cipher, the most well known is the Caesar cipher, used by Julius Caesar in which A becomes D etc. It is easy to guess that cyclical-shift substitution ciphers of this sort are not at all secure because individual letter frequencies are left completely intact. 
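
A shift substitution of this kind is only a few lines of C (shift = 3 gives the Caesar cipher, so A becomes D):

#include <ctype.h>

void shift_encrypt(char *text, int shift)
{
    for (; *text; text++)
    {
        if (isupper((unsigned char)*text))
            *text = (char)('A' + (*text - 'A' + shift) % 26);
        else if (islower((unsigned char)*text))
            *text = (char)('a' + (*text - 'a' + shift) % 26);
    }
}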

There are primarily two approaches that have been used with substitution ciphers to reduce the extent to which the structure of the plaintext, including the letter frequencies, survives into the ciphertext. One of these methods is to treat more than a single letter as one element i.e. two or three successive letters are treated as one unit. The other method is to use several different cipher alphabets. 

Historically, for encrypting elements of a plaintext made up of more than a single letter only digraphs (two successive letters) have ever been used. By treating two successive letters as a single unit the extent to which the original letter frequency distribution survives is reduced, thus making the job of the cryptanalysts harder, but not impossible since it can be shown that digraphs themselves have a high degree of correlation. The most well known digraph substitution cipher is the Playfair cipher, invented by Sir Charles Wheatstone.


 Charles Wheatstone was a 19th century English physicist, born on February 6th, 1802. As well as devising the Playfair cipher he also invented the Wheatstone bridge, a device for accurately measuring electrical resistance which became widely used in laboratories. He also initiated the usage of electromagnets in electric generators and devised the stereoscope, a device for viewing pictures in three dimensions still used today. The Playfair cipher was named for Lyon Playfair, the first Baron Playfair of St. Andrews, who championed its usage at the British Foreign Office (although he was unsuccessful). 
 

 


 Here is an example of a Playfair cipher. The aid used to carry out the encryption is a 5 x 5 square matrix similar to a Polybius checkerboard in that it contains all the letters of the alphabet (I and J are treated as the same letter); however a keyword is placed first and then the remaining letters are placed in alphabetical order.
 

If the plaintext contains an odd number of letters then an X is appended to the last word to make it an even number. Also, if any of the digraphs consist of identical letters e.g. SUMMER, then an extra letter is placed between them. 

The first step in performing the encryption is to locate the two letters from the plaintext in the matrix. There are then several different substitution rules depending on their positioning:

If the pair of letters are in different rows and columns. The rows of the ciphertext letters are kept the same as the rows of the plaintext letters, however the columns swap. Therefore ME, once encrypted, becomes SC because E changes to the letter which is in the same row (2) but in the column of M (1) and M changes to the letter which is in the same row (1) but the column of E (3). It may be easier to remember this as the plaintext letters being at two corners of a rectangle and the ciphertext letters being at the other two corners. 
If the pair of letters are in the same row. The ciphertext letters are the letters to the right of the plaintext letters. For example, T and A are in the same row so T will encrypt to S and A will encrypt to B, forming SB. 
If the pair of letters are in the same column. The ciphertext letters are the letters below the plaintext letters. For example, Y and L are in the same column so Y becomes A and L becomes R, forming AR. 
Thus, we can now encrypt the phrase "Merchant Taylors’ School":

Plaintext:  ME RC HA NT TA YL OR SZ SC HO OL
Ciphertext: SC OF LM BI AB AR PU BX ME OV RH

(The last S of "TAYLORS" is paired with a Z to separate it from the first S of "SCHOOL").

The other approach to concealing plaintext structure in the ciphertext involves using several different substitution ciphers. The resulting ciphers, which are generically known as polyalphabetics, have a long history of usage. The best-known polyalphabetic ciphers are the simple Vigenère ciphers which are named after the 16th century French cryptographer Blaise de Vigenère (see History). Blaise de Vigenère actually produced a more sophisticated autokey cipher, but through an accident of history his name has become attached to this weaker cipher. 

For many years this cipher was thought to be impregnable and it is rumoured that a well known scientific magazine pronounced it "uncrackable" as late as 1917, despite the fact that it had been broken by then.


 In the simplest system of the Vigenère type the key is a word or a phrase which is repeated over and over again. The plaintext is encrypted using the table shown as Figure 4. The ciphertext letter is found at the intersection of the column headed by the plaintext letter and the row indexed by the key letter. To decrypt the plaintext letter is found at the head of the column determined by the intersection of the diagonal containing the cipher letter and the row containing the key letter.
 

It is the periodicity of the repeating key which leads to the weaknesses in this method and its vulnerabilities to cryptanalysis. This periodicity of a repeating key can be eliminated by the use of a running-key Vigenère cipher, produced when a non-repeating key is used. However, even though running-key ciphers eliminate periodicity it is still possible to cryptanalyse them by means of several methods, however the job of the cryptanalyst is made much harder and a cryptanalyst would require a much larger segment of ciphertext to solve a running-key cipher than one with a repeating key. In fact, the work of Major Joseph Mauborgne of the U.S. Army eventually led to the realisation that the only cryptographic system that is totally secure is that with a one-time completely random key.

Here is how the words "Merchant Taylors School" can be encrypted using this cipher:

Plaintext:  M E R C H A N T   T A Y L O R S   S C H O O L
Key:        D O N T S T A N   D A L O N E D   O N T S T A
Ciphertext: P S E V Z T N G   W A J Z B V V   G P A G H L
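The same repeating-key rule can be expressed in a few lines of C; run on the letters MERCHANTTAYLORSSCHOOL with the key DONTSTANDALONE it reproduces the ciphertext above.

#include <string.h>
#include <ctype.h>

void vigenere_encrypt(char *text, const char *key)
{
    size_t klen = strlen(key), ki = 0;
    for (; *text; text++)
    {
        if (isalpha((unsigned char)*text))
        {
            int p = toupper((unsigned char)*text) - 'A';
            int k = toupper((unsigned char)key[ki % klen]) - 'A';
            *text = (char)('A' + (p + k) % 26);   /* shift by the key letter */
            ki++;                                 /* advance key on letters only */
        }
    }
}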

Transposition Ciphers

Transposition ciphers rearrange the letters of the plaintext without changing the letters themselves. For example, a very simple transposition cipher is the rail fence, in which the plaintext is staggered between two rows and then read off to give the ciphertext. In a two row rail fence the message MERCHANT TAYLORS’ SCHOOL becomes:

M   R   H   N   T   Y   O   S   C   O   L
  E   C   A   T   A   L   R   S   H   O

Which is read out as: MRHNTYOSCOLECATALRSHO. 

The rail fence is the simplest example of a class of transposition ciphers called route ciphers. These were quite popular in the early history of cryptography. Generally, in route ciphers the elements of the plaintext (usually in this case single letters) are written on a pre-arranged route into a matrix agreed upon by the transmitter and receiver. The example above has a two row by n-column matrix in which the plaintext is entered sequentially by columns, the encryption route is therefore to read the top row and then the lower.

Obviously, to even approach an acceptable level of security, the route would have to be much more complicated than the one in this example. One form of transposition that has enjoyed widespread use relies on identifying the route by means of an easily remembered keyword. This can be done in several ways. One way, as in this example, is to define the order in which each column is written depending on the alphabetical position of each letter of the keyword relative to the other letters.

Using the keyword CIPHER, a matrix can be written out like the one below:

C  I  P  H  E  R
1  4  5  3  2  6
M  E  R  C  H  A
N  T  T  A  Y  L
O  R  S  S  C  H
O  O  L  Z  Z  Z

Unlike the previous example the plaintext has been written into the columns from left to right as normal, and the ciphertext will be formed by reading down the columns. The order in which the columns are written to form the ciphertext is determined by the key.

This matrix therefore yields the ciphertext: MNOOHYCZCASZETRORTSLALHZ. 

The first column is first because C is the earliest in the alphabet, followed by the second to last column because E is the next in the alphabet.
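
The keyword-ordered read-out can be sketched in C as follows; called with the padded plaintext MERCHANTTAYLORSSCHOOLZZZ and the keyword CIPHER it prints the ciphertext above.

#include <stdio.h>
#include <string.h>

void columnar_encrypt(const char *plaintext, const char *keyword)
{
    size_t cols = strlen(keyword), len = strlen(plaintext);
    size_t c, j, r, pos;

    for (r = 0; r < cols; r++)                 /* emit columns in keyword order */
    {
        for (c = 0; c < cols; c++)
        {
            size_t rank = 0;                   /* alphabetical rank of column c */
            for (j = 0; j < cols; j++)
                if (keyword[j] < keyword[c] ||
                    (keyword[j] == keyword[c] && j < c))
                    rank++;
            if (rank != r)
                continue;
            for (pos = c; pos < len; pos += cols)   /* read column c downward */
                putchar(plaintext[pos]);
        }
    }
    putchar('\n');
}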

The security of this method of encryption can be significantly improved by re-encrypting the resulting cipher using another transposition. Because the product of the two transpositions is also a transposition, the effect of multiple transpositions is to define a complex route through the matrix which would not by itself be easy to define with a simply remembered mnemonic. 

When decrypting a route cipher, the receiver simply enters the ciphertext into the agreed-upon matrix according to the encryption route and then reads out the plaintext.

In modern cryptography transposition cipher systems serve mainly as one of several methods used as a step in forming a product cipher.

Product Ciphers

In the days of manual cryptography, i.e. without the aid of a computer, product ciphers were a useful device for the cryptographer, and double transposition ciphers on keyword-based matrices were, in fact, widely used. There was also some use of a particular class of product ciphers called fractionation systems. In a fractionation system a substitution is first made from symbols in the plaintext to multiple symbols (usually pairs, in which case the cipher is called a biliteral cipher) in the ciphertext, which is then superencrypted by a transposition.


 One of the most famous field ciphers ever was a fractionation system - the ADFGVX cipher which was employed by the German Army during the first world war. This system was so named because it used a 6 × 6 matrix to substitution-encrypt the 26 letters of the alphabet and 10 digits into pairs of the symbols A, D, F, G, V and X. The resulting biliteral cipher is only an intermediate cipher; it is then written into a rectangular matrix and transposed to produce the final cipher, which is the one which would be transmitted.
 

Here is an example of enciphering the phrase "Merchant Taylors" with this cipher using the key word "Subject".

 
    A  D  F  G  V  X
A   S  U  B  J  E  C
D   T  A  D  F  G  H
F   I  K  L  M  N  O
G   P  Q  R  V  W  X
V   Y  Z  0  1  2  3
X   4  5  6  7  8  9

 

Plaintext:  M  E  R  C  H  A  N  T     T  A  Y  L  O  R  S
Ciphertext: FG AV GF AX DX DD FV DA    DA DD VA FF FX GF AA

This intermediate ciphertext can then be put in a transposition matrix based on a different key.

C  I  P  H  E  R
1  4  5  3  2  6
F  G  A  V  G  F
A  X  D  X  D  D
F  V  D  A  D  A
D  D  V  A  F  F
F  X  G  F  A  A

The final cipher is therefore: FAFDFGDDFAVXAAFGXVDXADDVGFDAFA.

Block Ciphers

Generally, ciphers transform pieces of plaintext of a fixed size into ciphertext. In older, manual systems, these pieces were usually single letters or characters (or sometimes, as in the Playfair cipher, digraphs), since these were the largest units that could be easily encrypted or decrypted by hand. Although systems which operated on sets of three characters and other, larger groups of numbers, were proposed and understood to potentially be more secure they were never implemented because of the extra difficulty in the manual encryption or decryption process. In modern, single key cryptography however, the units of information can be much larger. 

A block cipher is a type of symmetric-key encryption algorithm that changes a fixed-length block of the plaintext into the same length of ciphertext. The encryption works by means of a key. Decryption is simply the reverse of the encryption process using the same secret key. The fixed length is called the block size and for modern block ciphers is usually 64 bits. As processors become more sophisticated, however, it is likely that this block size will increase to 128 bits.

Since different plaintext blocks are mapped to different ciphertext blocks, a block cipher effectively provides a permutation of the set of all possible messages. The actual permutation produced during any particular operation is of course secret, and determined by the key.

An iterated block cipher encrypts a plaintext block using a process with several stages (rounds). At each stage the same process (known as a round function) is applied to the data using a subkey (the set of subkeys usually being derived from a user-provided key). The number of rounds in an iterated block cipher depends on the desired security level of the encrypted ciphertext and the trade-off that must be made with performance; fairly obviously an iterated block cipher with a large number of rounds will require more processing time. It is worth noting that in some cases the number of rounds required to provide an adequate level of security will be too large for the cipher to be practical.

An example of an iterated block cipher is a Feistel cipher. Feistel ciphers are a special class of iterated block ciphers. In this type of cipher the ciphertext is calculated from the repeated application of the same round function.

Stream Ciphers

A stream cipher also breaks the plaintext into units, this time it is normally a single character. It then encrypts the nth unit of the plaintext with the nth unit of the key stream. Stream ciphers can be designed to be exceptionally fast, much faster than any block cipher. While the encryption of any particular plaintext with a block cipher will result in the same ciphertext when the same key is used; with a stream cipher, the transformation of the smaller plaintext units will vary, depending on when they are encountered during the encryption process. 

A stream cipher generates what is known as a keystream - a sequence of bits, which is used as a key. The encryption process involves combining the keystream with the plaintext. 
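
For a binary keystream the combining step is normally a simple XOR; running the same operation again with the same keystream recovers the plaintext.

#include <stddef.h>

void stream_combine(unsigned char *data, const unsigned char *keystream,
                    size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        data[i] ^= keystream[i];   /* encrypt and decrypt are the same step */
}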

The keystream can be generated in two ways:

Independent of the plaintext and ciphertext (this yields what is known as a synchronous stream cipher). 
Depending on the data and its encryption (in which case the stream cipher is said to be self-synchronizing). 
The majority of stream cipher designs are for synchronous stream ciphers. 

Interest in stream ciphers is currently attributed to the appealing properties of the one-time pad. A one-time pad, which is sometimes called the Vernam cipher, uses a keystream which is the same length as the plaintext message and consists of a series of bits generated completely at random. Theoretically this should produce ciphertext which is the most secure possible, because since the keystream is random even a cryptanalyst with infinite computational resources can still only guess at the underlying plaintext. While the one-time pad has occasionally seen use in wartime for ultra secret transmissions the fact that the key is as long as the message introduces severe practical problems and so, while theoretically perfectly secure, the one-time pad is generally impractical. Stream ciphers were developed as an approximation to the one-time pad.

At this time there is no de facto standard for stream ciphers although the most widely used stream cipher is RC4, a stream cipher designed by Rivest for RSA Data Security Inc. It is a variable key-size stream cipher with an algorithm based on the use of a random permutation.

Interestingly, certain modes of operation of a block cipher transform it into a keystream generator, so in this way any block cipher can be used as a stream cipher. Stream ciphers with a dedicated design are typically much faster, however.

One method for generating a keystream is a Linear Feedback Shift Register (LFSR). This is a mechanism for generating a sequence of binary bits. LFSRs are easy to implement and fast in both hardware and software; however, a single LFSR is not secure, because over the years a mathematical framework has been developed which allows their output to be analysed. This problem can be solved by using a shift register cascade - a set of LFSRs connected together so that the behaviour of one depends on another. The detailed operation of LFSRs and shift register cascades is beyond the scope of this essay.
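
For illustration, a small Fibonacci-style LFSR can be sketched in a few lines of Python. The register width, tap positions and seed below are arbitrary choices for the sketch, and, as noted above, a single LFSR of this kind should never be relied on by itself.

    def lfsr_bits(seed, taps, width):
        # Fibonacci LFSR: output the low bit, then shift right and feed the
        # XOR of the tapped bit positions back in as the new high bit.
        state = seed
        while True:
            yield state & 1
            feedback = 0
            for t in taps:
                feedback ^= (state >> t) & 1
            state = (state >> 1) | (feedback << (width - 1))

    keystream = lfsr_bits(seed=0xACE1, taps=(0, 2, 3, 5), width=16)
    first_bits = [next(keystream) for _ in range(16)]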

Data Encryption Standard (DES)

Originally developed by IBM under the name LUCIFER, DES took its final form with substantial input from the American NSA (National Security Agency - the US equivalent of GCHQ) and the National Institute of Standards and Technology (NIST). DES is the best known and most widely used symmetric algorithm in the world. NIST re-certified DES every five years; it was last certified in 1993, and NIST has indicated that it will not re-certify DES again. AES (the Advanced Encryption Standard) is expected to replace it.

DES has a 64-bit block size and uses a 56-bit key during encryption. DES is a 16-round Feistel cipher and was originally designed for implementation in hardware. Because it is a single-key cryptosystem, when used for communication both sender and receiver must know the same secret key, which is used both to encrypt and to decrypt the message. DES can also be used by a single user, for example to store files on a hard disk securely.
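
For reference, using DES from a program looks roughly like this in Python, assuming the third-party PyCryptodome package is installed; the key, message and ECB mode are illustrative only (ECB is chosen purely to keep the sketch short).

    from Crypto.Cipher import DES
    from Crypto.Util.Padding import pad, unpad

    key = b"8bytekey"                       # DES keys are 8 bytes (56 effective bits)
    cipher = DES.new(key, DES.MODE_ECB)     # 64-bit block cipher
    ciphertext = cipher.encrypt(pad(b"a secret message", DES.block_size))
    plaintext = unpad(DES.new(key, DES.MODE_ECB).decrypt(ciphertext), DES.block_size)
    assert plaintext == b"a secret message"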

No easy attack on DES has yet been discovered, despite research efforts over many years. There is no feasible way to "break" DES other than an exhaustive search - a process which takes 2^55 steps on average. However, cryptanalysis methods which rely on knowledge of some of the plaintext have had some success. The consensus of the cryptographic community is that, if it is not already insecure, DES soon will be. As a result, as of November 1998 DES is no longer used by the U.S. Government.


--------------------------------------------------------------------------------

Public Key Cryptography

Up to this point all the examples have assumed that the encryption process is undertaken with the same key as the decryption process, and that only the sender and the receiver possess the secret key. However, this can cause problems.

The Key Distribution Problem

A major problem in the practical use of single-key cryptography is the key distribution problem. This problem basically occurs because both the sender and receiver must hold a copy of the key, but they must also prevent others from gaining a copy of the key.

On the face of it this does not appear to be a problem, but it can be, as is best illustrated by an example.

Suppose that two individuals, O and L, wish to exchange information securely but can not guarantee the security of the transmission itself. They would probably use some sort of encryption to ensure that even if the message was intercepted its contents would remain secret. In order for this to operate they would both have to know a secret key which could be used to encrypt the data. However, since the transmission medium is not secure they would have to meet in person to decide upon the key. Once this was done, however, they could exchange information happily, encrypted with this secret key, in the knowledge that to anyone without the key it would simply look like garbage. O and L must protect the security of the key in their possession since if a cryptanalyst, C, obtained it, he could read, alter or fake a message from either O or L. 

This appears to work perfectly, but a problem soon arises. Suppose that O or L then wanted to correspond with another individual, T. If they were to give T the existing key they would further compromise it, because C would now have another source from which to obtain it. To ensure that messages cannot be read or altered by a third party, both O and L would have to create separate keys to exchange with T, so that each of them now holds two keys (otherwise O could alter T's message to L, and so forth).

Now consider a system with 1,000 members, all of whom wish to communicate in secret with each other. In this case, each individual would need to hold a key for every individual besides himself, in other words 999 keys for other people. Each individual would also have to protect those 999 keys from being compromised.

Using these facts it is possible to calculate the number of keys present in a system with any number of members, as I have done below: the number of keys required is the number of members multiplied by the number of members minus one, all divided by two.

Keys = [n × (n - 1)] / 2

Members     Members squared     Keys required
2           4                   1
3           9                   3
4           16                  6
5           25                  10
6           36                  15
7           49                  21
8           64                  28
9           81                  36
10          100                 45
100         10000               4950
1000        1000000             499500

The product n × (n - 1) is divided by two because each key is shared between a pair of members (the key A uses to send to B is the same key B uses to send to A). The reason why I have included a column containing the number of members squared becomes obvious below.
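
The table above can be reproduced with a couple of lines of Python:

    def keys_required(members):
        # Each pair of members shares one key: n(n-1)/2 pairs in total.
        return members * (members - 1) // 2

    for n in (2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000):
        print(n, n * n, keys_required(n))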


[Graph: the number of members squared plotted against the number of keys required; the final two rows of the table are excluded to keep a practical scale.]

As you can see the number of different keys required is nearly proportional to the square of the number of members in the system. This is the key distribution problem. 

A Solution To The Key Distribution Problem

A solution to the key distribution problem can be found in public key, or two-key, cryptography. Public Key cryptography is based on the idea that a user can possess two keys - one public and one private key. The public key can only be used to encrypt the data to be sent and the private key can only be used to decrypt it. Although single-key cryptography has been in use for centuries, public key cryptography is a relatively new invention with the first discussion about the subject in open literature being in 1976.

The fact that anyone can use a single "locking" key to encrypt a message which they are confident can still only be read by a single authorised user means that the number of keys required can be greatly reduced. For example, in the 1000 member system above the number of keys necessary could be reduced from 499,500 to 2000 (1000 public and 1000 private keys).

As an example I will use a far simpler three-user system. An individual, say P, could distribute his public key to two other individuals, A and C. A could then send a message to P by encrypting it with the public key. When P receives the message he can decrypt and read it using his private key. If C were a nosy individual he might attempt to spy on the communications between A and P; however, because he only possesses the encrypting key, he cannot decrypt the message.

There is a problem with this solution, however. In this example, although C cannot read or alter a message sent to P by A, he could easily fake one, because C holds the same public key as A. With a public key system, therefore, the ability to authenticate messages has been given up in return for privacy. In many cases this will not be a problem, but there are times when it will be.

There is an alternative to this method. A could send P and C his or her private (unlocking) key and keep the public key secret. A could then encrypt messages with his public key and send them to either P or C. This system has the advantage that the person receiving the message knows that it must have come from A; however, both P and C can now decrypt the message. In this example secrecy has been sacrificed in order to maintain an ability to authenticate the message.

This problem with the public key system can only be solved by increasing the number of keys in the system. It can be solved, however, by combining the two methods outlined above: if each user holds two pairs of public and private keys and distributes one key from each pair, then both the ability to authenticate messages and the ability to keep them secret are maintained.

For example, another three individuals P, D and W are using this system. Each user holds, in addition to their normal private (unlocking) key and everyone else’s public keys, a public key which is paired with a private key which they then distribute. P could then encrypt a message with his own public key, thus authenticating it, and then encrypt it with the public key of whoever the message was intended for (thus ensuring secrecy). 

Under this system each message is encrypted twice, once in a way which only the intended receiver can decrypt it, and once in which only the authentic sender could have encrypted it. Even though the number of keys required has been increased it still does not approach the number of keys which would be required for a single-key system of the same size. 

It is this idea of nested encryption processes with public and private keys which makes public-key cryptography so attractive.

Properties For A Two-Key Cryptosystem

For two-key cryptography to be possible a cryptosystem must have the following properties:

It must be easy for the cryptographer to calculate a pair of keys (private and public) but virtually impossible for a cryptanalyst to recover either key, regardless of how much ciphertext is available. 
The encryption and decryption operations should be easy (computationally speaking) for legitimate users to carry out. 
At least one of the keys must be virtually impossible for the cryptanalyst to recover even when he knows the other key and many matching plaintext and ciphertext pairs. 
At present no cryptosystem ever devised has satisfied all of these conditions in a provable way. However, cryptographers have devised cryptosystems of this sort by starting with a difficult mathematical problem, such as factoring the product of two large prime numbers, and attempting to make the cryptanalysis of the scheme equivalent to solving that problem. If this can be done, the scheme is at least as secure as the underlying problem is hard to solve.

There are a number of problems that are commonly used as the basis for public key cryptosystems.

Knapsack Problem


Ralph Merkle (Xerox) and Martin Hellman (Stanford University) put forward one of the best-known proposals for a public-key cryptosystem in 1976. In 1997 the six co-founders of public key cryptography were awarded the Paris Kanellakis Theory and Practice Award, named after a researcher who died in 1995, and Hellman and Merkle were among those who received it.

They suggested using the knapsack, or subset-sum, problem as the basis for a public key cryptosystem. This problem entails determining whether a number can be expressed as a sum of some subset of a given sequence of numbers and, more importantly, which subset has the desired sum.

Given a sequence of numbers A, where A = (a1 ... an), and a number C, the knapsack problem is to find a subset of a1 ... an which sums to C.

For a specific example:

n = 5, C = 14, A = (1, 10, 5, 22, 3)

Solution = 14 = 1 + 10 + 3

In general, all the possible sums of all subsets can be expressed by:

m1·a1 + m2·a2 + m3·a3 + ... + mn·an, where each mi is either 0 or 1.

The solution is therefore a binary vector M = (1, 1, 0, 0, 1).

There are 2^n such vectors in total (in this example 2^5 = 32).

Obviously not all values of C can be formed from the sum of a subset, and some can be formed in more than one way. For example, when A = (14, 28, 56, 82, 90, 132, 197, 284, 341, 455, 515) the figure 515 can be formed in three different ways, but the number 516 cannot be formed at all.

The only known solution to the general knapsack problem (as above) is a brute force test - testing all the possible subsets until either the sum is found or you run out of subsets. 

However, there is a special case with an easy solution: when A is sorted and each term is larger than the sum of all the terms that precede it (such a sequence is described as superincreasing, and the problem is then called the simple knapsack problem). The value of C is compared to each value of A, starting with the largest. If the value from A is larger than C then that number is not in the subset; if it is smaller than or equal to C then it is in the subset, and the next value is compared to C less the sum of all the values already found.

For example:

A = (1, 3, 5, 10, 22), C = 14

14 < 22, so m5 = 0

14 ≥ 10, so m4 = 1

4 < 5, so m3 = 0

4 ≥ 3, so m2 = 1

1 ≥ 1, so m1 = 1

M = (1, 1, 0, 1, 0)
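
The greedy procedure can be sketched in a few lines of Python (the function name is my own; the sequence and target are those of the worked example above):

    def solve_superincreasing(a, c):
        # Greedy solution to the simple knapsack: work from the largest term
        # down, taking each term that still fits into what remains of c.
        m, remainder = [0] * len(a), c
        for i in reversed(range(len(a))):
            if a[i] <= remainder:
                m[i], remainder = 1, remainder - a[i]
        return m if remainder == 0 else None    # None: c cannot be formed

    assert solve_superincreasing([1, 3, 5, 10, 22], 14) == [1, 1, 0, 1, 0]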

The cryptographer uses a secret transformation to turn a simple knapsack into a hard one. Legitimate users, knowing the secret transformation, can easily invert the hard knapsack back into the simple one. The transformation uses a private multiplier and a private modulus: each value in the original sequence is multiplied by the multiplier and then reduced modulo the modulus (that is, the remainder after division is kept), which has the effect of scrambling the values. The plaintext is converted into a string of binary bits, each bit is multiplied by the corresponding value of the public sequence, and the results are summed. Decrypting the result is equivalent to solving the hard knapsack problem for anyone who does not know the secret information.

For example:

Taking the following private information:

A = (1, 3, 5, 10), b = 20, k = 7 (b is the private modulus, k is the private multiplier)

∴ the public sequence = (7 × 1 mod 20, 7 × 3 mod 20, 7 × 5 mod 20, 7 × 10 mod 20)

Public key = (7, 1, 15, 10)

To encrypt the plaintext we must convert it into a binary number. For this simple example I will convert the letter to a value describing its position in the alphabet.

Let plaintext P = m (= 13) = 1101

Assigning each of these values a "size" from the public key and multiplying produces the following code, C.

C = 1 × 7 + 1 × 1 + 0 × 15 + 1 × 10 = 18 (= r)

The code letter is therefore r.

Decrypting this code letter without the secret transformation would be computationally infeasible, so to decrypt it we perform the inverse operation on the letter r (18).

To decrypt, use the inverse key k^-1.

Let k^-1 = 3 (any value for which k × k^-1 mod 20 = 1)

Let C = 18

C' = 3 × 18 mod 20 = 14

Now we can use the value of 14 to solve the simple knapsack problem.

(1, 3, 5, 10)

14 ≥ 10, so m4 = 1

4 < 5, so m3 = 0

4 ≥ 3, so m2 = 1

1 ≥ 1, so m1 = 1

∴ M = (1, 1, 0, 1), which corresponds to binary 1101 = 13 = m
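
The whole toy example can be reproduced in Python, repeating the greedy solver sketched earlier; the parameters are exactly those used above and are, of course, far too small to offer any security.

    def solve_superincreasing(a, c):
        # Same greedy solver as in the earlier sketch.
        m, remainder = [0] * len(a), c
        for i in reversed(range(len(a))):
            if a[i] <= remainder:
                m[i], remainder = 1, remainder - a[i]
        return m

    private = [1, 3, 5, 10]                     # superincreasing private sequence
    b, k = 20, 7                                # private modulus and multiplier
    public = [(k * a) % b for a in private]     # gives [7, 1, 15, 10]

    bits = [1, 1, 0, 1]                         # the letter 'm' = 13 = 1101 in binary
    c = sum(bit * p for bit, p in zip(bits, public))
    assert c == 18                              # the code letter 'r'

    k_inv = pow(k, -1, b)                       # modular inverse of k (Python 3.8+), here 3
    c_easy = (k_inv * c) % b                    # 3 * 18 mod 20 = 14
    assert solve_superincreasing(private, c_easy) == bits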

Unfortunately, it has been shown that the private key can be derived from the public key in a reasonable length of time, so knapsack-based cryptosystems are not secure. Although of little practical use, they remain an excellent introduction to the idea of public key encryption.

Rivest-Shamir-Adleman (RSA) Algorithm

The main contenders for two-key cryptosystems in the mid-1980s were those based on factoring large integers. Of these, the best known is the RSA algorithm, developed by Ron Rivest, Adi Shamir, and Leonard Adleman in 1977. 

In this system a user chooses a pair of prime numbers so large that factoring the product is beyond all computing capabilities. This is possible because testing for primes is easy whereas factoring the product is very difficult. 

In practice, RSA works as follows. The user takes two large primes, p and q, and computes their product n = pq. n is called the modulus. The user then chooses a number, e, less than n and with no common factors (except 1) with (p-1)(q-1). Another number, d, is then found, subject to the condition that (ed-1) is divisible by (p-1)(q-1). e is called the public exponent, d is called the private exponent. The public key is therefore the pair n and e; the private key is the pair n and d. The factors of n, p and q, can either be kept secret with the private key or destroyed.

It is currently virtually impossible to obtain the private key, d, from the public key (n and e). However, if n could easily be factored into p and q then a cryptanalyst could obtain the private key, d. The security of RSA is therefore based upon the assumption that factoring is difficult.

Here are the stages in sending a message using this method:

The receiver, M, distributes his public key pair. 
The sender, F, composes a plaintext message, m, and then uses M's public key to encrypt the message and form the ciphertext, c. c is the remainder left when m is raised to the power of e and divided by the modulus n. 
c = m^e mod n (where e and n are M's public key pair).

F sends the ciphertext, c, to M. 
The receiver, M, decrypts the ciphertext and retrieves the plaintext message, m. m is the remainder obtained when c is raised to the power of d and divided by n. 
m = c^d mod n

As you can see, this process requires d, which only M knows. Another person, I, who intercepts the message cannot decrypt it.
For example:

Let p = 5, q = 11, n = pq = 55

The least common multiple of (p-1) and (q-1) is 20 = 2^2 × 5.

Therefore, in this case, any key, e, not divisible by 2 or 5 will have a matching key, d (e must be relatively prime to (p-1)(q-1) for it to be a valid key).

Let e = 7

(ed - 1) mod 20 = 0  ∴ d = 3

Let the plaintext message, m = b = 2

∴ The ciphertext, c = m^e mod n = 2^7 mod 55 = 18

To decrypt this information it is necessary to know d.

m' = c^d mod n = 18^3 mod 55 = 2  ∴ the message has been successfully decrypted.
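
The same worked example can be checked in a few lines of Python (with the toy primes used above; real RSA moduli are hundreds of digits long):

    p, q = 5, 11
    n = p * q                       # modulus, 55
    e, d = 7, 3                     # public and private exponents: e*d = 21 = 1 (mod 20)
    m = 2                           # the plaintext letter 'b'

    c = pow(m, e, n)                # encryption: m^e mod n
    assert c == 18
    assert pow(c, d, n) == m        # decryption: c^d mod n recovers the plaintext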

This example used small primes, so the product, n, is not at all difficult to factor to retrieve the original primes. In using RSA it was traditionally suggested to use "strong" primes, which have certain properties that make their product especially difficult to factor by particular factoring methods. However, advances in factoring techniques over the last decade, such as the elliptic curve factoring algorithm, have almost completely negated the advantage of strong primes. What matters, therefore, is not choosing traditionally "strong" primes but simply choosing large primes.

In 1997, a specific assessment of the security of 512-bit RSA keys showed that one could be factored for less than $1,000,000 in cost and eight months of effort. It is therefore believed that 512-bit keys provide insufficient security for anything other than short-term needs. RSA Laboratories currently recommends key sizes of 768 bits for personal use, 1024 bits for corporate use, and 2048 bits for extremely valuable keys such as the root-key pair used by a certifying authority. Security can be increased by changing a user's keys regularly, and it is typical for a user's key to expire after two years (the opportunity to change keys also allows a longer key to be chosen).

It should be noted that the key sizes for RSA (and other public-key techniques) are much larger than those for block ciphers like DES, but the security of an RSA key cannot be compared to the security of a key in another system purely in terms of length. It should also be noted that although increasing the length of the modulus may increase the security of the system it also significantly increases the amount of time required to perform encryption, decryption and authentication - doubling the modulus will, on average, quadruple the amount of time taken for encryption and increase the time taken for decryption by a factor of eight.

"Real World" Usage Of The RSA Algorithm

DES, and other block ciphers, are much faster in operation than RSA. Typically, in software, DES is approximately 100 times faster than RSA and in hardware can be between 1000 and 10000 times faster depending on the implementation. Because of the length of time taken for public key encryption a public key system such as RSA is often used in conjunction with a secret-key system such as DES. In this case the message is encrypted with a randomly chosen DES key and the key itself is encrypted and sent with RSA. This method combines the low number of keys required for RSA with the high-speed operation of DES.
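
A sketch of this hybrid approach in Python, again assuming the third-party PyCryptodome package; the message and key sizes are illustrative, and only the short DES session key is passed through the slower RSA operation.

    from Crypto.PublicKey import RSA
    from Crypto.Cipher import DES, PKCS1_OAEP
    from Crypto.Util.Padding import pad
    from Crypto.Random import get_random_bytes

    receiver_key = RSA.generate(2048)           # receiver's RSA key pair
    session_key = get_random_bytes(8)           # random DES key for this message

    # The bulk of the data goes through the fast block cipher...
    body = DES.new(session_key, DES.MODE_ECB).encrypt(pad(b"the actual message", 8))
    # ...and only the short session key is encrypted with the slow public-key cipher.
    wrapped_key = PKCS1_OAEP.new(receiver_key.publickey()).encrypt(session_key)

    # The receiver unwraps the session key with the private key, then decrypts the body.
    assert PKCS1_OAEP.new(receiver_key).decrypt(wrapped_key) == session_key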

RSA is the most widely used public-key cryptosystem available currently and has often been referred to as a de-facto standard regardless of official recognition. In fact, RSA is part of many official standards worldwide. These include ISO 9796 which lists RSA as a compatible cryptographic algorithm and many internet standards and proposals including S/MIME. 

Cryptography in the "Real World"

Applications Of Cryptography

In the information dependent world in which we now live cryptography can be found all around us, often in places where you would not expect it. When people think about encryption they tend to think about vast computer banks processing military and diplomatic communications, or a world war two rotor cipher machine slowly deciphering an order. In reality, cryptography - although obviously essential for the military and diplomatic services - has many commercial uses and applications. From protecting confidential company information, to protecting a telephone call, to allowing someone to order a product on the Internet without the fear of their credit card number being intercepted and used against them, cryptography is all about increasing the level of privacy of individuals and groups. For example, cryptography is often used to prevent forgers from counterfeiting winning lottery tickets. Each lottery ticket can have two numbers printed onto it, one plaintext and one the corresponding cipher. Unless the counterfeiter has cryptanalysed the lottery’s cryptosystem he or she will not be able to print an acceptable forgery.

In a world where virtually all data of any importance is held on a computer system the necessity of cryptography cannot be disputed. 

Politics Of Cryptography

Widespread use of cryptosystems is something most governments are not particularly happy about - precisely because it threatens to give more privacy to the individual, including criminals. For many years, police forces have been able to tap phone lines and intercept mail, however, in an encrypted future that may become impossible. 

This has led to some pretty strange decisions on the part of governments, particularly the United States government. In the United States, cryptography is classed as a munition and the export of programs containing cryptosystems is tightly controlled. In 1992, the Software Publishers Association reached agreement with the State Department to allow the export of software that contained RSA's RC2 and RC4 encryption algorithms, but only if the key size was limited to 40 bits, as opposed to the 128-bit keys available for use within the US. This significantly reduced the level of privacy provided. In 1997 the limit was increased to 56 bits. The US government has proposed several methods whereby it would allow the export of stronger encryption, all based on a system in which the US government could gain access to the keys if necessary - the Clipper chip, for example.

The resolution of this issue is regarded to be one of the most important for the future of e-commerce.


--------------------------------------------------------------------------------

Cryptanalysis

It is beyond the scope of this essay to deal with all types of cryptanalysis, so I will give a brief overview and an example of where cryptanalysis has been successful. Unlike cryptography, which is a clearly defined science, cryptanalysis is as much an art as it is a science. Success in cryptanalysing a cipher is a flash of inspiration almost as often as it is the result of applying cryptanalytic techniques alone.

Types of Cryptanalysis

There are several distinct types of cryptanalytic attack. The type used depends on the type of cipher and how much information the cryptanalyst has.

Types Of Cryptanalytic Attacks

A standard cryptanalytic attack is to determine the key which maps a known plaintext to a known ciphertext. This plaintext can be known because it is standard or because it is guessed. If the plaintext segment is guessed it is unlikely that its exact position is known; however, a message is generally short enough for a cryptanalyst to try all possible positions in parallel. In some systems a known ciphertext-plaintext pair will compromise the entire system; a strong encryption algorithm, however, will be unbreakable under this type of attack.

A brute force attack requires a large amount of computing power and a large amount of time to run. It consists of trying all possibilities in a logical manner until the correct one is found. For the majority of encryption algorithms a brute force attack is impractical due to the large number of possibilities.

Another type of brute force attack is a dictionary attack. This essentially involves running through a dictionary of words in the hope that the key (or the plaintext) is one of them. This type of attack is often used to determine passwords since people usually use easy to remember words.
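
A dictionary attack is easy to sketch. In the toy Python example below the "ciphertext" is a hashed password, and the attack simply hashes candidate words until one matches; the word list and target are invented for the illustration.

    import hashlib

    def dictionary_attack(target_hash, wordlist):
        # Hash each candidate word and compare it with the captured hash.
        for word in wordlist:
            if hashlib.sha256(word.encode()).hexdigest() == target_hash:
                return word
        return None

    target = hashlib.sha256(b"swordfish").hexdigest()
    assert dictionary_attack(target, ["letmein", "password", "swordfish"]) == "swordfish"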

In a ciphertext only attack the cryptanalyst has only the encoded message from which to determine the plaintext, with no knowledge whatsoever of the actual message. A ciphertext only attack is presumed to be possible, if not easy. In fact, an encryption technique's resistance to a ciphertext only attack is considered the basis for its cryptographic security.

In a chosen plaintext attack the cryptanalyst has the capability to find the ciphertext corresponding to an arbitrary plaintext message of his or her own choosing. This type of attack is rarely possible in practice. Codes which can survive it are considered to be very secure.

In a chosen ciphertext attack the cryptanalyst can choose an arbitrary ciphertext and find the corresponding decrypted plaintext. This attack can be used in public key systems, where it may reveal the private key.

In an adaptive chosen plaintext attack the cryptanalyst can determine the ciphertext of chosen plaintexts in an iterative process based on previous results. This is the general name for a method of attacking product ciphers called "differential cryptanalysis".

Frequency Tables

The cryptanalysis of single-key cryptosystems depends on one simple fact - that some traces of the original structure of the plaintext may be visible in the ciphertext. For example, in a monoalphabetic substitution cipher where each letter in the plaintext is replaced by a letter in the ciphertext which is the same each time, a simple analysis of a sizeable portion of ciphertext can be used to retrieve most of the plaintext.

Here is a monoalphabetic substitution cipher of a random paragraph of English:

UFMDHQAQTMGRG BX GRAZTW PWM 

UFMDHBGMGHWOG VWDWAVG BA BAW 

GRODTW XQUH AQOWTM HCQH HFQUWG 

BX GHFIUHIFW BF DQHHWFA RA HCW 

DTQRAHWLH OQM GIFJRJW WAUFMDHRBA 

QAV SW VRGUWFARSTW RA HCW 

URDCWFHWLH. 


W occurs 20 times in the cipher and H occurs 16 times. From this information, and using a standard table of English letter frequencies, it would be possible for a cryptanalyst to recover the majority of the plaintext.
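
Counting letter frequencies is a one-liner in Python; matching the most common ciphertext letters against a standard English frequency ordering (E, T, A, O, ...) then suggests the first substitutions to try. The ciphertext below is the one given above.

    from collections import Counter

    ciphertext = ("UFMDHQAQTMGRG BX GRAZTW PWM UFMDHBGMGHWOG VWDWAVG BA BAW "
                  "GRODTW XQUH AQOWTM HCQH HFQUWG BX GHFIUHIFW BF DQHHWFA RA "
                  "HCW DTQRAHWLH OQM GIFJRJW WAUFMDHRBA QAV SW VRGUWFARSTW "
                  "RA HCW URDCWFHWLH")

    counts = Counter(c for c in ciphertext if c.isalpha())
    print(counts.most_common(5))    # W and H come out on top, suggesting E and T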

The usefulness of analysing the structure of the ciphertext can be reduced by any encryption procedure that attempts to disguise the structure. However, eliminating the underlying structure is harder than it would first seem. Digraphs, for example, show a strong frequency distribution - TH occurs very often, about 20 times more often than HT and so on. With tables of digraph frequencies it is possible to recover the underlying plaintext, however, the amount of ciphertext required would be much greater.

In the heyday of manual cryptanalysis huge volumes of word patterns were compiled. Although these only contained the most obvious and easily recognised word patterns they were still of importance if they could provide that vital clue which could break the entire cipher.
 

Cryptanalysis Of Public Key Ciphers

Public key cryptography requires a fundamentally different type of cryptanalysis than is used for single key cryptanalysis. Because public key cryptography relies on "hard" mathematical problems, their cryptanalysis is essentially research into solving the underlying mathematical problems. Cryptanalysis of public key ciphers is therefore virtually indistinguishable from research into any other area of mathematics.

For example, to "crack" the RSA algorithm and obtain the private key from the public key would essentially involve research into factoring algorithms. Factoring is a very active field of research among mathematicians and computer scientists. The best general-purpose factoring algorithm today is the Number Field Sieve but even so, the running time to search through all the possibilities is very long. 

In 1980 it was possible to factor a 50-digit number with 1,000,000,000,000 computer operations. By 1984 factoring algorithms had advanced to the stage where a 75-digit number could be factored in the same number of operations. If a mathematical advance was made which enabled a 150 (or more) digit number to be factored in the same number of operations it would allow cryptanalysts to break many RSA implementations. 

A Triumph of Cryptanalysis - Enigma

No look at cryptanalysis, however brief, with even the slightest element of historical content, could finish without at least briefly covering the tale of how the Enigma was broken.

What Was Enigma?

The efficiency of the German armed forces during World War 2 would not have been possible without radio communication. Messages sent over the radio had to be encrypted, and the encryption system the Germans used was adapted from one which was commercially available before the war. Despite the fact that it was modified so as to make it harder to crack, it was broken by a group working at GCHQ.


The Enigma machine consisted of a 26-letter keyboard for input. The output was read off 26 lamps, each of which corresponded to a letter. The encipherment was performed by a device called a "scrambler", made of three rotating wheels on a common spindle, and a plugboard known as a "Steckerboard" that added an additional level of security.

The keyboard and lampboard were both connected to a common cabling bus. When no keys were being held down all the lamps were connected to the bus and the current present on any particular wire would cause its lamp to glow. When a key was pressed a current was placed on its wire but the corresponding lamp was disconnected from the bus.

Since no letter could ever be enciphered to itself this was a good set-up. When a key was pressed one wire on the bus would have current applied to it and the other 25 could respond with the enciphered letter. The one that actually did respond was the result of the other components.

The original, commercial Enigma did not have the plugboard. The 26-wire bus was connected to a circular set of contacts that sat to the right-hand side of the three-rotor scrambler. These three rotors were identical except in their internal wiring. The rotors had two sets of 26 connectors, one on the left and one on the right. All three rotors had to be rotated by hand, and by doing so the "key" could be changed. Because each rotor could be set independently there were 26 × 26 × 26 (17,576) possible rotor settings. This alone would not have provided a secure ciphering system, so every time a key was pressed one of the rotors (known as the "fast" rotor) was advanced one position, so that even if the same key was pressed three times the result would be three different characters. When the fast rotor reached a certain position it would cause the middle rotor to rotate by one position, which in turn could cause the slow rotor to move one position.

Initial Work On Cryptanalysing Enigma

The initial work on cryptanalysing Enigma was done by the Poles. The earliest Polish work on intercepting German machine ciphers had begun in 1928, right after the system's introduction by the German Army. However, no progress was made during the first four years. The Polish Cipher Bureau then decided to recruit three young mathematicians, among them Marian Rejewski. On September 1, 1932 Rejewski and his two somewhat younger colleagues began work at the Cipher Bureau in Warsaw. In early October Rejewski began work on the Enigma and was supplied with an obsolete commercial Enigma machine which had been bought in Germany.

The whole, complicated process of mastering the German Enigma was ultimately concluded in the early days of 1933, using a combination of mathematics, statistics, computational ability and guesswork. The breaking of Enigma involved two distinct processes. The first was the theoretical reconstruction of the Enigma cipher device itself, including the internal wiring and the interdependence between different components of the machine; this knowledge of the machine enabled the Poles to build doubles of the Enigma. The second was the creation of methods for recovering the Enigma keys (starting positions) based exclusively on intercepts.

At the beginning of the Second World War, realising that a German invasion was imminent, the Poles handed all the information, including the duplicate Enigma models, over to the British.

What Made It Possible?

With the level of sophistication of the Enigma machines it should have been unbreakable. However, the Germans had a number of procedural flaws which allowed the British and Polish to break the cipher. 

Some of these were just stupid. For example, the Germans reused the monthly code book settings. Others were less obvious. It was discovered that the German military messages were very similar in nature and, once they were decrypted, very careful records were kept regarding how the messages tended to be formatted. When the British thought they had a likely idea of what the start of the message was, they would use this to reduce the many trillions of possible Enigma settings to a small enough number to be tested individually.

The Turing Bombe


The design of the Turing Bombe is normally credited to two British mathematicians - Alan Turing and Gordon Welchman - working in secret at Bletchley Park. The information handed over to the British by the Poles was much more advanced than anything the British (or anyone else) had accomplished. One of the things the Poles had developed was a machine they called a Bomba (the Polish word for bomb, probably so named because of the ticking sound it made). The Polish machine and the British one worked on different algorithms but were mechanically very similar, and this provided Turing with a starting point.

The Bombe was effectively a collection of Enigma machines that were all working together. According to Welchman there were twelve sets of Enigma rotors in the Bombe, but there is also other evidence that larger Bombes existed. None of the Bombes survived the war so it is difficult to be sure. 

The theory behind the Bombe was based on the loops that exist between a crib (a guessed sequence of letters) and the equivalent ciphertext. The method Turing devised to test this was based upon the mathematical technique of reductio ad absurdum. For every one of the possible rotor settings Turing would start by assuming that it was the correct setting and then try to disprove this. If the procedure did disprove the setting, the machine would automatically advance to the next setting to be tested. If it did not, the setting was recorded, and the procedure then continued to try to find other possible settings.


--------------------------------------------------------------------------------

The Future

Although Moore’s Law may cease to hold at some point in the near future, it is still inevitable that the power of computer processors will increase with each passing year.


Within a few decades computers many times more powerful than those available today will be in use. These more powerful processors will allow more complicated encryption algorithms to be run within a reasonable time span, which will inevitably lead to steadily more powerful encryption systems.

The same computing power which makes these computers useful to cryptographers will make them useful to cryptanalysts. Machines of this kind are already in use at the Government Communications Headquarters in Cheltenham, and with computers hundreds of times more powerful it will become plausible to decrypt many ciphers by a simple brute force attack where it had not been so before. This will probably spell the end of existing cryptosystems; however, other, more complicated ones will be developed in their place.

We have already seen this happen. In November 1998 the US Government discontinued its use of the DES system because what was considered secure at the algorithm's conception is no longer so. A variant of DES called triple-DES will be used until AES (the Advanced Encryption Standard) is ready.

It also seems likely that better factoring algorithms will be developed in the future. Coupled with increasing computing power this may mean the end of RSA, or simply that people will be using larger and larger primes to stay ahead of the cryptanalysts. 

There is an inherent danger in attempting to make any definite statements about the future; more often than not they turn out to be hilariously inaccurate. In my opinion, however, the balance between cryptographer and cryptanalyst will be roughly maintained, although the actual volume of encrypted traffic is likely to increase as individuals use cryptography to protect their privacy. This is of course the great fear of law enforcement agencies, who have for many years enjoyed the ability to tap phone lines. Attempts to set up some sort of key escrow system, such as the one the United States government attempted with the Clipper chip, may or may not be successful. Whatever happens, however, cryptology promises to remain an area of interesting developments for a long time yet.


--------------------------------------------------------------------------------

Endnotes

Sources 

CDs

Encyclopedia Britannica CD 99

Web Sites

Forensic Files

A Cryptography Compendium 

Beginners Guide to Cryptography

RSA Laboratories

Ralph Merkle’s Homepage 

German Enigma Cipher Machine

Enigma and the Turing Bombe

Sci.Crypt FAQ

Author's Notes

This list represents all the sources which I actually used information from in producing this essay. This is not a complete list of all the information sources I consulted.
