DIY crypto

Encryption is an ancient practice. People often want to get information to one set of people while keeping it from others. There are usually many channels of communication with different levels of security. People transmit small amounts of info over a secure channel in order to set up later communications over less secure channels. Even public key cryptography is normally used this way: the public key protects the exchange of a short secret session key, and that key sets up the rest of the communications link.

Large authoritarian organizations routinely want to broadcast communications to subordinates secretly. The cost of that secrecy rises as a function of the amount of time the info needs to be kept secret, and the number of people involved. We might call this institutional cryptography. It is a subject that has been extensively analyzed.

Now that the general public has been made aware of the transparency of digital communications, there would seem to be more of a need for private cryptography. The technical challenge of DIY crypto should be much less than that of institutional cryptography. Private individuals would probably need systems that cover smaller groups for shorter periods of time, and they have no need to maintain institutional uniformity, so they could maintain a greater level of secrecy by using a lot of different encryption methods.

With all that in mind, I have written a small Python program that would be pretty robust under modification and would also be pretty difficult to decipher. At least that is my hope.

The program uses a deterministic pseudo-random sequence to generate a large translation table. In this case the table has 126,099 entries, but there is no reason it couldn't be raised to a million entries if stronger encryption is desired. All the entries in the sequence need to be unique, and that is assured by use of the Python set() data structure. A programmer with just a little knowledge could substitute any number of variations of the seq(cnt) function. Some might be harder to defeat than others. There are a lot of pseudo-random sequence generators available, but it would also be possible to get data from any number of other sources. You could put a picture of your dog on your website, and use some subset of the pixel data for the sequence.
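To make the idea of a unique, deterministic sequence concrete, here is a minimal sketch. It is not the seq(cnt) function from the program; the names unique_codes, seed, and code_space are made up for illustration, and Python's random.Random is only a stand-in for whatever generator you prefer.

```python
import random

def unique_codes(count, seed, code_space=10**6):
    """Return `count` distinct integers in [0, code_space), reproducible from `seed`.

    Hypothetical helper for illustration only. random.Random (Mersenne Twister)
    is a stand-in generator; it was not designed for cryptographic use.
    """
    if count > code_space:
        raise ValueError("cannot draw more unique codes than the code space holds")
    rng = random.Random(seed)   # same seed gives the same sequence on both ends
    seen = set()                # the set rejects repeated values
    codes = []                  # the list keeps the order of first appearance
    while len(codes) < count:
        value = rng.randrange(code_space)
        if value not in seen:
            seen.add(value)
            codes.append(value)
    return codes

if __name__ == "__main__":
    table = unique_codes(126099, "my shared secret")
    print(len(table), len(set(table)))
```

Sender and receiver only need to agree on the seed and the function; the table itself can be rebuilt in memory whenever it is needed.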

People might also want to modify the dst list. It is a list of the symbols to be encrypted and the frequencies with which they will be used. I got the letter frequencies for general English from the web, and I added some arbitrary counts for the digits, periods, and spaces. If you made a histogram of the letter use in a message, and used that for this dst list, every code would be used about equally often, and you could achieve almost perfectly unbreakable encryption. Since you wouldn't need to send messages at all if you knew in advance what you would need to say, a good guess at what your letter use will be is the only practical option. For English speakers and general usage, this table should be sufficient for most needs.
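A small sketch of how a frequency list like dst could be turned into encode and decode tables. The DST values, the function names, and the code range here are assumptions chosen to keep the example short; they are not the table or the functions from the program.

```python
import random
import secrets

# Toy stand-in for the dst list: (symbol, number of codes). Illustrative values only.
DST = [(" ", 25), ("e", 13), ("t", 9), ("a", 8), ("z", 1)]

def build_tables(dst, seed):
    """Give each symbol its own block of distinct codes, reproducibly from `seed`."""
    total = sum(n for _, n in dst)
    rng = random.Random(seed)
    codes = rng.sample(range(10 * total), total)   # distinct code numbers
    encode_table, decode_table = {}, {}
    pos = 0
    for symbol, n in dst:
        block = codes[pos:pos + n]
        encode_table[symbol] = block               # symbol -> all of its codes
        for code in block:
            decode_table[code] = symbol            # code -> the one symbol it means
        pos += n
    return encode_table, decode_table

def encode(text, encode_table):
    # Any of a symbol's codes will do, so pick one at random for each character.
    return [secrets.choice(encode_table[ch]) for ch in text]

def decode(codes, decode_table):
    return "".join(decode_table[code] for code in codes)

if __name__ == "__main__":
    enc, dec = build_tables(DST, "my shared secret")
    message = "eat a tea"
    print(encode(message, enc))
    print(encode(message, enc))   # usually differs from the line above
    print(decode(encode(message, enc), dec))
```

Because the code for each character is picked at random, encoding the same message twice usually gives different output, while decoding depends only on the tables rebuilt from the seed.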

Below is the output of two consecutive runs of the program. First the program prints the text to be encoded, then it prints the encoded version, then it prints the output of the decode. Notice that the encoded version of the data is different for the two runs, but the decode process works for both.

Large translation tables are built in memory from a very compact source; they are never saved to disk and don't need to be transmitted as part of the message. Because there are 25,000 codes for space and 12,702 codes for e, but only 74 codes for z, a codebreaker would get very little information from running a histogram analysis on the encoded data. And unlike public key methods, there is no information given on how the data was encoded, so the codebreaker has no built-in target for his efforts. There is no public key for him to factor. He is left with analyzing contextual clues. All this will not keep some codes from being broken, but it will up the cost of wholesale spying dramatically. People who have nothing to encrypt but would like to annoy the NSA could attach random data to their emails, so that the NSA might waste time trying to decode random noise.
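The histogram point can be checked with a rough simulation. This is a sketch under simplifying assumptions: the WEIGHTS values are illustrative, the codes are plain consecutive integers rather than a scrambled sequence, and the test text is random characters drawn with those weights instead of real English.

```python
import random
from collections import Counter

# Illustrative code counts per symbol (assumed values, not the real dst list).
WEIGHTS = {" ": 25, "e": 13, "t": 9, "a": 8, "z": 1}

def flatness_demo(n_chars=56000, seed=1):
    """Return (plaintext histogram, code histogram) for a simulated message."""
    rng = random.Random(seed)
    symbols = list(WEIGHTS)
    # Simulated text whose symbol frequencies match the code counts.
    text = rng.choices(symbols, weights=[WEIGHTS[s] for s in symbols], k=n_chars)
    # One block of consecutive code numbers per symbol, one code per unit of weight.
    table, next_code = {}, 0
    for s in symbols:
        table[s] = list(range(next_code, next_code + WEIGHTS[s]))
        next_code += WEIGHTS[s]
    cipher = [rng.choice(table[ch]) for ch in text]
    return Counter(text), Counter(cipher)

if __name__ == "__main__":
    plain_hist, code_hist = flatness_demo()
    print("plaintext max/min:", max(plain_hist.values()) / min(plain_hist.values()))
    print("code max/min:     ", max(code_hist.values()) / min(code_hist.values()))
```

When the code counts match the actual symbol frequencies, the spread between the most and least used codes is far smaller than the spread between the most and least used plaintext symbols.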

Here is the code:

In the interest of seeing how easily this system could be broken, I changed the keys and the clear text message, and below is the encoding. It is a famous saying. Can anyone tell me what it is, or what keys I used?