Signs of Snake Oil

There are several general cues one can pick up on that a particular encryption algorithm is weak, or at least that the developers of an algorithm are not satisfactorily competent. We hope that the following comments will help the reader determine whether vendors are quite knowledgeable, or are just talking jive.

 

Use of Key Size

A person may tell you that their encryption algorithm has however many bits for a key, or if they really want to sound impressive, they may say that there are however many millions of  possible keys a system allows. Though the numbers may seem strikingly large, remember that machines can be built to try billions of keys per second.  To get a better idea of what a good key size is, have a look at our theory section.

Often, export laws limit the key spaces to 1.1x1012 (for 40-bit keys), or possibly 1.8x1019 (for 64-bit keys). The latter is considered border-line safe for most uses, and the former is simply too easy to break. Systems using 128-bit keys are quickly becoming the standard, and there is currently a large push from the cryptography community to allow larger key sizes to be exported. [Pearson97]

There is a flip-side to having large keys. Bruce Schneier [Schneier99] brings up an interesting examples of how some companies use ridiculously huge keys (such as 4096-bit keys for conventional block cyphers) to ensure that their system is "unbreakable". Such designs are usually about hype on the product. Schneier believes that huge keys may suggest that the developers do not truly understand how key spaces work, or that there are other important factors in cryptography. Even if such systems resist statistical types of attacks, they would likely be unnecessarily complex and slow. Perhaps the extra key bits are not worth it.

 

Use of Passphrases and Non-random Keys

Passphrases are the same concept as passwords, except that they are used to refer cryptographic applications rather than user logins, like email, etc. The idea of a passphrase is that an encryption key is generated from some user input. For example, say we generate keys by asking the user to type in a phrase (a string of ASCII characters). Most users would use something that is easy to remember, such as an address, a phone number, a favourite furry critter,  or something similar. As most people are probably aware, this type of passphrase is relatively easy for attackers to figure out, and all kinds of dictionary attacks have been developed for these systems. So system administrators start revising their rules for choosing passphrases. They tell their users to sprinkle some numbers and special characters into the passphrase, in order to make it more random. Still, there will be some pattern that people will use when choosing their phrase, if only to make it easier to remember. Finally, we could define a hash function that maps the user's input into an obscure block of binary. This doesn't make the key more random, but only makes key selection more complicated. There are still keys that will have a higher probability of being used. As such, an attacker can try the most likely ones first, and will usually be able to find the correct one faster. The moral of the story is that keys should be completely randomized. For statistics people, this means that the probability distribution for all keys in the key space is uniform; that is, all keys are equally likely to be used. They should not depend on human input, system clocks, or some other predictable structure. Any such system may be more convenient for users, but can drastically diminish the overall security of the system, particularly against dictionary attacks.

We have prepared a small web script that allows you to input your passwords, and we will tell you if they are highly vulnerable to the above mentioned attacks. Give it a try, and see if your passwords are weak.

As a special case to the above, encryption software should never get its keys from the system that it is being used on. For example, various programs derive keys from hard-drive serial numbers, ethernet addresses, installation dates, and various other system-specific sources. Although these keys may be unique and rather random, they are far from secure. For one, they do not allow the user to change keys (at least not very conveniently). The other larger security flaw is that an attacker could figure out how the key is derived. This is no stronger than an encryption algorithm that always uses the exact same key. Essentially, once attackers knows the algorithm, they can break it all the time, with negligible effort.

 

System Environment and Implementation

One concern is that an encryption key (as with passwords and other secret information) could be written to disk at some point. If the particular encryption software is badly designed, it may store such information in a temporary (or possibly permanent) file on a disk drive. As such, an attacker may be able to recover the information, even if the file is deleted from the file system. Even if the encryption program does not make such silly mistakes, the operating system may store the program's working memory on the disk (Windows does this with its virtual memory file, and UNIX keeps such data on a swap partition). This, too, gives an attacker the chance to scan the disk for any secret information. In the latter case, there is little that a software developer can do to prevent such memory usage. Various creative techniques, such as Steganographic data hiding have been used to attenuate this kind of security breach. 

 

Secret Encryption Algorithms

A particular software developer may say that their algorithm is secret, and that is why it is so successful. The cryptography community frowns profusely upon such concealment, as "good" encryption does not depend on the secrecy of the algorithm for strength. Even if the algorithm is not published, attackers and analyzers can figure out if it works, and if there are any weaknesses in the algorithm. DeCSS was just one example of this. If the algorithm has been published, and cryptographic researchers have not found any practical attacks on the system, then there are two possibilities: either the algorithm is truly strong, or we just haven't found a weakness yet. In either case, it is generally safe for the public at large to use. After all, if experts in cryptography can't find a weakness, then it is unlikely that a good cracker will either. Without such vigorous testing, no one really knows how safe a particular algorithm is. This argument also applies to algorithms revealed under a "non-disclosure" agreement. Since patents can be taken out on a particular algorithm, there is no real reason to hide it from the public. If a developer is afraid to release the details of an algorithm, then it probably has not been thoroughly tested. 

Related to this, a developer may be using a well-known algorithm, but their implementation may be flawed. An example of this is the score of reduced version of DES that use only four or eight rounds (rather than the intended sixteen) for the sake of "faster encryption". Statistical attacks have been devised for these reduced versions that trivially break them, though the full version is still considered acceptable for civilian use. In situations like these, if the developers had released their particular implementation for public testing, they would have quickly realized its flaws.

 

Technobabble

Matt Curtin [Curtin98] coins the phrase "technobabble" in his FAQ about snake oil. His term refers to any confusing or under-explained jargon, often newly invented or trademarked to mount up hype about an algorithm. Such vocabulary may even confuse an expert in the field. Curtin states that such wordiness may be over-compensation for lack of expertise on the part of the developer.

Other instances of technobabble are sited by Schneier [Schneier99], in which companies use invalid or irrelevant mathematical proofs, and pass them off as verification that their system is secure. He even notes a system that was developed by IBM that was genuinely proven to be unbreakable by adaptive chosen plaintext attacks, but had some serious flaws against other attacks. Such proofs could easily fool the untrained eye.

 

Aggregate Cyphers

Encrypting something multiple times may seem like it creates a stronger system, but only in certain circumstances. For instance, lets say that a company has devised a system that encrypts data with the Blowfish algorithm, and then encrypts that cyphertext again with IDEA. Such a system may have advantages over statistical types of attack, but they are not necessarily stronger against brute-force. For more details, see our theory section .

 

Revolutionary Breakthroughs

Encryption techniques evolve slowly, for a good reason: when a new concept emerges, professionals generally take several years to accept or reject it. Lots of testing and research has to go on before the true properties of an algorithm are known. Often, new concepts first appear in research literature, and are not-so-new by the time they are accepted. Again, be wary of developers that modify well-known encryption algorithms and claim that their version is stronger than the original. Without proper fore-thought, such modifications have been known to weaken the algorithm.

Curtin's work [Curtin98] also mentions various "new paradigms of computing", such as cellular automata, neural nets, genetic algorithms, etc. These terms have become buzzwords for the software industry at large, but do not necessarily apply to cryptography. Just because a person is an expert in one area of computer science (or math, or engineering for that matter) does not imply that they have the knowledge to create new "paradigms" in cryptography [Schneier99]. But, even if we suppose that such nuances had occurred, they would be new techniques in the realm of encryption and would need time for sufficient testing. 

 

Useless Acclamation

Often, developers may state that their encryption product has been reviewed by expert cryptographers. This is good, but one or two people giving their "thumbs up" is not enough. If the algorithm hasn't appeared in a handful of research papers, then there is no saying how strong it is. Also, as Curtin [Curtin98] points out, celebrity crackers are not necessarily cryptographic experts.

Similarly, algorithms may be advertised as "military-grade" systems, or have other such designations. There are no current standards of what is a "strong" or 
"military-grade" encryption algorithm. The ISO used to regulate various well-known algorithms, but has ceased to do so over the last 6 years [Lubbe97].  Find out what cryptographers have to say about a system, not what their advertisements claim.

 

Unbreakable Codes

All conventional codes are breakable, though good ones are impractical to break. There is the notion of a one-time-pad, which is unbreakable if used properly, but such a system is not very practical for day-to-day use. A one-time-pad needs a truly random number generator (such as radioactive decay, etc.) to be unbreakable. Since this kind of randomness doesn't exist in standard computer hardware, developers often substitute with pseudo-random generating algorithms, which greatly diminishes the system's security (see our theory section for more details). Also, with one-time-pads, it is important to know who is supplying the key stream to be used. If it is the developer, or some third party company, how do you know if the same keys are being sent to other clients, or if they are being sold to your competitors?

If an algorithm is said to be "unbreakable", it may be so within some narrow view of what attackers are capable of. For instance, improper implementation of one-time-pads are preposterously vulnerable to known-plaintext attacks, but can still be very strong against brute-force attacks. Every algorithm will have its vulnerabilities, and they should be considered in terms of the environment in which it will be used. 

 

Crappy Competitors

A developer of an algorithm may mention that other rival algorithms are flawed and point out numerous attacks that have been developed for it. Take these with a grain of salt. Some of the attacks may be theoretical, and still not practical. For example, the DES algorithm requires 255 (3.6x1016) trials to break it with a brute-force attack, but a known-plaintext attack has been developed by Matshiu [Stallings99] which breaks DES in an average of 242 (4.4x1012) trials. In other words, it is about 8,000 time faster than a brute-force attack. At first, this seems to be a major flaw in the DES algorithm, but consider that, in order to execute the attack, an attacker has to collect, on average, 4 trillion plaintext-cyphertext pairs. This translates to 64 terabytes of data. If the encryption software simply switches keys every gigabyte or so, then an attacker will never get enough data to determine the key that is currently in use, at least through this method of attack. 

 

Key Recovery

Because of encryption export laws, and various other reasons, a particular piece of software may have a key recovery mechanism that allows a select group of people (often law enforcement or security personnel) to decode encrypted data in case of some emergency. Similarly, the software may implement key escrow, which stores a portion of the key to make recovery easier. The questions to ask are who will be able to recover the key, and how much can you trust them? Also, if a developer claims that they can recover your key with escrow, then it is likely that attackers can do so as well. 



Back to the top.