Department of Computer Science
Helsinki University of Technology
Janne.Frosen@hut.fi
2.11.1995
Some general terms used in cryptography
Basic Terminology
Plaintext or Cleartext | The original data. This may be a text message, file or a stream of communication |
Encryption | Encoding a message so that hides it hides the contents from outsiders |
Ciphertext | The encrypted message |
Decryption | Retrieving the plaintext from ciphertext |
Cipher | A method of encryption and decryption |
Key | A key is usually used in encrypting, decrypting is only possible by knowing the key |
Cryptography | Art or science of encoding data and keeping the message secret |
Cryptoanalysis | Art of breaking ciphertext without knowing the key |
Cryptology | Branch of mathemathics that studies cryptographic methods |
Some US instances linked with cryptology
NSA | National Security Agency | NIST | National Institute of Standards and Technology |
Cryptography is formally the art of encoding data in a way that
only the intended recipient can decode it, and know that the message
is authentic and unchanged.
Cryptography means different things to different people. Small
children play with simple ciphers and substitution secret languages,
bigger children play with crypto puzzles. Some people are concerned
with privacy for various reasons and use different methods to encrypt
sensitive data, with standard unix "crypt" or "rot-13" variants. None
of these have anything to do with strong encryption and real data
security.
Strong encryption is not a technical standard, it means your
encryption cannot be broken by current known methods within feasible
time without the data being outdated. Strong encryption can be used to
protect your sensitive data against organized crime, government and
multinational corporations - all instances with virtually unlimited
resources. That has been a cause to recent tries to outlaw strong
encryption.
Strong encryption brings many possible applications in daily
life. Different applications that require privacy, trust and access
control should all use strong encryption methods when
possible. Applications include things like electronic money [AUK95]
[SAA95], secure communications [PES95] [PUL95] [HEI95], passwords, and
many others. It is in people's own interest that different
legal/medical/personal data about their person stays confidential to
the instances that have a permit to collect the databases (finnish
Tietoturvalaki).
Some cryptograhic methods rely on the secrecy of the algorithms used
in the cipher (security by obscurity). These ciphers are only of
historical interest and are not adequate for a real-world
situation.
All modern algorithms use a key to control the encryption and
decryption. The message can only be decrypted if the key matches the
one it was encrypted with. The key used for decryption can be
different from the key used in encryption, and this divides the
algorithms in symmetric (or secret-key) and asymmetric
(or public-key) classes.
Modern cryptographic algorithms are meant to be executed by computers
or specialized hardware devices for which there are several different
cryptographic algorithms and methods. This paper concentrates on those
commonly used in data encryption today.
Different symmetric algorithms use different length keys, usually a
longer key means higher security.
Symmetric algorithms can be further divided into two categories:
stream ciphers which take and encrypt one bit of plaintext at a
time, and block ciphers which take a number of bits (typically
64) and encrypt them as a single block. Most ciphers belong to the
block cipher class.
Symmetric algorithms are generally faster than asymmetric ones and use
a much shorter key.
DES stands for "The Data Encryption Standard". DES has been certified
by NIST (National Institute of Standards and Technology) for use as
an official US Government encryption standard for less-than-top-secret
secret material. It is recertified every five years. DES was first
certified for government use in 1977, and most recently recertified in
1993, but may not be recertified again [FAQ95].
DES is a strong cipher which encrypts a block of 64 bits at a time
using a 56 bit key (56 + 8 parity checks = 64) resulting in 64
encrypted bits.
DES encryption itself consists of many rounds of different
transformations and permutations, which are linear and easy to
reverse. The critical encryption is done using S-boxes. The S-boxes,
or substitution boxes, are a set of highly non-linear functions,
implemented in DES as a set of lookup tables (of 4 rows and 16
columns). The S-boxes encrypt 4 bits at a time, so encrypting is done
in 16 rounds. After the S-boxes the results are still permutated.
The complicated substituting, permutating, XORs and shifts were chosen
to have some useful properties:
DES is normally used in cipher block chaining (CBC) or cipher feedback
(CFB) mode. In CBC mode a plaintext block is first XORed with the
previous ciphertext block and then encrypted to obtain the ciphertext.
In CFB mode the previous ciphertext block is encrypted and then XORed
with the plaintext to get the ciphertext.
There are numerous software applications and C libraries with the DES
encryption routines widely available. Exporting DES from US is
regulated (by NSA - National Security Agency).
Speed
According to RSA Laboratories, when implemented entirely in software,
DES is at least 100 times faster than RSA. Implemented in hardware, it
may outperform the RSA algorithm by 1000 or even 10000 times. This is
due primarily to the fact that the DES S-boxes are simple table-lookup
functions, while RSA depends on very-large-integer arithmetic.
A fast 486 PC can encrypt about 400 kb per second using DES. [YLO95]
Security
There are two known ways to 'break' DES. The first requires an
exhaustive search of the keyspace, which consists of 2^56 (about
7.2*10^16) possible keys ("brute force"). If you can test one million
keys every second, it should only take about 2000 years to go though
the keyspace. With special hardware and/or networked machines testing
can be done magnitudes faster, a chip could be designed that does a
billion tests per second, reducing the needed time to 2 years. It is
said that you can buy a dedicated machine (with special decrypting
hardware) that can break DES in a couple of hours by brute force, by
spending 1 million US dollars [FAQ95].
The other more recent method is called differential
cryptanalysis. This method reduces the number of keys that must be
tested, but it requires that you have 2^47 chosen plaintexts encrypted
with the key you are trying to recover. Since it is highly unlikely
that anyone will agree to encrypt 2^47 chosen plaintexts with their
secret DES key, this attack is unfeasible in practice [FAQ95].
Because the DES algorithm is based on the "mysterious" S-boxes, which
are just constants without any known connections, it has lead to some
rumors that there was a trapdoor in the algorithm [SCH95].
The overall consensus is that DES, when used properly, is secure
against all but the most powerful organizations. (like NSA,
governments, mafia, big supercomuting/parallel computing companies,
etc.). Proper use means avoiding known "weak" keys. (weak keys are
result of the key being split to 16 pieces, one for each round of
encryption). It's simple to avoid the 4 weak, 12 semi-weak keys and 48
possibly weak keys.
The short key is the main known risk in DES. Given all these points,
using simple DES for top-secret data is not a good idea anymore, but
is sufficient for everyday use.
If DES with a single key is not sufficiently secure for a given
application, it can be made more secure by encrypting more than once
with different keys. It has been proven that multiple encryptions do
actually improve the security of DES (DES is not an algebraic group!),
and it is thought that triple-encryption with DES is about equivalent
to single-encryption with a 112-bit key [SCH95].
3DES usually uses 2 different keys. First the data is DES encrypted
with first key, then decrypted with the second key, and then encrypted
with the first key again.
Speed
Triple DES is almost 3 times slower than DES. A fast 486 PC can
encrypt about 150 kb per second [YLO95].
Security
At a rate of one million keys per second, an exhaustive search of
2^112 keys would require about 1.65*10^20 years to complete. Since the
universe is estimated to be only about 10^10 years old, that is
probably long enough for most purposes.
3DES has been proven much more secure than conventional DES, and is a
good alternative for current designs. There still remains the same
rumors that concern DES (trapdoor?). But as is, it's a better
alternative to DES and with current machine technology makes brute
force attacks not viable. Applications using DES can easily be
converted to use 3DES instead, if the slight effect on speed is not
critical.
IDEA (International Data Encryption Algorithm) is an algorithm
developed at ETH Zurich in Switzerland. IDEA is considered one of the
best and most secure algorithms available today. It uses a 128 bit key
to encrypt 64 bit blocks, and the same algorithm is used for
encryption and decryption. IDEA is a fairly new algorithm, but it has
already been around for several years, and no practical attacks on it
have been published despite of numerous attempts to analyze it.
IDEA mixes algorithms from three algebraic groups: XOR, Addition
(modulo 2^16), Multiplication (modulo 2^16 + 1). The 64-bit data is
divided into 4 16-bit sub-blocks on which the operations (8 total
rounds of swapping, XORring, addition, multiplication).
IDEA is normally used in CBC and CFB modes like DES, and many software
implementations are publicly available (PGP being most famous). IDEA
is patented in the United States and in most of the European
countries. The patent is held by Ascom-Tech. Non-commercial use of
IDEA is free. Commercial licenses can be obtained by contacting
idea@ascom.ch. (Patent not valid in Finland).
Speed
Encryption speed is somewhat faster than 3DES, but slower than DES. A
fast 486 PC can encrypt about 200 kb per second [YLO95].
Security
It cannot practically be broken by brute force (even a billion
tests/second chip would require 10^13 years .. an array of 10^24 such
chips would be needed to find the key in a day, but there is not
enough silicon atoms in the known universe to build them). No other
methods are known (at least yet). It seems to be safe to use,
authorities such as Bruce Schneier [SCH95] consider it the best
symmetric algorithm today.
A Triple-IDEA (similar to 3DES) cipher should be sufficient for even
the most paranoid people.
The main risk in IDEA is its relatively short age (in cryptography an
algorithm needs to be around a couple of decades before it starts to
get believed truly unbreakable). Only time will prove the strength of
IDEA, authorities today think it is safe.
Skipjack is an NSA-developed encryption algorithm for the Clipper (and
Capstone) chips. The algorithm is classified as secret. What is known
about it is that it's symmetric, uses a 80-bit key and has 32 rounds
of processing per each encrypt or decrypt operation.
The Clipper-chip is a commercial chip made by NSA for encryption,
using the Skipjack algorithm. At least AT&T will be using the Clipper
for encrypted voice phonelines.
Security
NSA is using Skipjack to encrypt it's own messaging system, so that
leads to think the algorithm itself is secure.
Clipper uses Skipjack with two keys. The idea behind it is that anyone
who knows the chip's "master key" can decrypt all messages encrypted
with it. So, basically NSA could decrypt Clipper-encrypted messages
with this backdoor. This method of tampering with the algorithms is
called key escrow.
There are many movements and foundations campaigning against the
Clipper-chip, concerned about privacy. There are rumors that a
similar chip nicknamed EuroClipper could be introduced within EU, but
the latest more direct news convince this to be just a rumor.
RSA is a public-key cryptosystem for both encryption and
authentication, it was invented in 1977 by Ron Rivest, Adi Shamir, and
Leonard Adleman (RSA). RSA is the most widely used public-key
cryptosystem today and has often been called a de facto standard.
RSA works as follows: take two large primes, p and q, and find their
product n = pq. Choose a number, e, less than n and relatively prime
to (p-1)(q-1), and find its inverse, d, mod (p-1)(q-1), which means
that ed = 1 mod (p-1)(q-1); e and d are called the public and private
exponents, respectively. The public key is the pair (n,e); the private
key is d. The factors p and q must be kept secret, or destroyed.
It is difficult (presumably) to obtain the private key d from the
public key (n,e). If one could factor n into p and q, however, then
one could obtain the private key d. Thus the entire security of RSA is
predicated on the assumption that factoring is difficult; an easy
factoring method would break RSA.
There are many applications today using RSA, most notably PGP and Ssh.
Encryption
Suppose Alice wants to send a private message, m, to Bob. Alice
creates the ciphertext c by exponentiating: c = m^e mod n, where e and
n are Bob's public key. To decrypt, Bob also exponentiates: m = c^d
mod n, and recovers the original message m; the relationship between e
and d ensures that Bob correctly recovers m. Since only Bob knows d,
only Bob can decrypt.
Authentication
Suppose Alice wants to send a signed document m to Bob. Alice creates
a digital signature s by exponentiating: s = m^d mod n, where d and n
belong to Alice's key pair. She sends s and m to Bob. To verify the
signature, Bob exponentiates and checks that the message m is
recovered: m = s^e mod n, where e and n belong to Alice's public key.
Encryption and authentication take place without any sharing of
private keys: each person uses only other people's public keys and his
or her own private key. Anyone can send an encrypted message or verify
a signed message, using only public keys, but only someone in
possession of the correct private key can decrypt or sign a message.
Speed
RSA operations are all based on series of multiplications. In
practical applications, it is common to choose a small public exponent
for the public key. Entire groups of users can use the same public
exponent. This makes encryption faster than decryption and
verification faster than signing. Algorithmically, public-key
operations take O(k^2) steps, private key operations take O(k^3)
steps, and key generation takes O(k^4) steps, where k is the number of
bits in the modulus.
There are many commercially available hardware implementations of RSA,
and there are frequent announcements of newer and faster chips. The
fastest current RSA chip has a throughput greater than 600 Kbits per
second with a 512-bit modulus, implying that it performs over 1000 RSA
private-key operations per second. It is expected that RSA speeds will
reach 1 Mbit/second within a year or so.
By comparison, DES is much faster than RSA. In software, DES is
generally at least 100 times as fast as RSA. In hardware, DES is
between 1,000 and 10,000 times as fast, depending on the
implementations. RSA will probably narrow the gap a bit in coming
years, as it finds growing commercial markets, but will never match
the performance of DES.
Security
The security of RSA depends of factoring being difficult. There are
several methods to try factoring, but as long as the keys are long
enough there is small risk of having your RSA encoded message broken.
384 bits can be broken relatively easily, 512 bits is probably
insecure and breakable by major governments, 768 bits is probably
relatively safe, 1024 bits should be secure for decades according to
today's information. 2048 bits will most probably remain safe for a
long time.
Another way to break RSA is to find a technique to compute e-th roots
mod n. Since c = m^e, the e-th root of c is the message m. This attack
would allow someone to recover encrypted messages and forge signatures
even without knowing the private key. This attack is not known to be
equivalent to factoring. No methods are currently known that attempt
to break RSA in this way.
RSA is very vulnerable to chosen-plaintext attacks, and a good guess can
reveal the used key. It is also advisable to include some random
data (at least 64 bits) to the encrypted plaintext.
Diffie-Hellman is a commonly used public-key algorithm for key
exchange. It is generally considered to be secure when sufficiently
long keys and proper prime generators are used. The security of
Diffie-Hellman relies on the difficulty of the discrete logarithm
problem (which is believed to be computationally equivalent to
factoring large integers).
Security
Diffie-Hellman is sensitive to the choice of the strong prime and the
generator. The size of the secret exponent is also critical to the
security. Conservative advice is to make the random exponent twice as
long as the intended session key.
A cryptographic hash function generates a fixed-size hash value from a
message of any length. The idea is to generate a hash value that
cannot be used to trace back the original message. The typical
applications include things like secret numbers on ATM cards etc.
Message Digest Algorithm 5 (MD5) is a secure hash algorithm developed
at RSA Data Security, Inc. It can be used to hash an arbitrary length
byte string into a 128 bit value. MD5 is in wide use, and is
considered reasonably secure.
MD5 processes the input text in 512-bit blocks, divided into 16 32-bit
sub-blocks. The output is a set of 4 32-bit blocks, which are
concatenated to a single 128-bit hash value.
Security
It has been reported recently that MD5 was potential weaknesses in it,
and that it's breakable in some cases. Also it is also said that one
could build a special-purpose machine costing a few million dollars to
find a plaintext matching given hash value in a few weeks.
Still, MD5 is considered to be relatively secure and good enough for
most purposes.
Abstractly, a zero knowledge proof is an interactive proof with a
prover and a verifier, where the prover convinces the verifier of a
statement (with high probability) without revealing any information
about how to go about proving that statement. A mathematical proof is
found at [ZER95].
Zero knowledge proofs can be used for authentication before exchanging
key information. Details about the zero knowledge systems can be found
in Hannu Aronsson's paper [HAA95].
The best applications combine different cryptosystems. The primary
advantage of public-key cryptography is increased security: the
private keys do not ever need to transmitted or revealed to anyone. In
a secret-key system, by contrast, there is always a chance that an
enemy could discover the secret key while it is being transmitted.
Another major advantage of public-key systems is that they can provide
a method for digital signatures and authentication. Secret key systems
would require a third party for this.
A disadvantage of using public-key cryptography for encryption is
speed.
For encryption, the best solution is to combine public- and secret-key
systems in order to get both the security advantages of public-key
systems and the speed advantages of secret-key systems. The public-key
system can be used to encrypt a secret key which is then used to
encrypt the bulk of a file or message.
Introduction to Practical Cryptosystems
Different Cryptosystems
Symmetric Algorithms
Symmetric algorithms, also called secret-key algorithms, use
the same key for both encryption and decryption (or in some
cases the key is easily derivable from the other). The key is not to
be leaked to outside enemmies (hence the name) and should be changed
often and be sufficiently random.
Asymmetric Algorithms
Cryptographic Hash Functions
Zero Knowledge Systems
Conclusions
References