Cryptography in Cyber Security
Table of Contents
- 1. Abstract
- 2. Cryptography in Cyber Security
- 3. Prime Factorization Theorem
- 4. What does "Computationally Infeasible" Mean?
- 5. One Way Hash Function
- 6. Symmetric-Key Encryption
- 7. Public-Key Encryption
- 8. Man-in-the-Middle (MiTM) Attack
- 9. Digital Certificates and Signatures
- 10. Open Source Cryptography
- 11. StegFS - A Steganographic File System for Linux
- 12. References
- 13. End
1 Abstract
Data integrity and privacy, on the Internet and within your PC and smartphones, primarily rests on using cryptography well. Unfortunately, it is easily compromised by errors in (operating) system configuration. This lecture is a quick overview of cryptography as relevant in cyber security and passwords.
Educational Objectives
- Introduce the student to cryptography as it applies to computer/ cyber security.
- Describe the authentication technology.
2 Cryptography in Cyber Security
"Whoever thinks his problem can be solved using cryptography, doesn't understand his problem and doesn't understand cryptography." – Roger Needham/Butler Lampson. Lampson is a 1992 Turing Award winner
Data integrity and privacy on the Internet primarily rests on using cryptography well. The design and implementation of cryptography requires deep understanding of discrete mathematics and number theory. Unfortunately, when cryptography is deployed carelessly, it is easily compromised by errors in (operating) system configuration. This lecture is a quick overview of cryptography as relevant in cyber security and passwords.
A cryptographic encryption algorithm, also known as cipher,
transforms a "plain text" (e.g., human
readable) pt
and outputs cipher text
ct
as the output,
ct = cipher(pt, key)
so that it is possible to re-generate the pt
from
the ct
through a companion decryption algorithm. Note
that we said "for example, human readable" and not
"that is, human readable" as an explanation for the phrase
"plain text". Often, the so-called "plain text"
is human un-readable binary data that is ready-to-be-used by a
computer.
Ciphers use keys together with plain text as the input to produce cipher text. It is in the key that the security of a modern cipher lies, not in the details of the algorithm.
3 Prime Factorization Theorem
Any natural number N can be factorized into primes:
N = 2n2 * 3n3 * 5n5 * 7n7 * …
3.1 Factorization of Arbitrarily Large Numbers is Infeasible
The factorization of an arbitrarily large number N, into its constituent primes, determining the powers n2, n3, n5, n7, etc. of the primes, is computationally infeasible. As far as we know.
3.2 Decryption without Key is Infeasible
Decryption uses factorization internally. So, it is computationally infeasible. Note that this is assuming that we are using known methods, including brute force.
Is it possible that some one or some country has actually discovered fast algorithms, but chose to keep them secret, for these tasks that we believe to be computationally infeasible?
4 What does "Computationally Infeasible" Mean?
Roughly speaking, computationally infeasible means that a certain computation that we are talking about takes way too long (hundreds of years) to compute using the fastest of (super) computers.
Suppose our key is a 128-bit number. There are 340,282,366,920,938,463,463,374,607,431,768,211,456 128-bit numbers starting from zero (i.e., 128 bits of 0). To recover a particular key by brute force, one must, on average, search half the key space: 170,141,183,460,469,231,731,687,303715,884,105,728. If we use 1,000,000,000 machines that could try 1,000,000,000 keys/sec, it would take all these machines longer than the universe as we know it has existed to find the key.
4.1 Not the Same as Turing-incomputable
This is not the same thing as saying that computational infeasibility is the same idea as Turing-incomputable.
4.2 Heuristic Computations
Nor is it the same thing as saying that you cannot make a lucky guess, or heuristically arrive at a possible answer, and then systematically verify that the guessed answer is indeed the correct answer, all done within a matter of seconds on a lowly PC.
Here is an example: Microsoft's old Windows NT used the DES encryption algorithm in storing the passwords. Brute-forcing such a scrambled password to compute the plain text password can take, according to Microsoft, "about a billion years." But https://www.l0pht.com claims that L0phtCrack breaks Windows passwords in about one week, running in the background on an old Pentium PC.
4.3 Quantum Computing
- As of 2018: Most popular public-key algorithms can be efficiently broken by a sufficiently strong hypothetical quantum computer.
- https://en.wikipedia.org/wiki/Post-quantum_cryptography Recommended Reading.
5 One Way Hash Function
A hash function maps input sequences of bytes into a fixed-length sequence. The fixed length is considerably shorter than the typical length (millions of bytes) of the input, and hence the function is a hash function.
The nature of all hash functions is that there must exist multiple input sequences that map to the same hash. The inverse is a mathematical relation, not a mathematical function. But, good hash functions have the following properties: It is hard to find two strings, from the expected set of typically used strings, that would produce the same hash value. A slight change in an input string causes the hash value to change drastically.
A "one way" hash function is designed to be computationally infeasible to reverse the process, that is, to algorithmically discover a string that hashes to a given value.
5.1 Message Digests
- One-way hash functions are also known as message digests (MD), fingerprints, or compression functions.
5.2 MD5 and SHA1
- The most popular one-way hash algorithms are MD4 and MD5 (both producing a 128-bit hash value), and SHA Secure Hash Algorithm, also known as SHA1 (producing a 160-bit hash value).
- As of 2006, both MD5 and SHA1 are considered separately broken. That is, given plain text p, it is possible to modify p to a desired p' so that md5(p) = md5(p').
- Similarly, for SHA1. What is not known is if we can modify p to a p' so that md5(p) = md5(p') and sha1(p) = sha1(p').
5.3 SHA512
- SHA2, a successor to SHA1, is a range of hash functions and includes closely related SHA224, SHA256 (256 bits long), SHA384 and SHA512 (512 bits long). SHA3 is released by NIST in 2015. https://en.wikipedia.org/wiki/SHA-3.
5.4 Example Outputs of SHA512 and MD5
- Read the man pages:
man md5
andman md5sum
andman sha1sum
andman hashalot
etc.- % sha512sum kubuntu-cosmic-desktop-amd64.iso # stdout shown folded
851238208f71114e64dd3dfd2bd516d097823fb5fe94432a7ed4dfad02dfe8f1 ce966f9f04793ce019869ee241db9044ade7219fcff8fe73db406c4a4a3b94f0 kubuntu-cosmic-desktop-amd64.iso
- % sha256sum kubuntu-cosmic-desktop-amd64.iso
3f470978690b8fb343c94b8a8ded62f0372f9837ced583eba84045480d79d065 kubuntu-cosmic-desktop-amd64.iso
- % md5sum kubuntu-cosmic-desktop-amd64.iso
bb93c40531b7b13fe7b01a9c75bbc312 kubuntu-cosmic-desktop-amd64.iso
- % sha512sum kubuntu-cosmic-desktop-amd64.iso # stdout shown folded
6 Symmetric-Key Encryption
Symmetric-key cryptography is an encryption system in which the sender and receiver of a message share a single, common key to encrypt and decrypt the message. Symmetric-key systems are simpler and faster, but their main drawback is that the two parties must somehow exchange the key in a secure way. Symmetric-key cryptography is sometimes also called secret-key cryptography.
If ct = encryption (pt, key), then pt = decryption (ct, key).
Encryption is done as follows. Consider the entire message to be encrypted as a sequence of bits. Suppose the length of n in bits is b. Split the message into blocks of length b or b-1. A block viewed as a b-bit number should be less than n; if it is not, choose it to be b-1 bits long. Each block is separately encrypted, and the encryption of the entire message is the catenation of the encryption of the blocks.
Let m stand for a block viewed as a number. Multiply m with itself e times, and take the modulo n result as c, which is the encryption of m. That is, c = me mod n.
Decryption is the "inverse" operation: m = cd mod n.
6.1 DES
A popular symmetric-key system is the DES, short for Data Encryption Standard. DES was developed in 1975 and standardized by ANSI in 1981. DES encrypts data in 64-bit blocks using a 56-bit key. The algorithm transforms the input in a series of steps into a 64-bit output.
6.2 IDEA
IDEA (International Data Encryption Algorithm) is a block cipher which uses a 128-bit length key to encrypt successive 64-bit blocks of plain text. The procedure is quite complicated using subkeys generated from the key to carry out a series of modular arithmetic and XOR operations on segments of the 64-bit plaintext block. The encryption scheme uses a total of fifty-two 16-bit subkeys.
6.3 Blowfish
Blowfish is a symmetric block cipher that can be used as a drop-in replacement for DES or IDEA. It takes a variable-length key, from 32 bits to 448 bits, making it ideal for both domestic and exportable use. Blowfish is unpatented and license-free, and is available free for all uses.
7 Public-Key Encryption
Public key cryptography uses two keys – a public key known to everyone, and a private or secret key that is kept confidential. Public key cryptography was invented in 1976 by Whitfield Diffie and Martin Hellman. It is also called /asymmetric encryption / because it uses two keys instead of one key. The two keys are mathematically related, yet it is computationally infeasible to deduce one from the other.
Unfortunately, public-key cryptography is about 1000 times slower than symmetric key cryptography. But, modern hardware can cope with it.
7.1 RSA
The most well-known of the public-key encryption algorithms is RSA, named after its designers Rivest, Shamir, and Adelman. The un-breakability of the algorithm is based on the fact that there is no efficient way to factor very large numbers into their primes.
- Find two (large) primes, p and q.
- Compute the product, n = p*q (called, the public modulus).
- Choose e (the public exponent), such that
- e < n, and
- e is relatively prime to (p-1)*(q-1)
- Compute d (the private exponent) such that (e*d) mod (p-1)*(q-1) = 1.
- Public-key is (n, e), and the private key is (n, d).
An example of the above numbers: ./rsa.txt. Look up the man page: openssl(1).
The/e/ and d are symmetric in that using either ((n,e) or (n,d)) as the encryption key, the other can be used as the decryption key.
The only way known to find d is to know p and q. If the number n is small, p and q are easy to discover by prime factorization. Thus, p and q are chosen to be as large as possible, say, a few hundred digits long. Obviously, p and q should never be revealed.
7.2 DSA
The Digital Signature Algorithm (DSA) is a United States Federal Government standard for digital signatures.
- Choose a prime q. Choose a prime modulus p such that p - 1 is a
multiple of q.
- Choose g, a number whose multiplicative order modulo p is q. (This may be done by setting g = h((p - 1)/q) mod p for some arbitrary h (1 < h < p-1), and trying again with a different h if the result comes out as 1. Most choices of h will lead to a usable g; commonly h=2 is used. )1. Choose x by some random method, where 0 < x < q.
- Calculate y = gx mod p.
- Public key is (p, q, g, y), and the private key is x.
An example of the above numbers: ./dsa.txt Look up the man page: openssl(1).
7.3 Secure Communication Using Public Keys
Public-key systems, such as Pretty Good Privacy (PGP), are popular for transmitting information via the Internet. They are extremely secure and relatively simple to use. You need to retrieve the recipient's public key from one of several world-wide registries of public keys that now exist to encrypt a message.
When John wants to send a secure message to Jane, he uses Jane's public key to encrypt the message. Jane then uses her private key to decrypt it.
In real-world implementations, public keys are rarely used to encrypt actual messages because public-key cryptography is slow. Instead, public-key cryptography is used to distribute symmetric keys, which are then used to encrypt and decrypt actual messages, as follows:
- Bob sends Alice his public key. (Or, Alice retrieves Bob's public key from a registry.)
- Alice generates a (random) symmetric key (called a session key), encrypts it with Bob's public key, and sends it to Bob.
- Bob decrypts the session key with his private key.
- Alice and Bob exchange messages using the session key.
8 Man-in-the-Middle (MiTM) Attack
The public key-based communication between Alice and Bob described above is vulnerable to a man-in-the-middle attack.
Let us assume that Mallory, a cracker, not only can listen to the traffic between Alice and Bob, but also can modify, delete, and substitute Alice's and Bob's messages, as well as introduce new ones. Mallory can impersonate Alice when talking to Bob and impersonate Bob when talking to Alice. Here is how the attack works.
- Bob sends Alice his public key. Mallory intercepts the key and sends her own public key to Alice.
- Alice generates a random session key, encrypts it with Bob's public key (which is really Mallory's), and sends it to Bob.
- Mallory intercepts the message. He decrypts the session key with his private key, encrypts it with Bob's public key, and sends it to Bob.
- Bob receives the message thinking it came from Alice. He decrypts it with his private key and obtains the session key.
- Alice and Bob start exchanging messages using the session key. Mallory, who also has that key, can now decipher the entire conversation.
A man-in-the-middle attack works because Alice and Bob have no way to verify they are talking to each other. An independent third party that everyone trusts is needed to foil the attack. This third party could bundle the name "Bob" with Bob's public key and sign the package with its own private key. When Alice receives the signed public key from Bob, she can verify the third party's signature. This way she knows that the public key really belongs to Bob, and not Mallory.
9 Digital Certificates and Signatures
9.1 Certification Authority (CA)
An independent third party that everyone trusts, whose responsibility is to issue certificates, is called a Certification Authority (CA).
9.2 What is a Digital Certificate?
A package containing a person's name or website's name … (and possibly some other information such as an E-mail address and company name) and his public key and signed by a trusted third party is called a digital certificate (or digital ID). A digital certificate certifies the ownership of a public key by the named subject of the certificate.
A digital certificate serves two purposes. First, it provides a cryptographic key that allows another party to encrypt information for the certificate's owner. Second, it provides a measure of proof that the holder of the certificate is who they claim to be.
9.3 Usage
The recipient of an encrypted message uses (i) the CA's public key to decode the digital certificate attached to the message, (ii) verifies the certificate as issued by the CA, (iii) obtains the sender's public key and identification information held within the certificate. The message must have been encrypted with the private key of the sender.
9.4 X.509
The most widely used standard for digital certificates is X.509. What are colloquially known as SSL certificates are X.509 certificates. X.509 uses public key infrastructure (PKI) standard. It defines the following:
- Version field identifies the certificate format.
- Serial Number unique within the CA.
- Signature Algorithm identifies the algorithm used to sign the certificate.
- Issuer Name is the name of the CA.
- Period of Validity is a pair of Not Before, and Not After Dates
- Subject Name is the name of the user to whom the certificate is issued
- Subject's Public Key field includes algorithm name and the public key of the subject.
- Extensions
- Signature of CA.
9.5 Obtaining a Digital Certificate
- "You can either buy an SSL (X.509) certificate or generate your own (a self-signed certificate) for testing or, depending on the application, even in a production environment. Good news: If you self-sign your certificates you may save a ton of money. Bad news: If you self-sign your certificates nobody but you and your close family (perhaps) may trust them."
- https://letsencrypt.org/ Let’s Encrypt is a free, automated, and open Certificate Authority.
- IRS approved certificate authorities: https://www.irs.gov/businesses/corporations/digital-certificates
9.6 Digital Signature
- The digital signature is associated with a
document to authenticate where the document originated.
- Alice the sender.
- Compute a hash H of the document D.
- Encrypt H with the sender’s private key. Result is E.
- Send the E and the digital certificate DCA of Alice.
- Bob the recipient of the document D, E, and DCA.
- Compute a hash H' of D.
- Decrypts E, the signature, with Alice's public key from DCA. Result is E'.
- Compare the two values, E' == H'?. If they match, Bob knows that:
- the document really came from Alice and
- the document was not tampered with during transmission.
- Alice the sender.
- A digital certificate contains the digital signature of the certificate-issuing authority.
- "While digital signature is a technical term, defining the result of a cryptographic process that can be used to authenticate a sequence of data, the term electronic signature – or e-signature – is a legal term that is defined legislatively."
10 Open Source Cryptography
- Changes in the crypto export regulations in 2000 now make it possible to distribute open source cryptography.
- https://kernel.org/pub/linux/kernel/crypto/ contains crypto extensions to the kernel that provide the ability to encrypt filesystems, create virtual private networks, etc.
11 StegFS - A Steganographic File System for Linux
- Steganography hides "data" such that it cannot be proved to be there.
- StegFS encrypts data.
- https://www.cl.cam.ac.uk/~mgk25/ih99-stegfs.pdf Reference
12 References
man openssl
This man page is Required Reading.- Simson Garfinkel, and Gene Spafford, Practical Unix and Internet Security, 3rd edition (2003), O'Reilly & Associates; ISBN: 0596003234; Chapter on Cryptography. Required Reading.
- Microsoft, "Introduction to Code Signing," Required reading.
- Boaz Barak, An Intensive Introduction to Cryptography, https://intensecrypto.org/public/, 2018. Free PDF. Whole Book: Reference. Chapter on Public Key Cryptography: Required Reading.
- https://crypto.stanford.edu/~dabo/courses/OnlineCrypto/slides/13-sigs.pdf What is a digital signature? 60+ slides from stanford.edu
- https://medium.com/cryptolawreview/cryptolaw-9410cf7a8fd4 CryptoLaw How to (1) see the Crypto Legal Matrix, (2) overcome regulatory anxiety (3) make peace with law.