Cryptography is a very old concept that has existed for centuries. With the advent of modern computer technology, it has become part of everyone’s daily life, although many people may not realize it. One simple example is online shopping: when you pay for a purchase online with your credit card, cryptography is involved in the transaction. Someone might say, “I never shop online,” but what about filling up with gas, or anything else that you pay for with a credit card?
Although cryptography is ubiquitous in our lives and there are many articles describing it on the net, I have not found a single place so far that explains all the confusing concepts and terms. That is why I am writing this post.
Problems That It Tries to Solve
Generally speaking, there are two typical situations that we may run into when exchanging information or messages with others:
- The message needs to be kept confidential and must be comprehensible only to the other trusted party
- We need to know whether the message we received has been tampered with
Cryptography is used in both cases, but in different ways. The first case is called message encryption/decryption: we transform plain text into cypher text that can’t be interpreted unless it is decrypted properly. This solves a “confidentiality” problem and is used to transmit secrets. The latter case is called authentication: the plain text can be transmitted as is, but a signature is attached so that the recipient can verify whether the plain text has been tampered with. To do so, data hashing is usually coupled with encryption/decryption. The latter case solves a “trust” or “integrity” problem and ensures that the message indeed comes from the person it claims to come from.
Encryption vs Authentication
Cryptography involves complicated algorithms to ensure that it is valid and secure. As a result, encryption, decryption, or both can be very time-consuming, so encrypting the whole message is best suited to small messages, or to cases where the message must be kept confidential. In this case, the sender encrypts the plain text and the receiver decrypts the cypher text.
For text that doesn’t need to be kept confidential, authentication comes into play. Authentication involves encryption/decryption, but it also involves data hashing. Data Hashing is a computing process that generates a Digital Digest (or Digital Fingerprint) of the original message; see the Data Hashing section later in this post. Instead of encrypting the original message, Authentication lets the sender encrypt only the Digital Digest, and the resulting cypher message is called a Signature. Upon receiving the message along with the signature, the receiver first decrypts the signature to retrieve the digital digest of the original plain message, then hashes the received plain text with the same hash algorithm and checks whether the computed hash matches the decrypted one.
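The hash-then-sign flow described above can be sketched in a few lines of Python. This is only a toy under loud assumptions: the key pair is a textbook RSA example with tiny numbers, the helper names (`digest_as_int`, `sign`, `verify`) are made up for this post, and the digest is squeezed into the tiny modulus, which no real scheme would do. Real signatures use keys thousands of bits long and a padding scheme.

```python
import hashlib

# Toy RSA parameters (assumption: textbook values, for illustration only).
P, Q = 61, 53
N = P * Q      # public modulus, 3233
E = 17         # public exponent, known to everyone
D = 2753       # private exponent, known only to the sender


def digest_as_int(message: bytes) -> int:
    """Hash the message with SHA-256 and reduce it into the toy modulus."""
    full_digest = hashlib.sha256(message).digest()
    return int.from_bytes(full_digest, "big") % N


def sign(message: bytes) -> int:
    """Sender side: transform the digest with the private exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver side: recover the digest with the public exponent and compare
    it with a freshly computed digest of the received plain text."""
    recovered_digest = pow(signature, E, N)
    return recovered_digest == digest_as_int(message)


message = b"pay Bob 10 dollars"
signature = sign(message)
print(verify(message, signature))
```

Note that the message itself travels in the clear; only the small digest goes through the expensive private-key operation. Because the toy modulus is so small, two different messages can end up with the same reduced digest, which is exactly why real schemes never shrink the hash like this.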
Encryption Type
Almost all encryption algorithms fall into one of two categories: Symmetric or Asymmetric. The main difference between them is how the encryption key is managed and distributed.
Symmetric Encryption
Symmetric encryption requires the same secret (or private) key to be shared between the sender and the receiver, and that same key is used for both the encryption and the decryption operation. Good examples of Symmetric Encryption include AES and Blowfish. For Symmetric Encryption to work, the secret key must be kept confidential so that no one besides the sender and the receiver is able to decrypt the message. The challenge therefore becomes how to distribute the secret key securely from the sender to the receiver, or vice versa.
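To make the "same key on both sides" idea concrete, here is a minimal sketch using only the Python standard library. It is not AES or Blowfish: as a stand-in I XOR the data with a keystream derived from SHA-256, and the function names are my own. It has no nonce and no integrity protection, so treat it purely as an illustration and use a vetted library for real work.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key by hashing
    the key together with an incrementing counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse,
    so the same function and the same key serve both directions."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)   # both parties must hold this exact key
plaintext = b"meet me at noon"
ciphertext = xor_cipher(shared_key, plaintext)    # sender side
recovered = xor_cipher(shared_key, ciphertext)    # receiver side
print(recovered == plaintext)
```

Anyone who obtains `shared_key` can run the receiver's line, which is why distributing that key safely is the hard part.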
Symmetric Encryption is usually used for bulk message transmission. Its most common form appears once an encrypted connection has been negotiated between a client and a server that has an SSL certificate installed. Once the connection is negotiated, two 256-bit session keys are typically created and exchanged so that encrypted communication can occur.
Asymmetric Encryption
Unlike Symmetric Encryption, Asymmetric Encryption involves a public/private key pair, with the sender and the receiver each holding one of the two keys. The sender uses the public key to encrypt the message while the receiver uses the private key to decrypt it, or vice versa. In any case, the private key must be kept secure, while the public key can be openly distributed. Examples of Asymmetric Encryption include RSA and ECIES.
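A tiny worked example may help show how the two keys of a pair relate. The sketch below uses textbook RSA with very small primes, which is an assumption made purely for readability; the names `encrypt` and `decrypt` are mine. Real RSA uses moduli of 2048 bits or more together with a padding scheme.

```python
# Toy key pair built from tiny textbook primes (illustration only).
P, Q = 61, 53
N = P * Q                      # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E = 17                         # public exponent
D = pow(E, -1, PHI)            # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (E, N)            # can be openly distributed
PRIVATE_KEY = (D, N)           # must be kept secret by the receiver


def encrypt(message: int, public_key) -> int:
    """Sender side: only the public key is needed."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key) -> int:
    """Receiver side: only the holder of the private key can undo it."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


message = 65                   # must be smaller than N in this toy scheme
ciphertext = encrypt(message, PUBLIC_KEY)
recovered = decrypt(ciphertext, PRIVATE_KEY)
print(ciphertext, recovered)
```

Knowing `PUBLIC_KEY` alone is enough to encrypt, but recovering `D` requires factoring `N`, which is trivial here and infeasible at real key sizes.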
Although Asymmetric Encryption can be used directly to encrypt user data, with the public key encrypting the data and the private key decrypting it, it is more commonly used together with Symmetric Encryption because it is inefficient at encrypting large messages. Take RSA as an example: it is often used to pass encrypted shared keys for Symmetric Encryption, which in turn can perform bulk encryption and decryption at much higher speed.
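The combination described above, often called hybrid encryption, can be sketched as follows. Everything here is a toy under stated assumptions: the RSA numbers are textbook-sized, the symmetric part is a SHA-256 keystream XOR standing in for a real cipher such as AES, and all function names are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple

# Receiver's toy RSA key pair (assumption: tiny, insecure numbers).
N, E, D = 3233, 17, 2753


def xor_with_keystream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def session_key_bytes(secret: int) -> bytes:
    """Stretch the small shared secret into a 32-byte symmetric key."""
    return hashlib.sha256(secret.to_bytes(2, "big")).digest()


def hybrid_encrypt(plaintext: bytes) -> Tuple[int, bytes]:
    session_secret = secrets.randbelow(N - 2) + 2   # random value in [2, N)
    wrapped = pow(session_secret, E, N)             # slow asymmetric step, tiny input
    body = xor_with_keystream(session_key_bytes(session_secret), plaintext)
    return wrapped, body                            # fast symmetric step, bulk data


def hybrid_decrypt(wrapped: int, body: bytes) -> bytes:
    session_secret = pow(wrapped, D, N)             # unwrap with the private key
    return xor_with_keystream(session_key_bytes(session_secret), body)


wrapped_key, encrypted_body = hybrid_encrypt(b"a long message " * 100)
print(hybrid_decrypt(wrapped_key, encrypted_body)[:15])
```

The design point is that the expensive public-key operation touches only the short session secret, no matter how large the message is.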
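Asymmetric Encryption is also often used in Authentication, where the private key is used to encrypt the data (in practice, its digest) while the public key is used to decrypt it. The encryption step is called Signature Signing because the resulting encrypted data is called a signature, and the decryption step is called Signature Verification. Good examples of signature algorithms are DSA1024 and ECDSA, although strictly speaking these two produce and check a signature with dedicated sign/verify operations rather than literal encryption and decryption.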
Data Hashing
Data Hashing is a computing process that applies a cryptographic hash function to the input message and generates a short byte sequence of fixed length, usually called the Digital Digest or Digital Fingerprint of the original message. A cryptographic hash function is a special class of hash function with certain properties that make it suitable for use in cryptography. It is a mathematical algorithm that maps data of arbitrary size to a bit string of fixed size (a digital digest or hash) and is designed to be a one-way function, that is, a function which is infeasible to invert. The only way to recreate the input data from an ideal cryptographic hash function’s output is to attempt a brute-force search of possible inputs to see if they produce a match, or to use a rainbow table of precomputed hashes.
There are many cryptographic hash algorithms, which generate digital digests of various lengths. Here is a list of the most commonly used ones:
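As a small illustration of the one-way property, the sketch below recovers an input from its digest the only way an ideal hash allows: by trying candidates. I assume, just for the demo, that the attacker knows the input is a 4-digit PIN, and the helper names are made up. With only 10,000 candidates the search is instant; it is the size of the input space, not the hash, that protects a secret.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def brute_force_pin(target_digest: str):
    """Try every 4-digit PIN until one hashes to the target digest.
    Returns the PIN as a string, or None if no candidate matches."""
    for candidate in range(10_000):
        pin = f"{candidate:04d}"
        if sha256_hex(pin.encode()) == target_digest:
            return pin
    return None


target = sha256_hex(b"4821")       # pretend only this digest is known
found = brute_force_pin(target)
print(found)
```

A rainbow table is the same idea with the candidate hashes computed ahead of time and stored for lookup.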
- MD5: Digest length is 16 bytes
- SHA-1: Digest length is 20 bytes
- SHA-256: Digest length is 32 bytes
- SHA-512: Digest length is 64 bytes
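The digest lengths listed above can be checked with Python's standard `hashlib` module. This is just a quick sanity check; the sample message is arbitrary.

```python
import hashlib

ALGORITHMS = ("md5", "sha1", "sha256", "sha512")
message = b"The quick brown fox"

# Digest length in bytes for each algorithm, computed on the sample message.
digest_sizes = {
    name: len(hashlib.new(name, message).digest())
    for name in ALGORITHMS
}

for name, size in digest_sizes.items():
    print(name, size, "bytes")
```

The length depends only on the algorithm, never on the size of the input, which is what makes the digest a compact fingerprint.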
Signature Signing
When cryptography is used for authentication, the sender first needs to generate a signature from the digital digest of the message to be sent. This is called the Signing Process: it encrypts the digital digest to produce the signature. Depending on how and where the authentication is used, the signing process can differ.
Signing Certificate
One common use of Signature Signing is code signing, which signs executables and scripts in order to verify the author’s identity and ensure that the code has not been changed or corrupted since the author signed it. One good example is that Windows 10 requires all kernel-mode drivers to be signed before they can be installed and loaded.
One thing that people usually get confused about is the Signing Certificate. First of all, the name Signing Certificate is misleading because the certificate is not what signs the code at all. Instead, it is a certificate in the PKI X.509 format used to distribute the signer’s public key. A public/private key pair is generated when the certificate is requested. The private key stays on the applicant’s machine and is never sent to the certificate provider. The public key is submitted to the provider with the certificate request, and the provider issues a certificate. When you sign data, you include your digital signature, together with the certificate, with the data. A certificate contains information that fully identifies an entity, and is issued by a certificate authority (CA) after that authority has verified the entity’s identity.
Signing Certificates usually have an expiration date. It is important to enable time-stamping when you sign; otherwise your signed content will no longer be considered valid after the signing certificate expires.
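Why time-stamping matters can be shown with a simplified model of the validity check. The function below is hypothetical and is not the API of any real verifier; it only captures the rule that, with a trusted timestamp, the certificate needs to have been valid at signing time rather than at verification time. Real verifiers also validate the timestamp's own signature and check revocation, which I leave out.

```python
from datetime import datetime, timezone
from typing import Optional


def signature_still_trusted(
    not_before: datetime,
    not_after: datetime,
    now: datetime,
    signed_at: Optional[datetime] = None,
) -> bool:
    """Hypothetical policy check (simplified model).

    With a trusted timestamp (`signed_at`), the certificate only has to be
    valid at signing time. Without one, it has to be valid right now.
    """
    reference_time = signed_at if signed_at is not None else now
    return not_before <= reference_time <= not_after


utc = timezone.utc
cert_start = datetime(2023, 1, 1, tzinfo=utc)
cert_end = datetime(2024, 1, 1, tzinfo=utc)
signing_time = datetime(2023, 6, 1, tzinfo=utc)
today = datetime(2025, 1, 1, tzinfo=utc)      # the certificate has expired

with_timestamp = signature_still_trusted(cert_start, cert_end, today, signing_time)
without_timestamp = signature_still_trusted(cert_start, cert_end, today)
print(with_timestamp, without_timestamp)
```

In this model the time-stamped signature survives the certificate's expiry, while the one without a timestamp does not.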
Signing HSM
HSM stands for Hardware Security Module. A signing HSM is a dedicated crypto processor that is specifically designed to protect the crypto key lifecycle. Hardware security modules act as trust anchors that protect the cryptographic infrastructure of some of the most security-conscious organizations in the world by securely managing, processing, and storing cryptographic keys inside a hardened, tamper-resistant device. Many use cases require an HSM, such as banking systems and gaming machine authentication. Enterprises buy hardware security modules to protect transactions, identities, and applications, as HSMs excel at securing cryptographic keys and provisioning encryption, decryption, authentication, and digital signing services for a wide range of applications. Good examples of HSMs include the Trusted Platform Module (TPM) embedded in modern PC boards, as well as devices provided by third parties such as Gemalto.
Once the public/private key pair is generated by the HSM, the private key stays within the HSM device and remains secure, while the public key can be distributed to users either in a commercial certificate format such as PKI X.509 or in a proprietary format. An HSM is usually connected over the network to another computer, which acts as its client and submits signing requests to it.
Security Strength
We often hear that a key has 128-bit or 256-bit strength, but just as often we see keys named with numbers, like DSA 1024, RSA 2048, ECDSA P-521 and so on. So how do these numbers translate to key security? Does RSA 2048 provide 2048 bits of security? In most cases the answer is no. While it is true that, within the same algorithm, a higher number in the key name usually implies a stronger and more secure key, there is no direct mapping between that number and the bit security level, at least for public/private key pairs.
The Security Strength is defined by the amount of work (that is, the number of operations of some sort) required to break a cryptographic algorithm or key. It is specified in bits and takes a specific value from {80, 112, 128, 192, 256}. If the security strength associated with an algorithm is Y bits, then it is expected to take about 2^Y (2 to the power of Y) operations to break it. From this perspective, AES is probably the only algorithm whose key names match their strength: for example, an AES128 key has 128-bit security while an AES256 key has 256-bit security.
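To get a feel for what 2^Y operations means, here is a back-of-the-envelope calculation. The attacker speed of 10^12 operations per second is an arbitrary assumption of mine, and the function name is made up; only the 2^Y relationship comes from the definition above.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
ASSUMED_OPS_PER_SECOND = 1e12   # hypothetical attacker speed, not a measured figure


def years_to_break(strength_bits: int, ops_per_second: float) -> float:
    """Years needed to perform 2**strength_bits operations at the given rate."""
    return (2 ** strength_bits) / ops_per_second / SECONDS_PER_YEAR


for bits in (80, 112, 128, 192, 256):
    years = years_to_break(bits, ASSUMED_OPS_PER_SECOND)
    print(f"{bits}-bit strength: about {years:.1e} years")
```

Each extra bit doubles the work, so going from 80 to 128 bits multiplies the effort by 2^48. Under the assumed rate, 128-bit strength already works out to roughly 10^19 years.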
The table below shows the bit security mapping defined by NIST for various key algorithms:
The following table shows the strength of various hashing algorithms: