Digital signatures

Digital signature

Digital signature is a digital analogue of a pen-and-ink signature on a physical document. The purpose of digital signature is to solve the following scenario. Alice has a digital document and wants to attach some “proof” that can be used to prove that she had approved this document. Therefore, digital signature can be recognized as an analogue to her handwritten signature on an ordinary document.

Asymmetric Signing Algorithms

Asymmetric-key cryptography is based on an exchange of two keys — private and public. Since the public key is accessible to all, anyone could get yours and then contact you pretending to be someone else. Luckily, authentication problems were solved early in the internet age with digital signatures. Two major parts of a digital signature in public-key cryptography are the sender’s private key and a hash. In simple terms, it’s a condensed version of a message. Regardless of the file type or its size, a hash is only 5-20 symbols long. Keep in mind that hashing is a one-way process, meaning that you can’t turn those symbols back into the message. Its only purpose is to protect you from fake versions of the file. Even if a hacker made the tiniest of changes, the hash would change as well, indicating that the message is no more genuine. You can read more about hashing in our previous post.

Therefore, it is very critical to clarify and confirm that who had signed the contract. In order to prevent forged digital signature, we have to apply public-key cryptography. Public-key cryptography, is a cryptographic system that uses pairs of keys: public keys, which may be disseminated widely, and private keys are known only to the owner.

Using this cryptography primitives, the basic idea of digital signature is that the person who signs a message by his private key, and anyone can use the associated public key to verify the signed message. Meanwhile, an ideal digital signature should have the following properties:

A signed message can unambiguously be traced back to its originator as a valid signature is only generated by the unique signer’s private key.
Only the signer has the ability to generate a signature on his behalf.
Anyone can use the public key to convince himself/herself that the signer has actually approved this message.
It is very difficult to get the private key even though you have the knowledge of a public key.

There are three popular public-key families, namely integer factorization, discrete logarithm of finite fields or elliptic curves. Each of these allows us to construct ideal digital signatures: RSA, DSA, and ECDSA. Among these digital signature, RSA is the most widely used method in practice. However, in order to attain the same security-level as other signatures, the bit-length of public key of RSA should be the longest, which implies that it needs more space to save necessary data. Therefore, Bitcoin and Ethereum both adopt ECDSA, which we covered in the previous post and the shortest one compared with other signature types.

In other words, a digital signature not only guarantees that the sender is authentic but also ensures that the integrity of the message is intact.

What is the difference between RSA and DSA?

First, it’s the algorithm’s use of mathematical problems. Both algorithms use modular arithmetic, but the RSA certificate relies on prime factorization, while DSA uses the discrete logarithm problem. For now, both are considered completely safe.

Another difference between DSA and RSA is speed. The former is a faster signature, but the latter is more efficient at verification. However, since authentication requires both, speed discrepancies might not be as significant as they sound.

Also, DSA only works with a safer, second edition of the Secure Shell (SSH) network protocol. RSA works with SSH2 but is also compatible with the original SSH, which is now considered heavily flawed. So, if you're concerned about accidentally using SSH, DSA may be a better choice. In other words, the difference between RSA and DSA is in what each can do. RSA can be used as a digital signature and an encryption algorithm. Also, RSA is a block cipher, while DSA is a stream cipher. Compatibility-wise, they are equal. RSA and DSA are both used for the same internet protocols and certificates, like Nettle, OpenSSL, wolfCrypt, Crypto++, and cryptlib.

By now, you probably see that, despite some minor differences, RSA and DSA are pretty similar.

Symmetric Signing Algorithms

Symmetric encryption, which can also be called a secret key algorithm, uses only one key: a secret key for encryption and decryption of messages. The main disadvantage of symmetric key encryption is that all parties involved in communication have to exchange the key used to encrypt the message before they can decrypt it.

HMAC

A Hash-based Message Authentication Code (HMAC) can be used to determine whether a message sent over an insecure channel has been tampered with, provided that the sender and receiver share a secret key. The sender computes the hash value for the original data and sends both the original data and the HMAC as a single message. The receiver recomputes the hash value on the received message and checks that the computed hash value matches the transmitted hash value.

Design considerations

The design of the HMAC specification was motivated by the existence of attacks on more trivial mechanisms for combining a key with a hash function. For example, one might assume the same security that HMAC provides could be achieved with MAC = H(key ∥ message). However, this method suffers from a serious flaw: with most hash functions, it is easy to append data to the message without knowing the key and obtain another valid MAC ("length-extension attack", of which we had the previous post).

The alternative, appending the key using MAC = H(message ∥ key), suffers from the problem that an attacker who can find a collision in the (unkeyed) hash function has a collision in the MAC (as two messages m1 and m2 yielding the same hash will provide the same start condition to the hash function before the appended key is hashed, hence the final hash will be the same).

Using MAC = H(key ∥ message ∥ key) is better, but various security papers have suggested vulnerabilities with this approach, even when two different keys are used.

HMAC uses two passes of hash computation. The secret key is first used to derive two keys – inner and outer. The first pass of the algorithm produces an internal hash derived from the message and the inner key. The second pass produces the final HMAC code derived from the inner hash result and the outer key.

No known extension attacks have been found against the current HMAC specification which is defined as H(key ∥ H(key ∥ message)), because the outer application of the hash function masks the intermediate result of the internal hash. But before applying, we have to compute S bits and then append it to plain text and after that apply the hash function. For generating those S bits we make use of a key that is shared between the sender and receiver - called ipad and opad. The values of ipad and opad are not critical to the security of the algorithm, but were defined in such a way to have a large Hamming distance from each other and so the inner and outer keys will have fewer bits in common. The security reduction of HMAC does require them to be different in at least one bit.

Important aspect of HMAC, is that it does not encrypt the message! Instead, the message (encrypted or not) must be sent alongside the HMAC hash. Parties with the secret key will hash the message again themselves, and if it is authentic, the received and computed hashes will match.

HMAC can be used with any iterative cryptographic hash function, such as SHA family, in combination with a secret shared key. The cryptographic strength of HMAC depends on the properties of the underlying hash function. An iterative hash function breaks up a message into blocks of a fixed size and iterates over them with a compression function. For example, SHA-256 operates on 512-bit blocks. The size of the output of HMAC is the same as that of the underlying hash function (e.g., 256 and 512 bits in the case of SHA-256 and SHA3-512, respectively), although it can be truncated if desired. Due to collision problems with MD5 and SHA-1, it is recommended to use the security model based on SHA-256 or better. Also, as the Keccak hash function, doesn't need this nested approach and can be used to generate a MAC by simply prepending the key to the message, as it is not susceptible to length-extension attacks.

Conclusion

There are different advantages and disadvantages of both approaches, and picking one mostly depends on the situation. HMACs are ideal for high-performance systems like routers, due to the use of hash functions, which are calculated and verified quickly unlike the public key systems. HMACs are used in administrations where public key systems are prohibited.

However as HMAC uses shared key, this may lead to non-repudiation. If either sender or receiver’s key is compromised then it will be easy for attackers to create unauthorized messages. With an asymmetric primitive like RSA, on the other hand, the verifying party needs only the public key (certificate), so the private key is held only by the party who needs to sign. This means that an employee could not, as a hypothetical scenario, steal an authentication key and masquerade as a customer without changing the codebase (which often leaves an audit trail), since the private key is held only by the client. The tradeoff is that RSA is much slower than HMAC, but that doesn't always matter.

Deep Research and Development