Authentication
When people communicate through written words, signatures and seals are applied to ensure that the written message is exactly the same from the sender to the receiver. Notice that the sender must share some information with the receiver up front, for example, the receiver must have known the signature of the sender.
In this section, we will discuss digital methods which make it difficult to “forge” a “signature.” Just as with encryption, there are two different approaches to the problem based on whether private keys are allowed: message authentication codes and digital signatures. Message Authentication Codes (MACs) are used in the private key setting. Only people who know the secret key can check if a message is valid. Digital Signatures extend this idea to the public key setting. Anyone who knows the public key of Alice can verify a signature issued by Alice, and only those who know the secret key can issue signatures. [Ps, p133]
Message Authentication Codes (MAC)
Definition: MAC
is a message authentication code (MAC) over the message space if the following syntax, correctness, and security hold:
- is a PPT algorithm that returns a key .
- is a PPT algorithm that on input key and message outputs a tag .
- is a deterministic polynomial-time algorithm that on input and outputs “accept” or “reject”.
Correctness: For all , for all ,
Security: for all NUPPT adversaries , there exists a negligible function such that
Discuss:
- The adversary is allowed to oracle query . Is that too strong as gets too much info? What if we totally remove the oracle query? What if the adversary gets some pairs but only for challenger-chosen ’s?
- The adversary aims to forge a tag for arbitrary message . Is that too strong since useful messages are not arbitrary?
- Can we change the definition to that wins only if it outputs a good key?
- Can this definition prevent an adversary that replays a (message, tag) pair? Replay can be a devastating attack in many applications, such as financial transactions.
- If we additionally give the adversary oracle access to verification , is the definition stronger?
- Does the existence of MAC imply the existence of OWF? If so, what’s the construction?
The above questions provide intuitions to our first MAC candidate.
Construction: Warmup, Information-Theoretic MAC
Define the following functions
- .
- : Output .
- : Ouptut “accept” if and only if .
This MAC is clearly insecure if the adversary can oracle query once: given any pair , it is trivial to recover . However, the MAC is actually Information-Theoretically (and perfectly) secure when the adversary can not obtain any pair. Hence, it is called IT-MAC and is indeed used in some cryptographic protocols (which we don’t have time to cover).
Noticing that the above output of is uniformly random, we construct a MAC that outputs a psuedorandom string.
Construction: MAC
Let be a family of pseudorandom functions such that .
- .
- : Output .
- : Ouptut “accept” if and only if .
Theorem:
If there exists a pseudorandom function, then the above scheme is a Message Authentication Code over the message space .
Proof Sketch:
The correctness is direct, and the security is argued below.
Consider a hybrid scheme and the corresponding hybrid experiment such that a random function is used instead of the PRF . The hybrid scheme is indistinguishable from the real construction by the oracle indistinguishability of PRF and then by a standard reduction (for any NUPPT adversary). Next, in the hybrid experiment, we claim that any adversary can only win with probability at most : if the adversary queries , it is always rejected; otherwise, the adversary can only output w.p. (for even unbounded adversaries).
From fixed-length to arbitrary length
The above definition and construction consider MAC for messages in space so that the length is fixed. Of course, to handle arbitrarily long messages, we can instantiate the PRF in the construction with a PRF scheme of arbitrary length. However, existing practical pseudorandom functions (i.e., block ciphers) take short, fixed-length inputs. It is necessary in practice to handle some subtleties, and this is called “domain extension” in general.
- Split a long message into a sequence of short “blocks” and then each.
- Include a sequence number into each block (to prevent reordering attack).
- Append to every block the total length (to prevent truncation attack).
There are other approaches, such as CBC-MAC (which is only secure when the message length is “fixed and agreed upon in advance”, see [KL, Exercise 4.13]). We skip the details of the constructs.
Cryptographic Hash Functions
Hash functions are used in many areas of computer science, and in general, all areas require that the function is shrinking; that is, the output is shorter than the input of the function.
In cryptographic applications like MAC, we want a hash function to be collision-resistant. That is, give an input and its corresponding output , it is computationally hard to find such that . There are several formal definitions, and we discuss only two below.
Ref: [Ps 5.5] [KL 9.3]
Definition: Universal One-Way Hash Functions
A set of functions is a family of universal one-way hash functions (UOWHF) if:
- (compression) .
- (ease of evaluation) Given and , the computation of can be done in PPT in .
(collision resistance) for all NUPPT , there exists a negligible such that , wins the following game w.p. :
- wins if and
Example:
Suppose is a family of UOWHF. Let be a MAC for bit messages, where is the security parameter (given to ). Then the following is also a MAC for arbitrarily long (polynomial in ) messages:
- is identical to .
- : sample , compute , then output .
- : accept iff and accepts.
Here, we upgraded the MAC to work for messages longer than .
Question: in the definition of UOWHF, we allow to choose . Why or where we use such property in the example?
Definition: Collision-Resistant Hash Functions
A set of functions and algorithm is a family of collision-resistant hash functions (CRHF) if:
- (ease of sampling) runs in PPT, .
- (compression) .
- (ease of evaluation) Given and , the computation of can be done in PPT.
(collision resistance) for all NUPPT , there exists a negligible such that ,
Note Because the adversary chooses both and , the key is necessary to defend against non-uniform adversaries; otherwise, a non-uniform can just remember a colliding pair for every problem size . Many practical hash functions (such as SHA) are unkeyed and do not satisfy this definition.
A CRHF is a UOWHF and also a OWF, and we can construct UOWHF from OWF. However, it is long open whether we can get CRHF from OWF. Instead, CRHF is contructed from various concrete assumptions, and CRHF is also constructed from “trapdoor permutations”, which is yet another primitive that we do not know how to obtain from OWF. We will use discrete logarithm assumption below.
{:.label}Discuss: Suppose that is a CRHF or UOWHF (for some corresponding key and key generation). Let be the function that truncates the last output bit of . Is the resulting and the key generation a good CRHF (or UOWHF)? Is there a counterexample such that is secure but is not?
Hash-and-MAC
Using either collision-resistant hash functions (CRHF) or even UOWHF, we have a standard way to extend the message space of any MAC scheme (or digital signature, discussed later). That is, whenever we want to sign (or tag) a message , we sample a key of the hash function then compute the hash value and then compute the signature as , where and are the tagging algorithm and the key of the given MAC scheme. The verification of is contructed accordingly.
Note: In the textbook [KL, Construction 6.5], the key of the hash function is generated once for all messages, and thus is kept secret from the adversary. (CRHF and UOWHF both gives the key to the adversary, so the proof does not use the full property.)
Note: Practical hash functions are typically using variants of SHA that are currently implemented directly in mainstream processors, including AMD, Intel, and Arm since 2015–2017. Also notice that SHA is a function but NOT keyed, while people have not find any collision.
Digital Signature Schemes
With message authentication codes, both the signer and verifier need to share a secret key. In contrast, digital signatures mirror real-life signatures in that anyone who knows Alice (but not necessarily her secrets) can verify a signature generated by Alice. Moreover, digital signatures possess the property of non-repudiability, i.e., if Alice signs a message and sends it to Bob, then Bob can prove to a third party (who also knows Alice) the validity of the signature. Hence, digital signatures can be used as certificates in a public key infrastructure.
– [Ps, Section 5.3]
Ref: [KL 13.1, 13.2, 13.6]
Definition: Digital Signatures
is a digital signature scheme over the message space if the following syntax, correctness, and security are satisfied:
- is a PPT which on input outputs a public key and a secret key , .
- is PPT algorithm which on input a secret key and message outputs a signature .
- is a deterministic poly-time algorithm which on input a public key , a message and a signature returns either “accept” or “reject”.
Correctness: For all , for all ,
Security: For all NUPPT adversaries , there exists a negligible function such that ,
Discuss:
- must be sent to the verifier through an athenticated way (by definition in the above). If we have that authenticated way, why do we just use MAC?
- Would it be meaningful if gets no oracle access to ? The adversary is given as input.
- Would it be meaningful if gets oracle access to but only once?
- The definition is a public-key version of MAC.
- Since the verification uses only public key, can perform verification without oracle queries.
- Hence, it is clear that DS implies MAC, and then it implies OWF.
- Can we obtain a digital signature from (public-key) encryption?
The following theorem shows that signing -bit messages is sufficient because we can extend the message space through hashing.
Theorem: (Extending message space)
Suppose that is a digital signature scheme over the message space such that is the security parameter (given as input to ). Also suppose that is a family of UOWHF. Then, the following is a digital signature scheme over the message space for any polynomial .
- is the same as
- : sample uniformly at random, compute , run , and then output .
- : parse , compute , accept if accepts.
Lamport’s Signature Scheme
Refs: [KL 14.4], Lamport’79, Goldwasser@Berkeley
Definition: One-Time Digital Signatures
is a one-time digital signature scheme the definition of DS is satisfied under the constraint that the adversary is only allowed to query the signing oracle once (in ).
Construct: Lamport’s Digital Signature
Let be a OWF. Given , is constructed as follows for .
:
- Sample strings for .
- Compute for all
- Output and .
:
- Output , where denotes the -th bit of .
:
- Output iff for all , it holds that , where comes from and comes from .
Theorem:
If is a one-way function, then Lamport’s Signature is a secure one-time digital signature (for -bit messages).
As a corollary, the construction extends to messages of bits such that is a polynomial.
Note Many other public-key cryptographic schemes (such as public-key encryption) relied on stronger assumptions (such as PKE from RSA assumption), and we do not know any construction from OWF. Digital signature is a big surprise since we get it from OWF.
Proof sketch.
We want to prove by contradiction: if there exists that makes one-time query and then forges the message and signature with , then we want to construct another adversary that inverts . The intuition is that given , there exists a bit for some , and then in order to pass the verification of , must be able to find the pre-image of the -th entry of , which is inverting .
The tricky step is that in the reduction, we need to give to up front. Since we have no idea about at that step, we are going to guess it.
More formally, assume for contradiction, there exists NUPPT adversary and polynomial such that for infinitely many ,
where denotes the event that and accept in the security game. We want to construct that inverts .
Let and . is constructed as below.
Algorithm :
- Sample and as per Lamport’s key generation.
- Sample uniformly at random.
- Modify by setting .
- Run : if queries such that , then output (“fail” symbol); otherwise, respond to the signature as per . Let the result be .
- If , then output (as a candidate pre-image of ); output otherwise.
Notice that can not know nor because all entries in are identically distributed. Hence, , and .
Tree-based Signatures
Ref: [KL 14.4.2, 14.4.3]
How to extend one-time signature to sign many messages? When signing a message, we can additionally generate the next pair of and then sign and send the next with the current message, and so on for the next messages. The verifier needs to verify and to keep the next . That is, both the signer and verifier need to keep states, or the signing / verification time is linear in the number of signed messages.
To improve it, we use tree-based approach. That is, for each pair , we sign two public keys , and then each (together with the corresponding ) can further sign two keys , and so on. We build a tree of leaves so that we can sign up to messages, and the first signature consists of one-time signatures:
where . The second, thrid, and forth signatures and so on are moving the path from to and then to and so on. Because all the pairs are one-time, the signer keeps a state that counts the number of messages signed so far and all the pairs generated so far so that the next message is signed consistently. The verifier is stateless and keeps only .
The above scheme can be further improved to achieve a stateless signer by using a PRF to generate the randomess needed by each tree node. That is, let be a string of less then bits that indicates the tree node, and let be a PRF with key sampled one-time up front, and then we can use be the randomness needed by node (since we need only two and one at each tree node). Key is added to be part of the secret key, and signer can always generate the identical at node .
Discuss:
- If the number of messages is not given in advance, how to build a DS?
Notice If we plug Lamport’s signature into the above composition, then the length of the signature will grow exponentially because the public key is much longer than the message (and we sign the ’s). An easier “fix” is to shrink the message to a shorter “hash” and then sign the hash. That is, the signer S and verifier V agree on a keyed hash function , S computes the (shorter) hash from message so that S can sign the hash and V can verify. Of course, we need a secure hash function so that any adversary given can not forge another but . This property is called targeted collision resistant hash functions or universal one-way hash functions (UOWHF). Such hash functions can be constructed from one-way functions. We omit the proof here and give CRHF below since UOWHF is implied by CRHF. Historically, Naor and Yung formalized UOWHF and constructed it from one-way permutations, and then Rompel constructed UOWHF from any OWF. [Goldreich, FoC, Vol 2, Section 6.4.3] shows the result of Naor and Yung, and Rompel’s result is later proved formally by Katz and Koo. Similar to PRG, it is still active research, e.g., Mao, Nazor, and Zhang improved the construction to be non-adaptive (ie parallel) calls to OWF.
Postscript{. label} Leslie Lamport wrote a great note on “How To Present a Paper” in 1979. That is decades old and just two pages, but it is still inspiring and pure gold on good presentations.
Collision-Resistant Hash Functions
Definition: Generator of a Group
A element of a multiplicative group is a generator if the set . We denote the set of all generators of a group by .
Assumption: Discrete Log
If is a group of prime order , then for every adversary , there exists a negligible function such that
In the above definition, the group is not instantiated. Hence the hardness of the DL problem depends on , and indeed for some groups such as it is trivial. In crypto, it is believed that DL is hard on the subgroup of for prime . Notice that the order of is even , not prime order (and thus DL may be easy). Therefore, we sample a prime of the form : first sample prime and then let if is also prime, otherwise repeatedly sample another . Such primes are known as Sophie Germain primes or safe primes.
Unfortunately, even though this procedure always terminates in practice, its basic theoretical properties are unknown. It is unknown even (a) whether there are an infinite number of Sophie Germain primes, (b) and even so, whether this simple procedure for finding them continues to quickly succeed as the size of increases. – [Ps, Section 2.8.1]
Also notice that DL is a one-to-one mapping, and the domain and range are the same in cardinality. However, the representations of elements (in bits) can be quite different, and thus it is not exactly a one-way permutation (strictly speaking).
Sampling prime-order group and the generator
Theorem:
Let with prime. Then
is a subgroup of of order .
It is direct that is a subgroup. It remains to prove the order, and it suffices to show that is 2-to-one (because ). Let be a generator of (proof omitted: the existence of generator of ) Let such that for some . Consider any such that where . We have , and . Thus by , either or .
This proof extends to for all even integer . Moreover, it also shows that sampling from uniformly at random is efficient: just sample and compute .
Construction: CRHF from DL
: output , where
- are -bit primes that ,
- is the generator of ,
- , and
- is order subgroup of .
: Output , where
- is -bit, is 1-bit
Theorem:
Under the Discrete Logarithm assumption, the above construction is a collision-resistant hash function that compresses by 1 bit.
Proof sketch:
It is efficient, and the compression follows by to bits mapping (strictly speaking, that is to as ). To prove collision resistance, firstly consider the case but . Then, we have
which means that . Hence, it must be but . Given , suppose (this is without loss of generality as otherwise). Then,
which implies that
Thus, we can compute the DL of using any adversary that on input outputs a collision .
Note: the above construction is based on the DL assumption w.r.t. the prime order group . We can analogously construct CRHF from group based on the DL assumption w.r.t. .