Authentication

In this section, we will discuss digital methods which make it difficult to “forge” a “signature.” Just as with encryption, there are two different approaches to the problem based on whether private keys are allowed: message authentication codes and digital signatures. Message Authentication Codes (MACs) are used in the private key setting. Only people who know the secret key can check if a message is valid. Digital Signatures extend this idea to the public key setting. Anyone who knows the public key of Alice can verify a signature issued by Alice, and only those who know the secret key can issue signatures. [Ps, p133]

Message Authentication Codes (MAC)

Definition: MAC

(Gen,Tag,Ver) is a message authentication code (MAC) over the message space {Mn}n if the following syntax, correctness, and security hold:

  • Gen is a PPT algorithm that returns a key kGen(1n).
  • Tag is a PPT algorithm that on input key k and message m outputs a tag σTagk(m).
  • Ver is a deterministic polynomial-time algorithm that on input k,m and σ outputs “accept” or “reject”.

Correctness: For all nN, for all mMn,

Pr[kGen(1n):Verk(m,Tagk(m))=accept]=1.

Security: for all NUPPT adversaries A, there exists a negligible function ϵ(n) such that

Pr[kGen(1n);(m,σ)ATagk()(1n) : A did not not query m, and Verk(m,σ)=accept]ϵ(n)

Discuss:

  • The adversary A is allowed to oracle query Tagk. Is that too strong as A gets too much info? What if we totally remove the oracle query? What if the adversary gets some (x,Tagk(x)) pairs but only for challenger-chosen x’s?
  • The adversary A aims to forge a tag for arbitrary message m. Is that too strong since useful messages are not arbitrary?
  • Can we change the definition to that A wins only if it outputs a good key?
  • Can this definition defined against an adversary that replay a (message, tag) pair? Replay can be a devastating attack in many applications, such as financial transactions.
  • If we additionally give the adversary oracle access to verification Verk, is the definition stronger?
  • Does the existence of MAC imply the existence of OWF? If so, what’s the construction?

Construction: MAC

Let F={fs} be a family of pseudorandom functions such that fs:{0,1}|s|{0,1}|s|.

  • Gen(1n):k{0,1}n.
  • Tagk(m): Output fk(m).
  • Verk(m,σ): Ouptut “accept” if and only if fk(m)=σ.

Theorem:

If there exists a pseudorandom function, then the above scheme is a Message Authentication Code over the message space {0,1}n.

Proof Sketch:

The correctness is direct, and the security is argued below.

Consider a hybrid scheme and the corresponding hybrid experiment such that a random function RF is used instead of the PRF fk. The hybrid scheme is indistinguishable from the real construction by the oracle indistinguishability of PRF and then by a standard reduction (for any NUPPT adversary). Next, in the hybrid experiment, we claim that any adversary can only win with probability at most 2n: if the adversary queries m, it is always rejected; otherwise, the adversary can only output σ=RF(m) w.p. 2n (for even unbounded adversaries).

From fixed-length to arbitrary length

The above definition and construction consider MAC for messages in space Mn so that the length is fixed. Of course, to handle arbitrarily long messages, we can instantiate the PRF in the construction with a PRF scheme of arbitrary length. However, existing practical pseudorandom functions (i.e., block ciphers) take short, fixed-length inputs. It is necessary in practice to handle some subtleties, and this is called “domain extension” in general.

  1. Split a long message into a sequence of short “blocks” and then Tag each.
  2. Include a sequence number into each block (to prevent reordering attack).
  3. Append to every block the total length (to prevent truncation attack).

There are other approaches, such as CBC-MAC (which is only secure when the message length is “fixed and agreed upon in advance”, see [KL, Exercise 4.13]). We skip the details of the constructs.

Digital Signature Schemes

With message authentication codes, both the signer and verifier need to share a secret key. In contrast, digital signatures mirror real-life signatures in that anyone who knows Alice (but not necessarily her secrets) can verify a signature generated by Alice. Moreover, digital signatures possess the property of non-repudiability, i.e., if Alice signs a message and sends it to Bob, then Bob can prove to a third party (who also knows Alice) the validity of the signature. Hence, digital signatures can be used as certificates in a public key infrastructure.

– [Ps, Section 5.3]

Ref: [KL 13.1, 13.2, 13.6]

Definition: Digital Signatures

(Gen,Sign,Ver) is a digital signature scheme over the message space {Mn}n if the following syntax, correctness, and security are satisfied:

  • Gen(1n) is a PPT which on input n outputs a public key pk and a secret key sk: pk,skGen(1n).
  • Sign is PPT algorithm which on input a secret key sk and message mMn outputs a signature σ: σSignsk(m).
  • Ver is a deterministic poly-time algorithm which on input a public key pk, a message m and a signature σ returns either “accept” or “reject”.

Correctness: For all nN, for all mMn,

Pr[pk,skGen(1n):Verpk(m,Signsk(m))=accept]=1

Security: For all NUPPT adversaries A, there exists a negligible function ϵ(n) such that nN,

Pr[pk,skGen(1n);(m,σ)ASignsk()(1n,pk) : A did not not query m, and Verpk(m,σ)=accept]ϵ(n)

Discuss:

  • pk must be sent to the verifier through an athenticated way (by definition in the above). If we have that authenticated way, why do we need DS then?
  • Would it be meaningful if A gets no oracle access to Signsk, since A is given pk as input?
  • Would it be meaningful if A gets oracle access to Signsk but only once?
  • The definition is a public-key version of MAC.
  • Since the verification uses only public key, A can perform verification without oracle queries.
  • Hence, it is clear that DS implies MAC, and then it implies OWF.
  • Can we obtain a digital signature from (public-key) encryption?

Lamport’s Signature Scheme

Refs: [KL 14.4], Lamport’79, Goldwasser@Berkeley

Definition: One-Time Digital Signatures

(Gen,Sign,Ver) is a one-time digital signature scheme the definition of DS is satisfied under the constraint that the adversary A is only allowed to query the signing oracle once (in ASignsk()).

Construct: Lamport’s Digital Signature

Let f:{0,1}{0,1} be a OWF. Given nN, (Gen,Sign,Ver) is constructed as follows for Mn:={0,1}n.

Gen(1n):

  1. Sample strings xbi{0,1}n for i[n],b{0,1}.
  2. Compute ybi=f(xbi) for all i,b
  3. Output pk:=(y_bi)_i,b and sk:=(x_bi)_i,b.

Signsk(m):

  1. Output σ:=(xmii)i[n], where mi denotes the i-th bit of m.

Verpk(σ):

  1. Output accept iff for all i[n], it holds that f(xmii)=ymii, where xmii comes from σ and ymii comes from pk.

Theorem:

If f is a one-way function, then Lamport’s Signature is a secure one-time digital signature (for n-bit messages).

As a corollary, the construction extends to messages of bits such that :=(n) is a polynomial.

Note Many other public-key cryptographic schemes (such as public-key encryption) relied on stronger assumptions (such as PKE from RSA assumption), and we do not know any construction from OWF. Digital signature is a big surprise since we get it from OWF.

Proof sketch.

We want to prove by contradiction: if there exists A that makes one-time query m and then forges the message and signature (m,σ) with mm, then we want to construct another adversary B that inverts f. The intuition is that given mm, there exists a bit mimi for some i, and then in order to pass the verification of (m,σ), A must be able to find the pre-image of the i-th entry of pk, which is inverting f.

The tricky step is that in the reduction, we need to give pk to A up front. Since we have no idea about i at that step, we are going to guess it.

More formally, assume for contradiction, there exists NUPPT adversary A and polynomial p such that for infinitely many nN,

Pr[A win]1/p(n),

where A win denotes the event that mm and Verpk(m,σ)= accept in the security game. We want to construct B that inverts f.

Let zf(x) and x{0,1}n. B is constructed as below.

Algorithm B(1n,z):

  1. Sample pk:=(ybi)i,b and sk:=(xbi)i,b as per Lamport’s key generation.
  2. Sample b{0,1},i[n] uniformly at random.
  3. Modify pk by setting ybiz.
  4. Run ASignsk()(pk): if A queries m such that mi=b, then output (“fail” symbol); otherwise, respond to A the signature as per Signsk. Let the result be (m,σ).
  5. If mimi, then output σ (as a candidate pre-image of z); output otherwise.

Notice that A can not know b nor i because all entries in pk are identically distributed. Hence, Pr[mib]=1/2, and Pr[mimi]1/n.

Tree-based Signatures

Ref: [KL 14.4.2, 14.4.3]

How to extend one-time signature to sign many messages? When signing a message, we can additionally generate the next pair of (pk1,sk1) and then sign and send the next pk1 with the current message, and so on for the next messages. The verifier needs to verify and to keep the next pk1. That is, both the signer and verifier need to keep states, or the signing / verification time is linear in the number of signed messages.

To improve it, we use tree-based approach. That is, for each pair (pk,sk), we sign two public keys pk0,pk1, and then each pkb (together with the corresponding skb) can further sign two keys pkb0,pkb1, and so on. We build a tree of 2n leaves so that we can sign up to 2n messages, and the first signature consists of n one-time signatures:

σ:=(pk,σ0,(pk0,pk1),σ1,(pk00,pk01),...,σn1,(pk0n,pk00...01),Signsk00...0(m)),

where σiSignsk0i(pk0i0,pk0i1). The second, thrid, and forth signatures and so on are moving the path from 00000 to 00001 and then to 00010 and so on. Because all the (pkx,skx) pairs are one-time, the signer keeps a state that counts the number of messages signed so far and all the (pkx,skx) pairs generated so far so that the next message is signed consistently. The verifier is stateless and keeps only pk.

The above scheme can be further improved to achieve a stateless signer by using a PRF to generate the randomess needed by each tree node. That is, let x be a string of less then n bits that indicates the tree node, and let fk be a PRF with key k sampled one-time up front, and then we can use fk(x) be the randomness needed by node x (since we need only two Gen and one Sign at each tree node). Key k is added to be part of the secret key, and signer can always generate the identical (pkx,skx) at node x.

Discuss:

  • If the number of messages is not given in advance, how to build a DS?

Notice If we plug Lamport’s signature into the above composition, then the length of the signature will grow exponentially because the public key is much longer than the message (and we sign the pk’s). An easier “fix” is to shrink the message to a shorter “hash” and then sign the hash. That is, the signer S and verifier V agree on a keyed hash function hk, S computes the (shorter) hash hk(m) from message m so that S can sign the hash and V can verify. Of course, we need a secure hash function so that any adversary given k can not forge another mm but hk(m)=hk(m). This property is called targeted collision resistant hash functions or universal one-way hash functions (UOWHF). Such hash functions can be constructed from one-way functions. We omit the proof here and give CRHF below since UOWHF is implied by CRHF. Historically, Naor and Yung formalized UOWHF and constructed it from one-way permutations, and then Rompel constructed UOWHF from any OWF. [Goldreich, FoC, Vol 2, Section 6.4.3] shows the result of Naor and Yung, and Rompel’s result is later proved formally by Katz and Koo. Similar to PRG, it is still active research, e.g., Mao, Nazor, and Zhang improved the construction to be non-adaptive (ie parallel) calls to OWF.

Collision-Resistant Hash Functions

Ref: [Ps 5.5] [KL 9.3]

Definition: Collision-Resistant Hash Functions

A set of functions H={hi:DiRi}iI and algorithm Gen is a family of collision-resistant hash functions (CRHF) if:

  • (ease of sampling) iGen(1n) runs in PPT, iI.
  • (compression) |Ri|<|Di|.
  • (ease of evaluation) Given iI and xRi, the computation of hi(x) can be done in PPT.
  • (collision resistance) for all NUPPT A, there exists a negligible ϵ such that nN,

    Pr[iGen(1n);(x,x)A(1n,i) : hi(x)=hi(x)xx]ϵ(n).

Note Key k is necessary to defend against non-uniform adversaries, which is unlike practical unkeyed hash functions such as SHA.

Definition: Universal One-Way Hash Functions

A set of functions H={hk:{0,1}d(n){0,1}r(n),n=|k|}k{0,1} is a family of universal one-way hash functions (UOWHF) if:

  • (compression) r(n)<d(n).
  • (ease of evaluation) Given k{0,1}n and x{0,1}r(n), the computation of hk(x) can be done in PPT in n.
  • (collision resistance) for all NUPPT A, there exists a negligible ϵ such that nN, A wins the following game w.p. ϵ(n):

    1. (x,state)A(1n)
    2. k{0,1}n
    3. xA(1n,k,state)
    4. A wins if xx and fk(x)=fk(x)

A CRHF is a UOWHF and also a OWF, and we can construct UOWHF from OWF. However, it is long open whether we can get CRHF from OWF. Instead, CRHF is contructed from various concrete assumptions, and CRHF is also constructed from “trapdoor permutations”, which is yet another primitive that we do not know how to obtain from OWF. We will use discrete logarithm assumption below.

Definition: Generator of a Group

A element g of a multiplicative group G is a generator if the set {g,g2,g3,}=G. We denote the set of all generators of a group G by Gen(G).

Assumption: Discrete Log

If Gq is a group of prime order q, then for every adversary A, there exists a negligible function ϵ such that

Pr[qΠn;gGen(Gq);xZq:A(q,g,gx)=x]<ϵ(n).

In the above definition, the group Gq is not instantiated. Hence the hardness of the DL problem depends on Gq, and indeed for some groups such as (Zq,+) it is trivial. In crypto, it is believed that DL is hard on the subgroup Gq of Zp for prime p. Notice that the order of Zp is even p1, not prime order (and thus DL may be easy). Therefore, we sample a prime of the form p=2q+1: first sample prime q and then let p=2q+1 if 2q+1 is also prime, otherwise repeatedly sample another q. Such primes are known as Sophie Germain primes or safe primes.

Unfortunately, even though this procedure always terminates in practice, its basic theoretical properties are unknown. It is unknown even (a) whether there are an infinite number of Sophie Germain primes, (b) and even so, whether this simple procedure for finding them continues to quickly succeed as the size of q increases. – [Ps, Section 2.8.1]

Also notice that DL is a one-to-one mapping, and the domain and range are the same in cardinality. However, the representations of elements (in bits) can be quite different, and thus it is not exactly a one-way permutation (strictly speaking).

Sampling prime-order group and the generator

Theorem:

Let p=2q+1 with p,q prime. Then

G:={h2modp:hZp}

is a subgroup of Zp of order q.

It is direct that G is a subgroup. It remains to prove the order, and it suffices to show that f(h)=h2modp is 2-to-one (because |G|=|f(Zp)|=(p1)/2=q). Let g be a generator of Zp (proof omitted: the existence of generator of Zp) Let hZp such that h=gt for some tp1. Consider any h2 such that h22=h2modp where h2=gt2. We have g2t2=g2t, and 2(t2t)=0mod(p1). Thus by p1=2q, either t2=t or t2t=0modq.

This proof extends to hr for all even integer r2. Moreover, it also shows that sampling from G uniformly at random is efficient: just sample h and compute h2modp.

Construction: CRHF from DL

Gen(1n): output (p,q,g,y), where

  • p,q are n-bit primes that p=2q+1,
  • g is the generator of Gq,
  • yGq, and
  • Gq is order q subgroup of Zp.

hp,q,g,y(x,b): Output ybgxmodp, where

  • x is n-bit, b is 1-bit

Theorem:

Under the Discrete Logarithm assumption, the above construction is a collision-resistant hash function that compresses by 1 bit.

Proof sketch:

It is efficient, and the compression follows by n+1 to n bits mapping (strictly speaking, that is 2p to p as x[p]). To prove collision resistance, firstly consider the case h(x,b)=h(x,b) but xx. Then, we have

gx=gxmodp

which means that x=x. Hence, it must be xx,bb but h(x,b)=h(x,b). Given bb, suppose b=1,b=0 (this is without loss of generality as b=0 otherwise). Then,

y1gx=y0gxmodp

which implies that

y=gxxmodp.

Thus, we can compute the DL of y using any adversary A that on input y outputs a collision (x,b)(x,b).

Note: the above construction is based on the DL assumption w.r.t. the prime order group Gq. We can analogously construct CRHF from group Zp based on the DL assumption w.r.t. Zp.