One-Way Functions

Motivation: Given the impossibility of efficient but perfectly secure encryption, we want to relax our security definition. Intuitively, we want to

  1. encrypt long plaintext many times using a single short key, and
  2. perform “fast” encryption and decryption given the legitimate key, and
  3. ensure that the recovering of plaintext is “hard” without the key.

One-time pads achieves 2 and 3 but not 1. However, OTP achieves 3 only bcs the key space is as large as message. Since we want 1, the key space is relatively small. Very vague, we want a encryption that is easy to compute (to achieve 1) but hard to find any k,m that Enck(m)=c. (This is vague bcs we can still try any k to decrypt c.)

This suggests that we require functions that are

“easy” to compute but “hard” to invert.

That is called “one-wayness” in this lecture. We next define easy and hard computation in terms of efficiency and probability.

Efficient Computation and Efficient Adversaries

We want to provide “efficient computation” to the honest users.

Definition: Deterministic Algorithm

An algo A is a deterministic Turing Machine with input and output as bit strings, {0,1}.

Note: TM is by definition uniform, that is, its description size is constant for all unbounded long inputs.

Definition: Polynomial Running-Time

A runs in time T(n) if x{0,1}, A(x) halts within T(|x|) steps. A runs in polynomial time if there exists a constant c such that A runs in time T(n)=nc.

Definition: Efficient Computation

We say an algo is efficient if it runs in polynomial time.

Justification of poly time:

  • Indep of computation models (TM, circuit, …)
  • Closed under compositions of algos
  • From human experience: poly-time become cubic, but super-poly remain hard

As discussed in perfect secrecy, we need randomized algo to construct encryption (and more generally crypto).

Definition: Randomized (or Probabilistic) Algorithm

A randomized algorithm A, also called a probabilistic polynomial-time Turing Machine and abbreviated as PPT, is a deterministic algorithm equipped with an extra random tape. Each bit of the random tape is uniformly and independently chosen.

We denote the computation by yA(x;r) but sometimes omit r. Note that the running-time is bounded by a poly for all (x,r).

As an example, we define efficient and correct encryption.

Definition: Efficient Private-key Encryption.

(Gen,Enc,Dec) is called an efficient private-key encryption scheme if:

  1. kGen(1n) is PPT s.t. for every nN, it samples k.
  2. cEnck(m) is PPT s.t. k,m{0,1}n, outputs c.
  3. mDeck(c) is PPT s.t. c,k, outputs m{0,1}n.
  4. Correctness: nN, m{0,1}n,
Pr[kGen(1n):Deck(Enck(m))=m]]=1.

Note: the notation 1n means the string of n copies 1’s, it is called the security parameter. The purpose is to instantiate the “security” of the scheme, e.g., n-bit key. We write unary 1n (but not n as binary) because “poly-time” is defined by the input length.

Adversaries: Non-Uniform

Next we want to model adversaries with a stronger capability than honest.

Definition: Non-Uniform PPT

A non-uniform PPT machine (abbreviated nuPPT) A is a sequence of algos A={A1,A2,} s.t.:

  • Ai computes on inputs of length i, and
  • exists a polynomial d s.t. the description size |Ai|<d(i) and the time Ai is also less than d(i). We write A(x) to denote the computation A|x|(x).

Alternatively, an nuPPT algo can be defined as a uniform PPT A that takes an additional advice string of poly length d(i) for each input length i.

Purpose: non-uniform gives adv extra power and models many real scenario, e.g., Eve may have a list of known (plain, cipher) pairs.

Discuss: the advice may not be computable in poly time (if poly-time, then no need to take advice as input). NU is closed under reduction.

Definition of One-Way Functions

We try to define OWF using efficient computation and efficient adversary.

The first may come from NP-hardness.

Attempt: (Worst-Case)

A function f:{0,1}{0,1} is one-way if both of the following hold:

  1. Easy to compute. There is a PPT C that computes f(x) on all inputs x{0,1}.
  2. Hard to Invert. No nuPPT adversary A, for all nN and x{0,1}n,

    Pr[A(1n,f(x))f1(f(x))]=1.

This is implied by NPBPP, which is a long-open complexity problem. Majority of x are easy (even we set prob 11/poly), open if it is useful for encryption.

Note: nuPPT can store any poly, so there may be more than poly are hard.

Attempt:

A function f:{0,1}{0,1} is one-way if both of the following hold:

  1. Easy to compute. There is a PPT C that computes f(x) on all inputs x{0,1}.
  2. Hard to Invert. For all nuPPT adversary A, for all nN and x{0,1}n,

    Pr[A(1n,f(x))f1(f(x))]2n.

Impossible: too strong, A is NU and could have many (x,f(x)) pairs. Note: A takes 1n as the security parameter in case |f(x)|n.

Randomize x:

Attempt:

2. Hard to Invert. For any nuPPT adversary A, for all nN,

Pr[x{0,1}n;yf(x):f(A(1n,y))=y]2n.

Still too strong: A can take poly time to slash some of the possible x.

We may relax by poly(n)/2n or 20.1n on the RHS, but they still too strong to find a candidate. We formalize “very small” as follows.

Definition: Negligible Function

Func ϵ(n) is negligible if for every c, there exists some n0 s.t. n>n0,ϵ(n)1/nc.

Note: ϵ is smaller than any inverse poly for sufficiently large n.

Note: when the probability is 1ϵ(n), we often call it “overwhelming”.

Definition: (Strong) One-Way Function

A function f:{0,1}{0,1} is one-way if both of the following hold:

  1. Easy to compute. There is a PPT C that computes f(x) on all inputs x{0,1}.
  2. Hard to Invert. For any nuPPT adversary A, there exists a negligible function ϵ that for any nN,

    Pr[x{0,1}n;yf(x):f(A(1n,y))=y]ϵ(n).

Note: each A has a different ϵ. The definition is asymptotic.

The above definition is standard in literature, but it is still hard to construct: any adversary can only invert a tiny fraction. Many natural candidates, such as factoring, does not meet this. We relax it:

Definition: Weak One-Way Function

A function f:{0,1}{0,1} is weak one-way if (… same as strong OWF.)

2. Hard to Invert. There exists a polynomial q:NN such that for any nuPPT adversary A, for sufficiently large nN,

Pr[x{0,1}n;yf(x):f(A(1n,y))=y]11/q(n).

Note: here the prob. 11/q is the same for all adv A, but in the strong OWF, the prob. ϵ is different and depends on A.

Primes and Factoring

Define fmul:N2N by

fmul(x,y)={1if x=1 or y=1(eliminating trivial invert)xyo.w.

Easy to compute. For many (x,y) are “easy” to invert: w.p. at least 3/4 when xy even. It is not strong OWF.

Assumption: Factoring

For any adv A, there exists a negligible function ϵ s.t.

Pr[(p,q)Πn2;Npq:A(N){p,q}]<ϵ(n),

where Πn:={p<2n:p prime} is the set of primes less than 2n.

The Prime Number Theorem

Define π(x) as the number of primes x. PNT states that primes are dense.

Theorem: (Chebychev, 1848)

For all x>1, π(x)>x2logx.

Note: the above is easier to prove, but the famous Prime Number Theorem is π(x)x/lnx when x. The above log is base 2.

Theorem:

If the factoring assumption is true, then fmul is a weak OWF.

fmul is easy to compute. Hard to invert?

Assume for contradiction (AC), for all poly q, exists nuPPT A, s.t. for infinitely many nN,

Pr[(x,y){0,1}n;z=xy:fmul(A(12n,z))=z]>11q(n).

Note: the negation of weak OWF.

Then, we construct an adversary B breaks factoring.

Algorithm B(12n,z):

  1. Sample (x,y){0,1}n
  2. If both x,y prime, let z¯z; otherwise, let z¯fmul(x,y).
  3. Run (x¯,y¯)A(12n,z¯)
  4. Output (x¯,y¯) if both x,y are prime and z=x¯y¯.

We intentionally make the input to A uniform in {0,1}2n.

By Chebychev, both x,y prime w.p. >1/(2log2n)2=1/(4n2). Hence, B fails to pass z to A w.p. at most 11/(4n2).

By eq (AC), A fails to invert z¯ w.p. at most 1/q(n). Choose q(n):=8n2 and A correspondingly.

By union bound, the failure probability of B is at most

Pr[zz¯A fails]Pr[zz¯]+Pr[A fails]11/(4n2)+1/8n2=11/8n2,

and thus B breaks factoring w.p. at least 1/8n2, greater than negl, contradicting Factoring Assumption.

Note: the above reduction assumes efficient primality testing. That is not necessary, left as exercise.

Note: the pattern is common in crypto. Reduction from Assumption (factoring) to Construction (OWF) is bread and butter in this course.

From Weak OWF to Strong OWF

The existence of OWF is long-open. We will show that strong and weak OWFs are existentially equivalent. Clearly, any strong OWF satisfies weak. The challenge is from weak to strong.

We begin with a warmup.

Claim:

If {fn:{0,1}n{0,1}l}nN is a strong OWF, then g(x1,x2):=(f(x1),f(x2)) is also a strong OWF.

Assume for contradiction (AC), there exist poly p(n) and a nuPPT adv A such that for infinitely many nN,

Pr[x1,x2{0,1}n;y=g(x1,x2):g(A(12n,y))=y]>1/p(n).

We construct nuPPT B that inverts f.

Algorithm B(1n,y):

  1. x2{0,1}n and y2=f(x2).
  2. Run x1,x2A(12n,(y,y2)).
  3. Output x1 if f(x1)=y.

For uniform z{0,1}n, the above (y,y2) is the same distribution as obtaining the output g(x1,x2) by sampling (x1,x2){0,1}2n.

Also, when A inverts (y,y2), we have that B inverts z successfully. By (AC), A inverts w.p. >1/p, greater than any negligible function, and it contradicts that f is a strong OWF.

Note: this is a typical template to prove security by reduction. The quantifiers of (AC) is often the same (to negate negligible).

Observation: the definition of weak states that exist poly q(n) for all nuPPT; that is, even weak, there is a good fraction, 1/q, of instances that are hard for all.

Idea: we repeat the weak f for poly many instances and ask the Adv to invert all, so that Adv fails with high prob.

Theorem:

For any weak OWF fn:{0,1}n{0,1}l, there exists a poly m(n) such that

g(x1,...,xm):=(f(x1),...,f(xm))

from g:{0,1}mn{0,1}ml is a strong OWF.

Note: by def, there exists a poly q(n) s.t. no adv can invert f w.p. 11/q(n).

Proof:

Assume for contradiction (AC), there exists a nuPPT adv A and poly p(n) s.t. for inf many nN, A inverts g w.p. 1/p, i.e.,

Pr[{xi{0,1}n,yif(xi)}i[m],y(y1...ym):g(A(1n,y))=y]1/p(n).

We want to construct a nuPPT B to invert y=f(x) for uniform x{0,1}n by running A. So, the idea is to transform y into an output of g, that is (y1,,ym). How? (y,y,,y)? (y,y2,,ym)?

We construct B0 as below to run A.

Algorithm B0(1n,y):

  1. j[m]
  2. x1,,xm{0,1}n
  3. let yif(xi) for all i
  4. let yjy
  5. run x1,..,xmA(1mn,(y1,,ym))
  6. if f(xj)=y, output xj, otherwise output .

Note: B0 inverts y w.p. roughly 1/p by (AC), but our goal is to invert w.p. 11/q1/p. Hence, repeating B0(y) is necessary.

Algorithm B(1n,y):

  1. repeatedly run B0(y) poly r1(n) many times using fresh randomness
  2. output the first non output of B0.

Note: we use the same y as input, and that makes the probability analysis involved since the repetition is dependent.

By (AC), inverting g is easy, intuitively there are many (x,y=f(x)) that can be inverted by B0(y). Clearly that holds for A, but as mentioned, we need to prove it for B0.

The following claim is the key.

Claim: (many easy instances)

Suppose that (AC) holds. There exists a “large” set Gn of “easy” instances,

Gn:={x{0,1}n:Prx,y=f(x)[f(B0(y))=y]1/r2(n)}

for some poly r2(n), and |G|(11/2q)2n.

If the claim holds, then B can invert by repeating B0:

Prx,y[B not inv]=Pr[B not invxG]+Pr[B not invxG]Pr[B not inv|xG]+Pr[xG](11/r2)r1+1/2qen+1/2q1/q

We will choose r1(n):=nr2(n) to get (11/r2)r1en.

Then, B inverts w.p. >11/q, and it is contradicting that f is weak OWF. It remains to prove the claim.

Proof of Claim:

Intuition: Gn is B0, but essentially B0 is running A. If G0 is small, then A should not invert w.p. 1/p, and thus contra (AC).

Assume for contra (AC2), |G0|<(11/2q)2n. We have

Pr[A inv]=Pr[A invall xiGn]+Pr[A invsome xiGn]

Since the “easy” set Gn is small, it is unlikely all xi are easy. Formally, by (AC2)

Pr[A invall xiG]Pr[all xiG](11/2q)men.

Note: this is where the repetition m kicks in (in the construction of g), and we choose m(n):=n2q(n) to get the ineq.

Also, by union bound,

Pr[A invsome xiG]iPr[A invxiG]iPr[A inv|xiG]

Observe that Pr[A inv|xiG] is very close to Pr[B0(y) inv|xG] as that of the claim. The only difference is that B0 plant y in random position. Indeed, for any fixed i[m],

Pr[x1,...,xm{0,1}n,y1f(x1),...,ymf(xm):A(y1,...,ym) inv|xiG]=Pr[j[m] in B:B0(y) inv|xGj=i]

and thus

Pr[B0 inv|xG]=iPr[B0 invj=i|xG]=iPr[B0 invj=ixG]/Pr[xG]=iPr[B0 inv|j=ixG](Pr[j=ixG]/Pr[xG])=iPr[A inv|xiG]Pr[j=i]=(1/m)iPr[A inv|xiG]

We thus conclude

Pr[A invsome xiG]mPr[B0 inv|xG]<m1/r2.

and thus

Pr[A inv]<en+m1/r2.

We choose r2(n)=2mp(n) so that Pr[A inv]<1/p, contradicting (AC).

Primality Testing

Definition: Group

A group G is a set of elements with a binary operator that satisfies the following properties:

  1. Closuer: a,bG,abG
  2. Identity: 1G s.t. aG,1a=a1=a.
  3. Associativity: a,b,cG,(ab)c=a(bc).
  4. Inverse: aG,bG s.t. ab=ba=1.

Definition: Euler’s Totient Function

Let Zn:={aN:a<n,gcd(a,n)=1} be the multiplicative group modulo n. Let ϕ(n):=|Zn| be the Euler’s totient.

Note: ϕ(n)=p1k11(p11)p2k21(p21) for n=p1k1p2k2 where pi are distinct primes.

Theorem: Chinese Remainder Theorem (CRT), or Extended Euclidean Algo

For any n1,n2N s.t. gcd(n1,n2)=1,

Zn1n2Zn1×Zn2,

and we have poly time algos to transform from one representation to the other.

Theorem: (Euler)

nN,aZn,aϕ(n)=1modn

Let aZn, and let S:={ax:xZn}. We have S=Zn (otherwise, we have x1x2 but ax1=ax2, a contradiction given a1 exists). Then, by commutative for the first equality,

xZnx=bSb=xZnax=aϕ(n)xZnx.

That implies aϕ(n)=1.

Corollary: (Fermat’s Little Theorem)

For all prime p,

aZp,ap1=1modp

Definition:

For any composite nN, we say that aZn is a witness if an11modn.

Lemma: strict subgroup is small

Let G be a finite group. If HG is a strict subgroup of G, then |H||G|/2.

Let bG be an element s.t. bH. Consider elements in the set B:={ab:aH}. If there exists abH, then we have a1ab=bH, contradiction. Hence, BH=, and it remains to show that |B|=|H|. Suppose for contradiction that |B|<|H|, then there exist a1a2H s.t. a1b=a2b, a contradiction since we can multiply b1 on both sides.

Lemma:

For all nN, if there exists a witness, then there are at least ϕ(n)/2 witnesses.

Let HZn be the subset of none witnesses. We have 1H, and H is a subgroup modular n. Given that there exists a witness, H is a strict subgroup. We can then show that any strict subgroup is at most half size of the supergroup, ie, |H|ϕ(n)/2.

Definition: Strong witness

For any composite nN, write n1=2rd for some integer rN and odd d. We say that aZn is a strong witness if

ad±1modn , and  a2id1modn for all i=1,2,...,r1

Lemma: (warm up)

If a is a witness, then a is also a strong witness.

Assume for contradiction that a is not a strong witness. Then the sequence ad,a2d,,a2rd is either

  • (±1,1,1,,1), or
  • (,,,1,1,1,,1).

Hence, a is not a witness, a contradiction.

Lemma: (Miller-Rabin, every prime has no strong witness)

If n prime, then there is no strong witness in Zn.

If n prime, then the only solution to x2=1modn is ±1 (need proof). By Fermat’s Little Theorem, for any aZn, a2rd=1modn, and a2r1d=±1modn. If 1, then it is not a strong witness. Otherwise, 1, we can continue the next square root r2, and so on, until ad, which must be ±1.

It remains to show that every composite has many strong witnesses. The first step is to exclude perfect powers. The second step is to show that other composites have many strong witnesses.

Lemma:(Miller-Rabin, every composite has many strong witnesses)

If n is composite such that n=n1n2 for some coprime n1,n2, then there are at least half strong witness in Zn.

Let H:={aZn:a is a not a strong witness}. We will show that there exists H¯H s.t. H¯ is a strict subgroup of Zn, which is sufficient (as |H||H¯||Zn|/2).

Let 2rd=n1. For each aH, consider the sequence ad,a2d,,a2rd. Let j be the largest index such that

aH,a2jd=1modn

(so that for all aH, a2j+1d=1modn).

Such j<r exists because (1)d=1modn since d odd. Now, define

J:=2jd, and H¯:={a:aJ=±1modn}.

Clearly, HH¯. Also, H¯ is a subgroup (need proof). It remains to show that H¯ is strict. Let aH¯ be an element s.t. aJ=1modn (where a exists by def of H). Let a1Zn1 be the element s.t.

a=a1modn1.

Then, we have

a1J=aJ=1modn1,

where the second equality holds by n=n1n2 and gcd(n1,n2)=1. Let b1=a1modn1 and b2=1modn2, and let

b=b1modn1=b2modn2

by CRT. Then, we have

bJ=b1J=1modn1=b2J=1modn2.

Because 1=1modn1=1modn2 and 1=1modn1=1modn2 are unique, bJ±1modn, which implies bH¯.

Algorithm: Miller-Rabin Primality Testing

Input: n

  1. Output ‘No’ if n even.
  2. Output ‘No’ if n=xy is a perfect power for some x,yN.
  3. Write n1 as 2rd.
  4. Repeat λ times:
    • Sample uniformly aZn (by computing gcd(a,n)).
    • If a is a strong witness, output ‘No’.
  5. Output ‘Yes’.

Theorem: For any prime n, this algo outputs ‘Yes’ w.p. 1. For any composite n, this algo outputs ‘Yes’ w.p. 2λ.

See also: Solovay-Strassen Primality Test

Article on Wikipedia

For odd n and integer a, let J(a,n) be the Jacobi symbol. The Solovay-Strassen primality test checks the given input n by

  1. Sample random a
  2. Compute J(a,n)
  3. Output “NOT PRIME” if a(n1)/2J(a,n)modn.

Similar to Miller-Rabin, any prime n always passes the test: J(a,n) is exactly the Legengre symbol, and the test is Euler’s criteria. For composit n, there are witnesses and liars such that a is a witness iff a yields NOT PRIME. It can be shown that 1) the existence of witness and 2) the liars are a subgroup. That implies that there are at least half witnesses.

Ref: Martin Dietzfelbinger, Primality Testing in Polynomial Time

A Universal OWF

The idea is to construct a function that computes all easy-to-compute functions.

Function: funiv

Input: y, let n:=|y|.

  1. Interpret y as a pair (M,x) of Turing machine and bitstring, where |M|=logn
  2. Run M on x for T=n2 steps
  3. If M halts in T steps, output (M,M(x)); otherwise, output .

Theorem: A Universal Weak OWF

If there exists a OWF, then the above function funiv is a weak OWF.

To prove it, we will use the following lemma.

Lemma: Strong OWF in time O(n2)

If there exists a strongly one-way function g , then there exists a strongly one-way function g that is computable in time O(n2).

Intuition: If there exists a strong OWF g in time O(n2), then there exists (at least) a TM Mg that computes g in O(n2) steps such that the description length |Mg|=d is a constant. WLOG, assume the description of any TM can be padded with a special symbol to arbitrary long. For any s|Mg|, let Mg,s be the description of Mg padded to s bits. Then, for all sufficiently large n, the logn-bit random prefix of y is exactly Mg,logn w.p. 1/n. Hence, funiv(y) is hard to invert.

Formally, assume for contra (AC), for all poly q(n), there exists NUPPT A s.t. for infinitely many nN,

Pr[y{0,1}n,zfuniv(y):funiv(A(1n,z))=z]>11/q.

We construct NUPPT B that inverts zg(x) for x{0,1}nlogn by

  1. Run yA(1n,(Mg,z)).
  2. Interpret y as (M,x).
  3. If M=Mg and z=g(x), output x; otherwise, output .

Notice We uses Mg in B because B is asked to invert g, which means that we know g in this proof. Alternatively, we prove it without AC in lecture, and there is only probability analysis which does not depend on g.

By (AC), we have

1/qPr[A not inv]Pr[A not inv|y=(Mg,)]Pr[y=(Mg,)]=Pr[A not inv|y=(Mg,)](1/n).

Notice that

Pr[A not inv|y=(Mg,)]=Pr[x{0,1}nlogn,zg(x):B not inv].

Hence, we have

Pr[B not inv]n/q(n).

Choosing q(n)=n2 and A correspondingly, we have B inverts w.p. at least 11/n, that is greater than 1/p(m) for some polynomial p and the input size m:=nlogn of g, contradicts g is a strong OWF.

Proof of Lemma (Strong OWF in time O(n2))

Suppose we can compute g in time nc for some const c>2. Then, for any input x{0,1}nc, interpret x=x1|x2 s.t. |x1|=ncn, and then define g(x)=g(x1x2):=x1g(x2). Let m=nc be the input size of g It is easy to see that g is computable in O(m2) time, and it follows by standard reduction that g is hard to invert if g is a OWF.

Note: The above construction is impractical due to inefficiency. Suppose there exists a OWF that is easy to compute by a TM of 1000 bits. The above needs a “sufficiently long” input so that logn1000 to be a weak OWF, which means |x|21000.

Collection of OWFs

To construct OWFs efficiently, many mathematical / computational assumptions are considered. The intuition is to consider efficiently sampleable distributions instead of uniformly random strings. The typical syntax is PPT algos (Gen,Samp):

  • ppGen(1n)
  • xSamp(1n,pp)
  • fpp:{0,1}n{0,1}

And then, for all NUPPT adversary A, there exists a negligible function ϵ such that

Pr[ppGen(1n),xSamp(1n,pp),yfpp(x):fpp(A(1n,pp,y))=1]ϵ(n).

For example, given the factoring assumption, we can construct a collection of OWF by

  • let Gen output n directly,
  • let Samp output two n-bit primes uniformly at random (using primality testing), and
  • let fpp be the n-bit multiplication.

Other collections (such as RSA, discrete logarithm, or Rabin) are more involved in their constructions, and they provide additional “properties” on top of OWF.