\newcommand{\RF}{\mathsf{RF}} \newcommand{\PRF}{\mathsf{PRF}} \newcommand{\Enc}{\mathsf{Enc}} \newcommand{\Dec}{\mathsf{Dec}} \newcommand{\Gen}{\mathsf{Gen}} \newcommand{\Expr}{\mathsf{Expr}} \newcommand{\state}{\mathsf{state}}

Indistinguishability and Pseudo-Randomness

Recall the perfectly secure encryption scheme, OTP: we bitwise-XOR our message with a uniformly random string.

m \oplus k, ~ |m| = |k|.

OTP is inefficient because the long random string must be shared between Alice and Bob in advance. Suppose that we have a (mathematical, deterministic) function that extends a short, truly random string into a long "random-looking" string. We could use the seemingly random string to encrypt messages as in OTP, yet efficiently.

Is that possible? How to formally define “random-looking”?

Let g be the above function with short input x and long output g(x). We want Alice and Bob to share the same g(x) in order to decrypt correctly, so g must be deterministic. Mathematically, we have definitions for the distance between two probability distributions. However, for any |g(x)| \gt |x|, the input and output distributions are statistically far apart. The point is that the output is "random-looking" at best.

m \oplus g(s), ~ |m| = |g(s)|, \text{ but } |s| \ll |m| = |g(s)|.

We will introduce computational indistinguishability, and then define pseudo-random generators (PRGs) and pseudo-random functions (PRFs).

Computational Indistinguishability

Key Idea: If we have no way to show the difference, then we are satisfied. We call it indistinguishability.

Example: the Turing test. When a machine and a human are indistinguishable under any human's prompts, we call the machine an AI.

Observation: the two are not the same, not even close in any sense; however, the distinguisher (another human) can not tell the difference due to its limited power.

Concept: we say a distribution is pseudorandom if no efficient algorithm can distinguish it from a (truly) uniform distribution.

We will formalize the concept asymptotically.

Definition: Ensembles of Probability Distributions

A sequence \set{X_n}_{n\in\N} is called an ensemble if for each n \in \N, X_n is a probability distribution over \bits. (We often write \cX = \set{X_n}_n when the context is clear.)

E.g., if X_n is a distribution over n-bit strings for each n\in\N, then \set{X_n}_{n\in\N} is an ensemble.

Definition: Computational Indistinguishability

Let \cX = \set{X_n}_n and \cY = \set{Y_n}_n be ensembles where X_n, Y_n are distributions over \bit^{\ell(n)} for some polynomial \ell(\cdot). We say that \cX and \cY are computationally indistinguishable (denoted by \cX \approx \cY) if for all NUPPT D (called the "distinguisher"), there exists a negligible function \eps such that \forall n \in \N,

\Big| \Pr[t \gets X_n, D(t) = 1] - \Pr[t \gets Y_n, D(t) = 1] \Big| \lt \eps(n).

Note:

  • "=1" is a convention in the literature
  • the absolute value is not necessary due to "for all D" (a distinguisher can be negated)

This definition requires the two distributions to pass all efficient statistical tests, which include the following.

  • Roughly as many 0 as 1.
  • Roughly as many 01 as 10.
  • Each sequence of bits occurs with roughly the same probability.
  • Given any prefix, guessing the next bit correctly happens with roughly the same probability.
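The tests above can be sketched as code. Below is a minimal illustration of a single such statistical test (the monobit count); the function name and the 0.1 slack are arbitrary choices for this sketch, not part of the formal definition:

```python
def monobit_test(t: str, slack: float = 0.1) -> int:
    """A single efficient 'statistical test' distinguisher: output 1 iff
    the fraction of 1-bits is within `slack` of 1/2."""
    frac = t.count("1") / len(t)
    return 1 if abs(frac - 0.5) <= slack else 0

# A balanced string passes this one test; a heavily biased one fails it.
assert monobit_test("01" * 500) == 1
assert monobit_test("1" * 900 + "0" * 100) == 0

# Passing one test is necessary but not sufficient for pseudorandomness:
# "0101...01" passes the monobit test yet fails the "as many 01 as 10" test.
```

Computational indistinguishability demands that the two distributions agree (up to negligible error) on every such efficiently computable test simultaneously.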

Lemma: Closure Under Efficient Operations

If the pair of ensembles \set{X_n}_n \approx \set{Y_n}_n, then for any NUPPT M, \set{M(X_n)}_n \approx \set{M(Y_n)}_n.

(By standard reduction)

Examples:

  1. M ignores its input. Clearly, M(X_n) \equiv M(Y_n) for all n.
  2. M is the identity, i.e., its output is exactly the input. \set{M(X_n) = X_n}_n \approx \set{M(Y_n)=Y_n}_n.
  3. M outputs the first half of the input, i.e., M(x) := x[1, ..., |x|/2].

Lemma: Hybrid Lemma

Let X^{(1)}, X^{(2)}, ..., X^{(m)} be a sequence of probability distributions. Assume that the machine D distinguishes X^{(1)} and X^{(m)} with probability p. Then there exists some i \in \set{1, ..., m - 1} s.t. D distinguishes X^{(i)} and X^{(i+1)} with probability at least p/(m-1).

By the triangle inequality. Namely, fixing any D, for any i,j \in [m], define

d(i,j) := \card{ \Pr_{t\gets X^{(i)}}[D(t)=1] - \Pr_{t\gets X^{(j)}}[D(t)=1] }.

Then, we have

\begin{align*} d(1,m) \le d(1,2)+d(2,3)+ ... +d(m-1, m). \end{align*}

Hence, given that d(1,m) = p, we have that d(i,i+1) \ge p/(m-1) for some i\in[m-1].
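The argument is just the triangle inequality plus pigeonhole, which a small numeric check can illustrate; the acceptance probabilities below are hypothetical:

```python
# Hypothetical Pr[D(t)=1] for a fixed D on hybrids X^(1), ..., X^(m).
q = [0.10, 0.12, 0.30, 0.31, 0.50]
m = len(q)
p = abs(q[-1] - q[0])                        # d(1, m) = 0.40
gaps = [abs(q[i + 1] - q[i]) for i in range(m - 1)]

# Pigeonhole: some adjacent pair differs by at least p / (m - 1).
assert max(gaps) >= p / (m - 1)
print(max(gaps), p / (m - 1))
```

Whatever numbers replace `q`, the final assertion holds: the total gap d(1,m) is split over m-1 adjacent pairs, so one pair must carry at least its average share.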

Notice that this lemma applies to distributions, not ensembles. Fortunately, it implies the following.

Corollary:

For any ensembles \cX, \cY, \cZ, if \cX \approx \cY and \cY \approx \cZ, then it follows that \cX \approx \cZ.

(left as exercise)

Discuss: If the number of hybrid distributions (between two ensembles) depends on the size n (of the distributions in the ensembles), the above corollary is tricky. Consider two ensembles \cX = \set{X_n}_n, \cY=\set{Y_n}_n, suppose that the machine D distinguishes \cX, \cY w.p. p(n) (which depends on n), and then suppose that the sequence (X_n = X_n^{(1)}, ... , X_n^{(m(n))} = Y_n) consists of m(n) distributions, where m depends on n. Then we can not define m(n) ensembles between \cX and \cY due to this dependence (i.e., the length of the sequence depends on n). This is indeed the case when we have many hybrids, e.g., going from an (n+1)-bit PRG to a 2n-bit PRG. There are two ways to treat this case, the formal one and the more popular one. In the formal way, we assume for contradiction that there exist D and p(n) s.t. for infinitely many n\in\N, D distinguishes (X_n, Y_n) w.p. at least p(n); we then construct a reduction B that guesses an index j \in [m(n)-1], hoping that j = i, where i is the index given by the Hybrid Lemma, so that B runs D to distinguish and solve the challenge specified by the j-th hybrid. The popular way is less rigorous but more intuitive: we just claim that the two distributions X_n^{(j)}, X_n^{(j+1)} are "indistinguishable" for each j, n, and thus X_n, Y_n are "indistinguishable"; this is informal because fixing any j means that n is also fixed, and then X_n, Y_n are two distributions (not ensembles), while indistinguishability is defined asymptotically on ensembles.

Example: Let \cX, \cY, \cZ, \cZ' be ensembles. Suppose that (\cX, \cZ) \approx (\cX, \cZ') and (\cY, \cZ) \approx (\cY, \cZ'). Does (\cX, \cY, \cZ) \approx (\cX, \cY, \cZ')?

Lemma: Prediction Lemma

Let \set{X^0_n}_n and \set{X^1_n}_n be two ensembles where X^0_n, X^1_n are probability distributions over \bit^{\ell(n)} for some polynomial \ell(\cdot), and let D be a NUPPT machine that distinguishes between \set{X^0_n}_n and \set{X^1_n}_n with probability \ge p(\cdot) for infinitely many n \in \N. Then there exists a NUPPT A such that

\Pr[b \gets \bit, t \gets X^b_n : A(t) = b] \ge \frac{1}{2} + \frac{p(n)}{2}

for infinitely many n \in \N.

Remove the absolute value in the definition of comp. ind. by negating the distinguisher D, and then apply standard probability manipulation.

Note: the converse is easier to prove. Hence, prediction and distinguishing are essentially equivalent.

Pseudo-Random Generator

Definition: Pseudo-random Ensembles.

The probability ensemble \set{X_n}_n, where X_n is a probability distribution over \bit^{l(n)} for some polynomial l(\cdot), is said to be pseudorandom if \set{X_n}_n \approx \set{U_{l(n)}}_n, where U_m is the uniform distribution over \bit^m.

Note:

  • this definition says that a pseudorandom distribution must pass all efficiently computable tests that the uniform distribution would have passed.
  • it is hard to check or prove that a distribution is pseudorandom (due to the "for all" quantifier in comp. ind.)

Definition: Pseudo-random Generators.

A function g : \bit^\ast \to \bit^\ast is a pseudo-random generator (PRG) if the following hold.

  1. (efficiency): g can be computed in PPT.
  2. (expansion): |g(x)| \gt |x|.
  3. (pseudorandomness): the ensemble \set{x \gets U_n : g(x)}_n is pseudorandom.

We sometimes say that the expansion of a PRG g is t if |g(x)| - |x| \ge t for all x.

Example: if g: \bit^n \to \bit^{n+1} for all n is a PRG, then g is a OWF. (Proof left as exercise; why is expansion necessary?)

Example: Existence of PRG implies NP \neq P

Suppose that g: \bits \to \bits is a PRG. Then the language L := \set{g(s) : s \in \bits} is in NP, and L \notin P.

It is direct to see L \in NP. To see L \notin P, suppose for contradiction that there exists a polynomial-time algorithm A that decides L; then A can easily break the pseudorandomness of g, a contradiction.

This is indeed the case for all cryptographic objects: NP \neq P is necessary for them to exist. Since NP \neq P has long been open, cryptography is built on assumptions. Ideally, even though we can not prove or disprove NP \neq P, we want the "win-win" scenario:

  • If NP = P, then we can solve all NP problems efficiently in polynomial time. This is a world called Algorithmica, but we do not have cryptography.

  • Otherwise NP \neq P, and there exist some hard problems in NP that can not be solved in polynomial time. Ideally, we want to utilize these hard problems to build cryptographic objects (so that no polynomial-time adversary can break them).

This explains why cryptography needs assumptions, but what is a good assumption? Ideally, the minimal assumption would be NP \neq P, or equivalently, SAT \notin P. Unfortunately, we do not know how to build crypto assuming only NP \neq P. So far, cryptography is built on factoring, RSA, ...; we will discuss more of them later. Notice that we still have the win-win w.r.t. these assumptions: e.g., either we can solve factoring in polynomial time (which would settle a centuries-old problem), or we have cryptography.

Here, PRG is the assumption that gives us encryption. It is currently an abstract object, but later we will show how PRG connects cryptography and other assumptions.

It also partly explains why we can still have crypto even when quantum computers break factoring, RSA, and other number-theoretic assumptions: there are other problems in NP that still resist quantum algorithms. It is interesting to observe that the thousand-year quest for secure encryption is deeply rooted in complexity (e.g., DES, MD5, Enigma, ... were all broken).

Also notice again that the discussion is asymptotic (over all problem sizes), and some real-world cryptographic constructions are not asymptotic and thus do not fit in this framework (e.g., AES and SHA are defined only for sizes such as 128, 256, and 512 bits).

Lemma: Expansion of a PRG

Let g:\bit^n \to \bit^{n+1} be a PRG for all n \in\N. For any polynomial \ell(n) \gt n, define g': \bit^n \to \bit^{\ell(n)} as follows:

g'(s) := b_1 b_2 ... b_{\ell},

where \ell := \ell(|s|), x_0 \gets s, and x_{i+1} \| b_{i+1} \gets g(x_i). Then g' is a PRG.
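A minimal sketch of this iterated construction, working with bytes in place of bits for readability; the SHA-256-based g below is only a stand-in for a real one-unit-expansion PRG (an assumption for illustration, not a proven PRG):

```python
import hashlib

def g(x: bytes) -> bytes:
    """Stand-in for a PRG with one unit of expansion: |g(x)| = |x| + 1
    (bytes here, bits in the lecture). SHA-256 is only a placeholder."""
    return hashlib.sha256(x).digest()[: len(x) + 1]

def g_prime(s: bytes, ell: int) -> bytes:
    """g'(s): iterate g, keeping the state x_i and emitting one unit of
    output b_i per step, i.e., x_{i+1} || b_{i+1} = g(x_i)."""
    x, out = s, b""
    for _ in range(ell):
        y = g(x)
        x, b = y[: len(s)], y[len(s):]   # split state and output unit
        out += b
    return out

stream = g_prime(b"\x00" * 16, 32)
print(len(stream))  # 32
```

The seed stays fixed-length while the output grows to any polynomial length; the proof below shows why this preserves pseudorandomness.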

Proof, warmup:

Suppose that \ell = 2, so there is no expansion, but we still want to show pseudorandomness. Define distributions

H^0_n := g'(s), \quad H^1_n := U_1 \| g(s)[n+1], \quad H^2_n := U_2

for n \in \N, and define \cH^i := \set{H^i_n}_n for i=0,1,2. Since g'(s) = g(s)[n+1] \| g(g(s)[1...n])[n+1], by g(s) \approx U_{n+1} and closure, we have \cH^0 \approx \cH^1. Since g(x) is pseudorandom, by closure, g(U_n)[n+1] \approx U_1, which implies \cH^1 \approx \cH^2. By the corollary of the Hybrid Lemma, we have \cH^0 \approx \cH^2.

Proof of PRG Expansion

It is slightly tricky when \ell depends on n. Define the prefix h and last bit s of iterating g as:

h^i(x) := \begin{cases} g(x)[1...n] & i = 1,\\ g(h^{i-1}(x))[1...n] & i > 1 \end{cases}

and

s^i := s^i(x) := \begin{cases} g(x)[n+1] & i=1,\\ g(h^{i-1}(x))[n+1] & i \gt 1. \end{cases}

We have g'(x) = s^1 \| s^2 \| ... \| s^{\ell}, and we want to prove pseudorandomness through the Hybrid Lemma. Given n, define hybrid distributions H_0 := g'(x), H_{\ell} := U_{\ell}, and define H_i for i = 1,...,\ell-1 as

H_i := U_i \| s^{1} \| ... \| s^{\ell(n)-i},

where U_i denotes sampling an i-bit string uniformly at random. Observe that for each i=0,1,...,\ell-1, H_i and H_{i+1} differ by one application of g(x), that is,

\begin{align*} H_{i+1} & = U_{i} \| U_1 \| s^{1} \| ... \| s^{\ell-i-1}, \text{ and } \\ H_{i} & = U_{i} \| s^{1} \| s^{2} \| ... \| s^{\ell-i} \\ & = U_{i} \| g(x)[n+1] \| s^{1}(g(x)[1...n]) \| ... \| s^{\ell-i-1}(g(x)[1...n]) \end{align*}

for all i = 0, 1, ..., \ell-1.

Assume for contradiction (AC): there exist NUPPT D and polynomial p(n) s.t. for infinitely many n\in\N, D distinguishes \set{x\gets\bit^n : g'(x)}_n and U_{\ell(n)} w.p. at least 1/p(n). The intuition is to apply the Hybrid Lemma, so that there exists j^\ast such that H_{j^\ast}, H_{j^\ast+1} are distinguishable, and thus by the Closure Lemma g(x) is distinguishable from uniform.

We prove it formally by constructing D' that aims to distinguish g(x) from uniform. Given input t \in \bit^{n+1}, D' performs:

  1. Sample i \gets \set{0,...,\ell-1} uniformly (where \ell \gets \ell(n))
  2. t_0 \gets U_i, t_1 \gets t[n+1], and t_2 \gets s^1(t[1...n]) \| s^2(t[1...n]) \| ... \| s^{\ell-i-1}(t[1...n])
  3. Output D(t_0 \| t_1 \| t_2)

To show that D' succeeds with non-negligible probability, we partition the event as follows:

\begin{align*} & \Pr_{t\gets U_{n+1}, i} [D'(t) = 1] - \Pr_{x\gets U_n, i} [D'(g(x)) = 1] \\ =& \sum_{j=0}^{\ell-1} \Pr_{t, i} [D'(t) = 1 \cap i=j] - \Pr_{x, i} [D'(g(x)) = 1 \cap i=j] \\ =& \sum_{j=0}^{\ell-1} \left(\Pr_{t, i} [D'(t) = 1 | i=j] - \Pr_{x, i} [D'(g(x)) = 1 | i=j]\right) \cdot \Pr[i=j] \\ =& \frac{1}{\ell} \cdot \sum_{j=0}^{\ell-1} \Pr_{t, i} [D'(t) = 1 | i=j] - \Pr_{x, i} [D'(g(x)) = 1 | i=j] \\ \end{align*}

where the random variable i \gets \set{0,1,...,\ell-1} is sampled exactly as in D'.

Notice that conditioned on i = j for any fixed j, the distribution of t_0 \| t_1 \| t_2 (given to D) is identical to

\begin{cases} H_{j+1} & \text{if } t \gets \bit^{n+1}\\ H_{j} & \text{if } x \gets \bit^n, t \gets g(x). \end{cases}

That implies

\begin{align*} \Pr_{t,i} [D'(t) = 1 | i=j] = \Pr[t' \gets H_{j+1} : D(t') = 1], \\ \Pr_{x,i} [D'(g(x)) = 1 | i=j] = \Pr[t' \gets H_{j} : D(t') = 1]. \end{align*}

The summation thus telescopes:

\begin{align*} & \Pr_{t\gets U_{n+1}, i} [D'(t) = 1] - \Pr_{x\gets U_n, i} [D'(g(x)) = 1] \\ =& \frac{1}{\ell} \cdot \sum_{j=0}^{\ell-1} \Pr_{t'\gets H_{j+1}} [D(t') = 1] - \Pr_{t' \gets H_j} [D(t') = 1] \\ =& \frac{1}{\ell} \cdot \left(\Pr_{t'\gets H_\ell} [D(t') = 1] - \Pr_{t' \gets H_0} [D(t') = 1]\right) \\ \ge& \frac{1}{\ell} \cdot \frac{1}{p(n)}, \end{align*}

where the last inequality follows from (AC). That is, D' distinguishes g(x) from uniform w.p. at least \frac{1}{\ell(n)p(n)}, contradicting that g is a PRG.

Discuss: In the above, we proved the claim formally and preserved uniformity (if D is a uniform TM, then D' is also uniform). We did not apply the Hybrid Lemma (and no triangle inequality), nor did we use the Closure Lemma. Alternatively, after (AC), one may apply the Hybrid Lemma, which yields some j^\ast s.t. H_{j^\ast} is distinguishable from H_{j^\ast+1} w.p. at least 1/(\ell p), and then hardwire j^\ast into D' in order to distinguish g(x). This would make D' non-uniform because j^\ast depends on each n and we would have no efficient way to find j^\ast.

We proved above that a PRG with 1-bit expansion suffices to build a PRG with any polynomially long expansion. We have not yet given any candidate construction of a PRG (even with 1-bit expansion), but it is useful to first see what we can achieve using PRGs.

Example: Now suppose that we have a PRG g with n \mapsto \ell(n) expansion for any polynomial \ell. We can construct a computationally secure encryption scheme by sampling the key k as an n-bit string and then bitwise XORing g(k) with the message. That is, m \oplus g(k) encrypts one message. We can encrypt more messages by attaching to each message a sequence number, such as (m_1 \oplus g(k)[1...n], 1), (m_2 \oplus g(k)[n+1...2n], 2), and so on.
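A toy sketch of this sequence-numbered scheme; the counter-indexed hash below merely stands in for the segments g(k)[...] of a real PRG, so this is an illustration of the structure, not a secure implementation:

```python
import hashlib

def prg_block(k: bytes, i: int, n: int = 16) -> bytes:
    """Hypothetical stand-in for the i-th n-byte segment of g(k);
    a hash of (k, i) is only a placeholder for a real PRG."""
    return hashlib.sha256(k + i.to_bytes(8, "big")).digest()[:n]

def enc(k: bytes, m: bytes, seq: int) -> tuple[bytes, int]:
    """Encrypt the seq-th fixed-length message: (m XOR segment, seq)."""
    pad = prg_block(k, seq, len(m))
    return bytes(a ^ b for a, b in zip(m, pad)), seq

def dec(k: bytes, c: bytes, seq: int) -> bytes:
    pad = prg_block(k, seq, len(c))
    return bytes(a ^ b for a, b in zip(c, pad))

k = b"k" * 16
c, i = enc(k, b"attack at dawn!!", 0)
assert dec(k, c, i) == b"attack at dawn!!"
```

Each segment of the pseudorandom stream is used at most once, as in OTP; reusing a segment for two messages would leak their XOR.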

What’s the downside of the above multi-message encryption?

Pseudo-Random Functions

In order to apply PRGs more efficiently, we construct a tree structure and call the abstraction pseudo-random functions (PRFs). We begin by defining (truly) random functions.

Definition: Random Functions

A random function f: \bit^n \to \bit^n is a random variable sampled uniformly from the set \RF_n := \set{f : \bit^n \to \bit^n}.

We can view a random function in two ways. In the combinatorial view, any function f: \bit^n \to \bit^n is described by a table of 2^n entries, where each entry is the n-bit string f(x):

f(0000...00), f(0000...01), ..., f(1111...11)

In the computational view, a random function f is a data structure that on any input x performs the following:

  1. initialize a map m to empty (before any query)
  2. if x \notin m, then sample y \gets \bit^n and set m[x] \gets y
  3. output m[x]

In both views, a random function needs 2^n \cdot n bits to describe, and thus there are 2^{n 2^n} functions in \RF_n.
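The computational (lazy-sampling) view can be sketched directly; here `os.urandom` stands in for sampling a fresh uniform n-bit value:

```python
import os

class RandomFunction:
    """Computational view of F <- RF_n: lazily sample outputs and
    memoize them, so repeated queries are answered consistently."""
    def __init__(self, n_bytes: int = 16):
        self.n = n_bytes
        self.m = {}                          # the map, initially empty
    def __call__(self, x: bytes) -> bytes:
        if x not in self.m:                  # first query on x:
            self.m[x] = os.urandom(self.n)   # sample y <- {0,1}^n
        return self.m[x]

F = RandomFunction()
assert F(b"a") == F(b"a")    # consistent across repeated queries
assert len(F(b"b")) == 16
```

Only the queried entries of the exponentially large truth table are ever materialized, which is exactly why a poly-time distinguisher interacting with F touches only polynomially many entries.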

Note: the random function F \gets \RF_n is also known as a random oracle in the literature.

Intuitively, a pseudo-random function (PRF) shall look similar to a random function. That is, indistinguishable by any NUPPT Turing machine that is capable of interacting with the function.

Definition: Oracle Indistinguishability

Let \set{\cO^0_n}_{n\in\N} and \set{\cO^1_n}_{n\in\N} be ensembles where \cO^0_n, \cO^1_n are probability distributions over functions. We say that \set{\cO^0_n}_{n} and \set{\cO^1_n}_{n} are computationally indistinguishable if for all NUPPT machines D that are given oracle access to a function, there exists a negligible function \eps(\cdot) such that for all n\in\N,

\Pr[F\gets\cO^0_n : D^{F(\cdot)}(1^n) = 1] - \Pr[F\gets\cO^1_n : D^{F(\cdot)}(1^n) = 1] \le \eps(n).

Note: D^{f(\cdot)} denotes that the TM D may interact with the function f through black-box inputs and outputs; each input/output takes time to read and write, but computing f itself takes zero time.

It is easy to verify that oracle indistinguishability satisfies “closure under efficient operations”, the Hybrid Lemma, and the Prediction Lemma.

Also notice that we can transform a distribution over oracle functions into a distribution over strings using an efficient oracle operation; in that case, oracle indistinguishability translates into comp. indistinguishability of strings (see CPA-secure encryption below).

Definition: Pseudo-random Functions (PRFs)

A family of functions \set{f_s: \bit^{|s|} \to \bit^{|s|}}_{s \in \bits} is pseudo-random if

  • (Easy to compute): f_s(x) can be computed by a PPT algorithm given input s, x.
  • (Pseudorandom): \set{s\gets \bit^n : f_s}_n \approx \set{F \gets \RF_n : F}_n.

Note: similar to PRGs, the seed s is not revealed to D (otherwise it would be trivial to distinguish).

Theorem: Construct PRF from PRG [Goldreich-Goldwasser-Micali 84]

If a pseudorandom generator exists, then pseudorandom functions exist.

We have shown that a PRG with 1-bit expansion implies a PRG with any polynomial expansion. So, let g be a length-doubling PRG, i.e., |g(x)| = 2|x|. Also, define g_0, g_1 to be

g_0(x) := g(x)[1...n], \text{ and } g_1(x) := g(x)[n+1...2n],

where n := |x| is the input length.

We define f_s as follows and claim it to be a PRF:

f_s(b_1 b_2 ... b_n) := g_{b_n} \circ g_{b_{n-1}} \circ ... \circ g_{b_1}(s).

That is, we evaluate g on s but keep only one half of the output depending on b_1, then keep applying g to the kept half, choosing the next half by b_2, and so on.

This constructs a binary tree. The intuition comes from expanding the 1-bit PRG, but now we want any sub-string of the expansion to be efficiently computable. (We CS people love binary trees.) Clearly, f_s is easy to compute, and we want to prove that it is pseudorandom.
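A sketch of this tree evaluation, with bytes in place of bits; the SHA-512-based length-doubling g below is an assumed stand-in for a real PRG, used only to make the structure concrete:

```python
import hashlib

def g(x: bytes) -> bytes:
    """Stand-in length-doubling PRG: |g(x)| = 2|x| (bytes here).
    A hash truncation is only a placeholder, not a proven PRG."""
    return hashlib.sha512(x).digest()[: 2 * len(x)]

def ggm_prf(s: bytes, bits: str) -> bytes:
    """GGM-style evaluation f_s(b_1...b_n): walk the input bits,
    keeping g_0 (left half) or g_1 (right half) at each level."""
    x, n = s, len(s)
    for b in bits:
        y = g(x)
        x = y[:n] if b == "0" else y[n:]
    return x

s = b"\x01" * 16
leaf = ggm_prf(s, "0101")
assert leaf == ggm_prf(s, "0101")   # deterministic given the seed
assert len(leaf) == 16
```

Each query evaluates only the path from root to one leaf (n applications of g), so the exponentially large tree is never materialized; distinct input strings lead to distinct leaves of the tree.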

There are 2^n leaves in the tree, too many for the "switch one more PRG output to uniform in each hybrid" technique used in the PRG expansion proof. The trick is that the distinguisher D can query f_s at most polynomially many times since D is poly-time. Each query corresponds to a path in the binary tree, and there are at most polynomially many nodes over all queries. Hence, we will switch the g(x) evaluations from the root to the leaves of the tree, and from the first query to the last query.

Note: switching each instance of g(x) (for each x) is a reduction that runs D to distinguish one instance of g(x); therefore, we switch exactly one in each hybrid.

More formally, assume for contradiction (AC): there exist NUPPT D and polynomial p s.t. for infinitely many n\in\N, D distinguishes f_s from a random function (in the oracle interaction). We want to construct D' that distinguishes g(x) from uniform. We define hybrid oracles H_i(b_1 ... b_n) as follows:

  1. the map m is initialized to empty
  2. if the prefix b_1 ... b_i \notin m, then sample s(b_i b_{i-1} ... b_1) \gets \bit^n and set m[b_i ... b_1] \gets s(b_i b_{i-1} ... b_1)
  3. output g_{b_n} \circ g_{b_{n-1}} \circ ... \circ g_{b_{i+1}}(m[b_i ... b_1])

Notice that H_i is a function defined using the computational view.

Let \PRF_n := \set{f_s : s \gets \bit^n} denote the distribution of f_s for short. We have H_0 \equiv \PRF_n and H_n \equiv \RF_n, but there are still too many switches between H_i and H_{i+1}. The key observation is that, since D is PPT, we know a polynomial T(n) bounding the running time of D on 1^n, and then we need to switch at most T(n) instances of g(x). That is, we define sub-hybrids H_{i,j}:

  1. the map m is initialized to empty
  2. if the prefix b_1 ... b_i b_{i+1} \notin m, then depending on the "number of queries" made to H_{i,j} so far (including the current query), do the following: sample s \gets \bit^n, set

    m[b_{i+1} b_i ... b_1] \gets \begin{cases} \bit^n & \text{number of queries} \le j \\ g_{b_{i+1}}(s) & \text{otherwise} \end{cases},

    and set

    m[\overline{b_{i+1}} b_i ... b_1] \gets \begin{cases} \bit^n & \text{number of queries} \le j \\ g_{\overline{b_{i+1}}}(s) & \text{otherwise} \end{cases}.
  3. output g_{b_n} \circ g_{b_{n-1}} \circ ... \circ g_{b_{i+2}}(m[b_{i+1} ... b_{1}])

We have H_{i,0} \equiv H_i. Moreover, for any D that runs in time T(n), we have H_{i,T(n)} \equiv H_{i+1} (their combinatorial views differ, but their computational views are identical over T(n) queries). Now we have n \cdot T(n) hybrids, so we can construct D'(t):

  1. sample i \gets \set{0,1,...,n-1} and j \gets \set{0,...,T(n)-1} uniformly at random
  2. define the oracle O_{i,j,t}(\cdot) to be similar to H_{i,j} but "injecting" t into the map m in the j-th query if the prefix b_1 ... b_i b_{i+1} \notin m. (This is constructable and computable lazily in the next step as queries come from D.)
  3. run and output D^{O_{i,j,t}(\cdot)}(1^n), that is, run D on input 1^n while answering D's oracle queries with O_{i,j,t}

It remains to calculate the probabilities: given (AC), D' distinguishes g(x) from a uniformly sampled string w.p. \ge \frac{1}{nT(n)p(n)}, a contradiction. The calculation is almost identical to the proof of PRG expansion and is left as an exercise.

Secure Encryption Scheme

Perfect secrecy considers that the adversary gets the ciphertext only (but nothing else). However, there are other natural adversarial models in practical scenarios.

  • Known-plaintext attack: The adversary may get to see pairs of the form (m_0, \Enc_k(m_0)), ...
  • Chosen-plaintext attack, CPA: The adversary gets access to an encryption oracle before and after selecting the messages.
  • Chosen-ciphertext attack, CCA1: The adversary has access to an encryption oracle and to a decryption oracle before selecting the messages. ["lunch-time attack", Naor and Yung]
  • Chosen-ciphertext attack, CCA2: This is just like a CCA1 attack except that the adversary also has access to the decryption oracle after selecting the messages; it is not allowed to decrypt the challenge ciphertext, however. See Rackoff and Simon, Crypto 1991.

We formalize CPA security next (but leave CCA1/CCA2 to later, in authentication).

Definition: Chosen-Plaintext-Attack (CPA) Security

Let \Pi = (\Gen, \Enc, \Dec) be an encryption scheme. For any NUPPT adversary A, for any n\in\N, b\in\bit, define the experiment \Expr_b^{\Pi, A}(1^n) to be:

Experiment \Expr_b^{\Pi, A}(1^n):

  1. k \gets \Gen(1^n)
  2. (m_0, m_1, \state) \gets A^{\Enc_k(\cdot)}(1^n)
  3. c \gets \Enc_k(m_b)
  4. Output A^{\Enc_k(\cdot)}(c, \state)

Then we say \Pi is CPA-secure if for all NUPPT A,

\left\{\Expr_0^{\Pi,A}(1^n)\right\}_n \approx \left\{\Expr_1^{\Pi,A}(1^n)\right\}_n.

Note: the experiment \Expr is often equivalently described as "the adversary A interacts with a challenger C, where C performs all steps that do not belong to A (such as \Gen, \Enc, and answering the queries to \Enc_k(\cdot))".

Compared to Shannon/perfect secrecy, what are the differences?

  • computationally bounded adversary
  • encryption oracle before choosing the messages
  • encryption oracle after choosing the messages
  • the adversary chooses m_0, m_1

Suppose that we have a secure encryption even without CPA oracle but the key is shorter than the message. Can we get a PRG/PRF? Can we get a OWF?

Theorem: CPA-Secure Encryption from PRF

Let \PRF = \set{f_s : \bit^{|s|} \to \bit^{|s|}}_{s\in\bit^\ast} be a family of PRFs. Then the following (\Gen, \Enc, \Dec) is a CPA-secure encryption scheme.

  • \Gen(1^n): sample and output k \gets \bit^n.
  • \Enc_k(m): given input m \in \bit^n, sample r \gets \bit^n, and then output

    c := m \oplus f_k(r) ~\|~ r.
  • \Dec_k(c): given input c = c' \| r' \in \bit^{2n}, output

    m := c' \oplus f_k(r').

The correctness and efficiency of the construction follow directly from the PRF. It remains to prove CPA security.
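A toy sketch of the construction above; the hash-based f below is only a placeholder for a real PRF f_k, so this illustrates the structure rather than providing a secure implementation:

```python
import hashlib, os

def f(k: bytes, r: bytes) -> bytes:
    """Stand-in PRF f_k(r); hashing (k, r) is only a placeholder."""
    return hashlib.sha256(k + r).digest()[: len(r)]

def gen(n: int = 16) -> bytes:
    """Gen(1^n): sample k <- {0,1}^n."""
    return os.urandom(n)

def enc(k: bytes, m: bytes) -> bytes:
    """Enc_k(m) = (m XOR f_k(r)) || r for a fresh random r."""
    r = os.urandom(len(m))
    pad = f(k, r)
    return bytes(a ^ b for a, b in zip(m, pad)) + r

def dec(k: bytes, c: bytes) -> bytes:
    """Dec_k(c' || r') = c' XOR f_k(r')."""
    cpart, r = c[: len(c) // 2], c[len(c) // 2 :]
    return bytes(a ^ b for a, b in zip(cpart, f(k, r)))

k = gen()
m = b"hello, world!!!!"          # 16-byte message
assert dec(k, enc(k, m)) == m
```

Note that encryption is randomized: the same message encrypts to different ciphertexts under fresh r, which is what lets the scheme withstand the CPA oracle.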

To show that \Expr_0 and \Expr_1 are comp. ind., we define hybrid experiments H_0, H_1 as follows.

Hybrid H_b^{A}(1^n):

  1. F \gets \RF_n, and then let O_F be the following oracle:

    O_F(x) := x \oplus F(r) \| r,

    where r \gets \bit^n is sampled uniformly (fresh for each query).

  2. (m_0, m_1, \state) \gets A^{O_F(\cdot)}(1^n)
  3. r \gets \bit^n, c \gets m_b \oplus F(r) \| r
  4. Output A^{O_F(\cdot)}(c, \state)

By the oracle indistinguishability of \PRF and \RF and closure under efficient operations, we have \set{\Expr_0^{\Pi, A}(1^n)}_n \approx \set{H_0^{A}(1^n)}_n and \set{\Expr_1^{\Pi, A}(1^n)}_n \approx \set{H_1^{A}(1^n)}_n. (Notice that \PRF and \RF are oracle indistinguishable, while \Expr and H are comp. ind. as distributions over strings.)

Hence, it suffices to prove that the ensembles H_0 and H_1 are indistinguishable. They seem to be identically distributed (as in OTP). However, there is a difference: A gets oracle access to O_F (before and after choosing m_b), and O_F could sample the same r for the challenge cipher c and for another oracle query. Fortunately, hitting the same r twice in polynomial time happens only with negligible probability.

We next prove formally that \set{H_0^{A}(1^n)}_n \approx \set{H_1^{A}(1^n)}_n. Define R to be the set

R := \set{r \in \bit^n : r \text{ is sampled by } O_F \text{ when answering the queries of } A},

and let r be the random variable sampled for the challenge cipher c. We want to show that |\Pr[H_0^A(1^n)=1] - \Pr[H_1^A(1^n)=1]| is negligible for all NUPPT A. Write H_0 and H_1 for the corresponding events for short.

\begin{align*} \Pr[H_0] & = \Pr[H_0 \cap r \in R] + \Pr[H_0 \cap r \notin R] \\ & \le \Pr[r \in R] + \Pr[H_0 | r \notin R] \cdot \Pr[r \notin R] \\ & = \gamma + \Pr[H_0 | r \notin R] \cdot (1 - \gamma), \end{align*}

where \gamma := |R| / 2^n. We also have \Pr[H_0 | r \notin R] = \Pr[H_1 | r \notin R], and thus

\begin{align*} \Pr[H_0] & \le \gamma + \Pr[H_1 | r \notin R] \cdot (1 - \gamma)\\ & = \gamma + \Pr[H_1 \cap r \notin R]\\ & \le \gamma + \Pr[H_1]. \end{align*}

Given that |R| is polynomial in n for any NUPPT A, \gamma is negligible in n; by symmetry, the same bound holds with H_0 and H_1 exchanged, which concludes the proof.

Notice that we could have constructed an efficient CPA-secure encryption scheme from a PRG alone, but using a PRF significantly simplifies the construction and the proof.