Construct PRG from OWF
We have shown that given a PRG, we can construct a CPA-secure encryption. We also showed that PRG, PRF, and CPA-secure encryption each imply OWF, so OWFs are necessary if we want cryptography. We also have a universal OWF that is as strong as any candidate OWF. It remains to construct a PRG from any OWF. The construction implies that OWF is necessary and sufficient for PRG, PRF, and encryption, and that we have an encryption scheme (as well as a PRG and a PRF) that is as strong as any encryption (by the universal OWF).
Hard-Core Bits from any OWF
So far we do not yet have a candidate construction of a PRG (even one with only 1-bit expansion). We will next construct a PRG from one-way permutations (which is easier).
The construction of a PRG relies on two properties of the OWF $f$:
- The output $f(x)$ must be sufficiently random when the input $x$ is uniform. (If the random variable $f(x)$ is constant or concentrated on a small support for most $x$, then we can invert $f(x)$ with good probability.)
- A sufficiently random $f(x)$ can still be easy to invert (such as the identity function). Since $f$ is hard to invert, there must be some bits of $x$ that are hard to guess even when $f(x)$ is given. How many bits of $x$ must be hard to guess for any polynomial-time adversary? It must be $\omega(\log n)$ (otherwise, they can be guessed correctly w.p. $1/\mathrm{poly}(n)$).

Suppose that $f$ is a OWP. Then $f(x)$ is "fully random" when $x$ is uniform. That is good in terms of the first property.
Definition: One-way Permutations
An OWF $f : \{0,1\}^n \to \{0,1\}^n$ for all $n \in \mathbb{N}$ is called a one-way permutation if $f$ is a bijection.
To utilize the second property, we want to obtain some of the "hard bits" from the input $x$. If we can get 1 extra hard bit, we have a construction of a PRG by concatenating the output of the OWP and the extra hard bit. The hard bit is mathematically determined by the output, but it shall be hard to compute even given the output. The hard bit is formalized below.
Definition: Hard-core bits
A predicate $h : \{0,1\}^* \to \{0,1\}$ is a hard-core predicate for $f(\cdot)$ if $h$ is efficiently computable given $x$, and for any NUPPT adversary $\mathcal{A}$, there exists a negligible function $\epsilon$ so that for all $n \in \mathbb{N}$,
$$\Pr\big[x \gets \{0,1\}^n : \mathcal{A}(1^n, f(x)) = h(x)\big] \leq \frac{1}{2} + \epsilon(n).$$
Notice that here we are not restricting $f$ to be a OWP, nor even a OWF. (If $f$ never uses some bit of its input, then that bit is hard-core. However, such an $f$ is not a permutation and is more challenging for us. We will focus on OWPs in this section.)
That is indeed the case for some OWPs, such as RSA. If we construct a OWP from the RSA assumption, then the least significant bit of $x$ is that "hard to guess" bit, and then we can obtain a PRG from the RSA assumption.
Theorem: PRG from OWP and hard-core predicate
Suppose that $f : \{0,1\}^n \to \{0,1\}^n$ is a OWP and $h$ is a hard-core predicate for $f$ (for all $n \in \mathbb{N}$). Then, $g : \{0,1\}^n \to \{0,1\}^{n+1}$, defined below, is a PRG:
$$g(x) := f(x) \| h(x).$$
(The proof is a standard reduction: if there exists a NUPPT distinguisher $D$ against $g$, then we can build a NUPPT adversary $\mathcal{A}$ that inverts $f$ by running $D$.)
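For concreteness, here is a minimal Python sketch of this construction. The `f` and `h` below are hypothetical placeholders (the identity and the LSB are of course neither one-way nor hard-core; they only fix the interface):

```python
# Sketch of g(x) := f(x) || h(x), assuming placeholder implementations of
# a one-way permutation `f` and its hard-core predicate `h`.

def f(x: int) -> int:
    # Placeholder OWP on n-bit strings; a real instantiation could be
    # RSA-based. The identity shown here is NOT one-way.
    return x

def h(x: int) -> int:
    # Placeholder hard-core predicate; e.g., the least significant bit
    # for the RSA-based OWP mentioned above.
    return x & 1

def g(x: int, n: int) -> int:
    # The PRG: n bits in, n+1 bits out (f(x) followed by the hard bit).
    return (f(x) << 1) | h(x)
```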
However, we want to obtain a PRG from any OWP or any OWF (without depending on specific assumptions), and whether every OWP admits such a hard-core predicate is unfortunately unclear.
Fortunately, Goldreich and Levin showed that for any OWF (or OWP) $f$, we can obtain another OWF (or OWP) $f'$ for which we know a hard-core predicate. The intuition is: given that $f(x)$ is hard to invert, in the preimage $x$ of $f(x)$, there must be at least $\omega(\log n)$ bits that are hard to guess (otherwise, a poly-time adversary could invert by enumerating them). The hard-core predicate formalizes those bits. Even though we do not know which bits are hard, we can sample a subset of bits uniformly at random and hope to hit at least 1 hard bit.
Theorem: Goldreich-Levin, Hard-Core Lemma
Let $f : \{0,1\}^n \to \{0,1\}^n$ for all $n \in \mathbb{N}$ be a OWF. Define the functions $f'$ and $h$ to be the following:
$$f'(x, r) := (f(x), r) \quad \text{and} \quad h(x, r) := \langle x, r \rangle,$$
where $\langle x, r \rangle$ denotes the inner product modulo 2, i.e., $\langle x, r \rangle := \big(\sum_{i \in [n]} x_i r_i\big) \bmod 2$ for any $x, r \in \{0,1\}^n$. Then, $f'$ is a OWF and $h$ is a hard-core predicate for $f'$.
Note: in the above definition of $f'$ and $h$, viewing $r$ as the indicator vector of a subset $S \subseteq [n]$, the theorem says that "even when we are given the subset $S$ and $f(x)$, because $f$ is hard to invert, we still do not know the parity of the bits of $x$ over $S$". Since the subset $S$ is chosen uniformly, and even though we do not know where the hard bits are, $S$ hits some "hard bits" with overwhelming probability. This is indeed consistent with the earlier intuition.
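The functions $f'$ and $h$ are straightforward to implement; the following Python sketch (with illustrative names) makes the inner-product predicate concrete:

```python
import secrets

def inner_product_mod2(x: int, r: int) -> int:
    # <x, r> = sum_i x_i * r_i mod 2: the parity of the bitwise AND.
    return bin(x & r).count("1") % 2

def f_prime(f, x: int, r: int) -> tuple[int, int]:
    # f'(x, r) := (f(x), r); the subset r is published in the clear.
    return f(x), r

def h(x: int, r: int) -> int:
    # The Goldreich-Levin predicate h(x, r) := <x, r>.
    return inner_product_mod2(x, r)

# Example: the parity of x over a uniformly random subset of [8].
x, r = secrets.randbits(8), secrets.randbits(8)
bit = h(x, r)
```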
Clearly $f'$ is a OWF, and $h$ is easy to compute. The main challenge is to prove that $h$ is hard-core. We assume for contradiction that $h$ is not hard-core, as stated in the Full Assumption below, and then, to reach a contradiction, we want to construct another adversary $\mathcal{B}$ that inverts $f'$.
Full Assumption:
There exist a NUPPT $\mathcal{A}$ and a polynomial $p$ such that for infinitely many $n \in \mathbb{N}$,
$$\Pr\big[x \gets \{0,1\}^n, r \gets \{0,1\}^n : \mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] \geq \frac{1}{2} + \frac{1}{p(n)}.$$
The construction and analysis of $\mathcal{B}$ are involved, so we will start from a couple of warmups.
Warmup Assumption 1:
There exists a NUPPT $\mathcal{A}$ such that for infinitely many $n \in \mathbb{N}$, for all $x \in \{0,1\}^n$,
$$\Pr\big[r \gets \{0,1\}^n : \mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] = 1.$$
To invert $y := f(x)$, the construction of $\mathcal{B}$ is simple:
- For each $i \in [n]$, do the following:
  - Let $e^i$ be the $n$-bit string in which only the $i$-th bit is 1 (0 otherwise).
  - Run $x_i \gets \mathcal{A}(1^n, y, e^i)$.
- Output $x := (x_1, x_2, \ldots, x_n)$.

To see why $\mathcal{B}$ inverts $y$, observe that $\langle x, e^i \rangle = x_i$, where $x_i$ is the $i$-th bit of $x$. Hence, $\mathcal{B}$ succeeds w.p. 1, a contradiction.
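In code, $\mathcal{B}$ is just $n$ oracle calls. Below is a sketch where the assumed adversary of Warmup Assumption 1 is modeled as a callable `A(y, r)` returning its guess of $\langle x, r \rangle$:

```python
def invert_warmup1(A, y: int, n: int) -> int:
    # Recover x bit by bit: <x, e^i> = x_i, and under Warmup Assumption 1
    # the oracle A(y, r) returns <x, r> correctly for every r.
    x = 0
    for i in range(n):
        e_i = 1 << i          # the n-bit string with only the i-th bit set
        x |= A(y, e_i) << i   # A's answer is exactly x_i
    return x
```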
Note: the above assumed "for all $x$" and "w.p. 1", both of which are much stronger than what the Full Assumption gives us.
Warmup Assumption 2:
There exist a NUPPT $\mathcal{A}$ and a polynomial $p$ such that for infinitely many $n \in \mathbb{N}$,
$$\Pr\big[x \gets \{0,1\}^n, r \gets \{0,1\}^n : \mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] \geq \frac{3}{4} + \frac{1}{p(n)}.$$
We would like to run $\mathcal{A}(1^n, y, e^i)$ as before, but now $\mathcal{A}$ may always fail whenever the suffix of $r$ is $0^{n-i}$ (which is the case for $e^i$). Hence, we randomize $e^i$ to the pair $r$ and $r \oplus e^i$ and then recover the inner product from the two answers (this is also called "self correction").
Fact
For all $n \in \mathbb{N}$ and any strings $x, a, b \in \{0,1\}^n$, it holds that $\langle x, a \rangle \oplus \langle x, b \rangle = \langle x, a \oplus b \rangle$.
To invert $y := f(x)$, the construction of $\mathcal{B}$ is below:
- For each $i \in [n]$, do:
  - For $j = 1$ to $m$, do:
    - Sample $r \gets \{0,1\}^n$.
    - Run $g^j_i \gets \mathcal{A}(1^n, y, r \oplus e^i) \oplus \mathcal{A}(1^n, y, r)$.
  - Let $x_i$ be the majority of $(g^1_i, \ldots, g^m_i)$.
- Output $x := (x_1, \ldots, x_n)$.
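A sketch of this self-correcting inverter, again modeling the assumed $\mathcal{A}$ as a callable `A(y, r)` (the choice of $m$ is analyzed below):

```python
import secrets

def invert_warmup2(A, y: int, n: int, m: int) -> int:
    # Self-correction: <x, r ^ e_i> XOR <x, r> = <x, e_i> = x_i, so each
    # random r yields one vote for x_i; the majority of m votes wins.
    x = 0
    for i in range(n):
        e_i = 1 << i
        votes = 0
        for _ in range(m):
            r = secrets.randbits(n)
            votes += A(y, r ^ e_i) ^ A(y, r)   # one (possibly wrong) vote
        if 2 * votes > m:                      # majority says x_i = 1
            x |= 1 << i
    return x
```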
To prove that $\mathcal{B}$ succeeds with high probability, we first prove that there are many good $x$'s.
Good instances are plenty.
Define $G$ to be the set of good instances,
$$G := \left\{ x \in \{0,1\}^n \;:\; \Pr\big[r \gets \{0,1\}^n : \mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] \geq \frac{3}{4} + \frac{\epsilon(n)}{2} \right\},$$
where $\epsilon(n) := 1/p(n)$.
If Warmup Assumption 2 holds, then $|G| \geq \frac{\epsilon}{2} \cdot 2^n$. This is actually a standard averaging argument (or a Markov inequality). Suppose not, i.e., $|G| < \frac{\epsilon}{2} \cdot 2^n$. Then,
$$\Pr_{x, r}\big[\mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] < \Pr[x \in G] \cdot 1 + \Pr[x \notin G] \cdot \left(\frac{3}{4} + \frac{\epsilon}{2}\right) < \frac{\epsilon}{2} + \frac{3}{4} + \frac{\epsilon}{2} = \frac{3}{4} + \epsilon,$$
which contradicts Warmup Assumption 2.
Now, suppose that $x \in G$. $\mathcal{A}$ fails to answer $r$ or $r \oplus e^i$ correctly w.p. at most $2 \cdot (\frac{1}{4} - \frac{\epsilon}{2}) = \frac{1}{2} - \epsilon$ by the union bound. So, for any fixed $i$, $g^j_i = x_i$ w.p. at least $\frac{1}{2} + \epsilon$ for each $j$ independently. By the Chernoff bound, the majority of $(g^1_i, \ldots, g^m_i)$ is $x_i$ w.p. at least $1 - 2^{-\Omega(\epsilon^2 m)}$. Choosing $m := n/\epsilon^2$, the probability is at least $1 - 2^{-\Omega(n)}$. By the union bound over all $i \in [n]$, $\mathcal{B}$ recovers $x$ except w.p. at most $n \cdot 2^{-\Omega(n)}$.
Finally, $\mathcal{B}$ succeeds w.p. at least $\frac{\epsilon}{2} \cdot (1 - n \cdot 2^{-\Omega(n)}) \geq \frac{\epsilon}{4}$ over uniformly sampled $x$, even if $\mathcal{B}$ fails for all $x \notin G$.
To complete the full proof, we want to lower the assumed probability from $\frac{3}{4} + \epsilon$ to $\frac{1}{2} + \epsilon$. The "good set" argument still holds when the threshold is modified to $\frac{1}{2} + \frac{\epsilon}{2}$ (since it is a simple averaging). The main challenge compared to the previous proof is:
- It is too weak to take the union bound over inverting both $r$ and $r \oplus e^i$. For $\mathcal{A}$ that succeeds w.p. only $\frac{1}{2} + \frac{\epsilon}{2}$, the probability that both answers are correct is lowered to only $\epsilon$, and that is too low for the subsequent majority and Chernoff bound.
The first idea is to guess the inner product $b^j := \langle x, r^j \rangle$ uniformly at random, which is a correct guess w.p. $1/2$, and then to obtain each vote with a single call, $g^j_i \gets b^j \oplus \mathcal{A}(1^n, y, r^j \oplus e^i)$. Supposing that $\epsilon$ is a constant, we can choose $m := O(\log n)$ so that all $m$ guesses are correct w.p. $2^{-m} = 1/\mathrm{poly}(n)$; then, conditioned on correct guesses, we have each $g^j_i$ correct w.p. $\frac{1}{2} + \frac{\epsilon}{2}$ (when $x$ is good), and then we can continue with the Chernoff bound (w.p. $1/\mathrm{poly}(n)$ to fail) and finish the proof. For large $p(n)$, however, the required $m = \Omega(\log n / \epsilon^2)$ guesses are too many, and thus the success probability $2^{-m}$ is negligible in $n$.
The second idea is to use pairwise independent guesses. In particular, we have Chebychev's bound for the measure concentration of pairwise independent random variables (instead of the Chernoff bound, which requires full independence).
Theorem: Chebychev’s inequality
Let $X_1, \ldots, X_m$ be pairwise independent random variables such that for all $j \in [m]$, $0 \leq X_j \leq 1$. Then,
$$\Pr\big[\,|X - \mu| \geq \delta m\,\big] \leq \frac{1}{4 \delta^2 m},$$
where $X := \sum_{j \in [m]} X_j$ and $\mu := \mathbb{E}[X]$.
[Ps, p189]
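To preview how we will instantiate this bound (this worked instantiation is ours, for illustration): let $X_j$ be the indicator that the $j$-th vote for a bit is correct, each 1 w.p. at least $\frac{1}{2} + \delta$, so that $\mu \geq (\frac{1}{2} + \delta) m$. The majority vote is wrong only if $X \leq m/2$, which requires $|X - \mu| \geq \delta m$. By Chebychev's inequality, this happens w.p. at most
$$\frac{1}{4 \delta^2 m} \;=\; \frac{1}{4 \cdot (\epsilon/2)^2 \cdot (2n/\epsilon^2)} \;=\; \frac{1}{2n}$$
when we set $\delta := \epsilon/2$ and $m := 2n/\epsilon^2$, which is exactly the choice made in the full proof below.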
We can then reduce the number of guessed inner products from $m$ to $\ell := \log(m+1)$.
Fact: Sampling pairwise independent random strings
For any $m \in \mathbb{N}$, let $u_1, \ldots, u_\ell \gets \{0,1\}^n$ be $\ell := \log(m+1)$ strings independently sampled uniformly at random (we abuse notation and round $\ell$ up to the next integer). Define the strings $r_S$ for each nonempty subset $S \subseteq [\ell]$ to be
$$r_S := \bigoplus_{k \in S} u_k.$$
The random variables $(r_1, \ldots, r_m)$ are pairwise independent, where $r_j$ denotes $r_S$ such that $S$ is the $j$-th nonempty subset of $[\ell]$.
(The proof is left as exercise.)
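A sketch of this sampling procedure in Python (subsets of $[\ell]$ are encoded as bitmasks):

```python
import secrets

def pairwise_independent_strings(m: int, n: int) -> list[int]:
    # Sample ell = ceil(log2(m+1)) fully independent n-bit strings u_1..u_ell,
    # then set r_S := XOR of {u_k : k in S} for nonempty subsets S of [ell].
    ell = m.bit_length()                   # equals ceil(log2(m+1)) for m >= 1
    u = [secrets.randbits(n) for _ in range(ell)]
    rs = []
    for S in range(1, m + 1):              # S encodes a nonempty subset
        r = 0
        for k in range(ell):
            if (S >> k) & 1:
                r ^= u[k]
        rs.append(r)
    return rs
```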
Now we are ready to prove the full theorem.
Proof of Hard-Core Lemma (Goldreich-Levin, 1989)
Given the NUPPT $\mathcal{A}$ in the Full Assumption, we construct a NUPPT $\mathcal{B}$ that inverts $y := f(x)$ as follows.
Algorithm $\mathcal{B}(1^n, y)$:
- Let $m := \mathrm{poly}(n)$ be a polynomial to choose later, and let $\ell := \log(m+1)$.
- Let $u_1, \ldots, u_\ell \gets \{0,1\}^n$ be fully independent, and let $r_1, \ldots, r_m$ be the pairwise independent $n$-bit random strings derived from them as in the sampling fact above.
- For each $k \in [\ell]$, sample a guess bit $b_k \gets \{0,1\}$ uniformly. For each $j \in [m]$, compute the pairwise independent bits $b^j$ from $(b_1, \ldots, b_\ell)$ in the same way as $r_j$ (so that $b^j := \bigoplus_{k \in S} b_k$ for $r_j = r_S$, and thus $b^j = \langle x, r_j \rangle$ for all $j$ whenever $b_k = \langle x, u_k \rangle$ for all $k$).
- For each $i \in [n]$,
  - For each $j \in [m]$,
    - Run $g^j_i \gets b^j \oplus \mathcal{A}(1^n, y, r_j \oplus e^i)$.
  - Let $x_i$ be the majority of $(g^1_i, \ldots, g^m_i)$.
- Output $x := (x_1, \ldots, x_n)$.
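Putting the pieces together, a sketch of $\mathcal{B}$ in Python, again with the assumed adversary modeled as a callable `A(y, r)` (the algorithm succeeds only with the probability analyzed below, e.g., when the guesses are lucky):

```python
import secrets

def invert_gl(A, y: int, n: int, m: int) -> int:
    # Full Goldreich-Levin inverter: guess only ell = ceil(log2(m+1)) inner
    # products <x, u_k>, expand them into m pairwise independent pairs
    # (r_j, b_j), and recover each bit x_i by majority vote.
    ell = m.bit_length()
    u = [secrets.randbits(n) for _ in range(ell)]   # fully independent strings
    b = [secrets.randbits(1) for _ in range(ell)]   # guesses for <x, u_k>
    rs, bs = [], []
    for S in range(1, m + 1):                       # nonempty subsets of [ell]
        r_j, b_j = 0, 0
        for k in range(ell):
            if (S >> k) & 1:
                r_j ^= u[k]
                b_j ^= b[k]
        rs.append(r_j)
        bs.append(b_j)
    x = 0
    for i in range(n):
        e_i = 1 << i
        votes = sum(bs[j] ^ A(y, rs[j] ^ e_i) for j in range(m))
        if 2 * votes > m:                           # majority of the m votes
            x |= 1 << i
    return x
```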
We begin with a claim on the number of good instances $x$.
Good instances are plenty.
Define $G$ to be the set of good instances,
$$G := \left\{ x \in \{0,1\}^n \;:\; \Pr\big[r \gets \{0,1\}^n : \mathcal{A}(1^n, f(x), r) = \langle x, r \rangle\big] \geq \frac{1}{2} + \frac{\epsilon(n)}{2} \right\},$$
where $\epsilon(n) := 1/p(n)$. If the Full Assumption holds, then $|G| \geq \frac{\epsilon}{2} \cdot 2^n$.
(The proof is almost the same and omitted.)
We condition on the good event that $x \in G$. Next, we condition on the "lucky event" that for all $k \in [\ell]$, the guess $b_k$ equals $\langle x, u_k \rangle$, which happens w.p. $2^{-\ell} = \frac{1}{m+1}$. That implies $b^1, \ldots, b^m$ are all correct: for any $j$, let $S$ be the $j$-th nonempty subset of $[\ell]$, and we have that
$$b^j = \bigoplus_{k \in S} b_k = \bigoplus_{k \in S} \langle x, u_k \rangle = \left\langle x, \bigoplus_{k \in S} u_k \right\rangle = \langle x, r_j \rangle.$$
With the conditioning, for any $i$, $(r_1, \ldots, r_m)$ are still pairwise independent, and thus $(g^1_i, \ldots, g^m_i)$ are pairwise independent as well; moreover, each $g^j_i$ is correct w.p. at least $\frac{1}{2} + \frac{\epsilon}{2}$ (since $r_j \oplus e^i$ is uniform and $x \in G$). Therefore, by Chebychev's inequality, the majority of $(g^1_i, \ldots, g^m_i)$ equals $x_i$ w.p.
$$\Pr[\mathsf{maj}_i] \geq 1 - \frac{1}{4 \delta^2 m} = 1 - \frac{1}{\epsilon^2 m},$$
where $\delta := \epsilon / 2$, and $\mathsf{maj}_i$ denotes the event that the majority (and hence $x_i$) is output correctly. Choosing $m := \frac{2n}{\epsilon^2}$, we have that $\Pr[\mathsf{maj}_i] \geq 1 - \frac{1}{2n}$.
The above shows that each bit $x_i$ is correctly recovered with a good marginal probability. We want all bits $x_1, \ldots, x_n$ to be correct with a good joint probability, but the different $x_i$'s come from the same randomness $u_k$'s and $b_k$'s. (It is necessary to use the same $u_k$'s because we want the number of guessed bits to be strictly logarithmic in $n$.)
Fortunately, the union bound works even for dependent events. Taking the union bound over all $i \in [n]$, $\mathcal{B}$ outputs the correct $x$ w.p. at least $1 - n \cdot \frac{1}{2n} = \frac{1}{2}$, conditioning on $x \in G$ and all $b_k$'s being correct. Removing the conditional events* multiplies in the factors $\frac{\epsilon}{2}$ and $\frac{1}{m+1}$, but $\mathcal{B}$ still inverts $f'$ w.p. at least $\frac{\epsilon}{2} \cdot \frac{1}{m+1} \cdot \frac{1}{2}$, which is non-negligible, contradicting that $f'$ is a OWF.
* For any events $A, B, C$, we have $\Pr[A \wedge B \wedge C] = \Pr[A \mid B \wedge C] \cdot \Pr[B \mid C] \cdot \Pr[C]$. The above applied this with $A$ being the event that $\mathcal{B}$ outputs the correct $x$, $B$ being the event $x \in G$, and $C$ being the event that $b_k$ is correct for all $k \in [\ell]$.
Discuss: The number of bits we guessed is $\ell = \log(m+1) = O(\log n)$, where $m$ depends on the (hypothetical) NUPPT $\mathcal{A}$. Since the guessed bits entail information about $x$, the proof formally implies that for any OWF, there must be $\omega(\log n)$ bits that are hard to invert (from $f(x)$ back to $x$). Still, having an efficient and uniform attack is non-trivial. To put it in another view: the output of the adversary $\mathcal{A}$ gives a probabilistic bit conveying information about $x$, and the reduction is to learn $x$ by repeatedly querying $\mathcal{A}$ with carefully chosen inputs, so that the probability of finding the correct $x$ is high while the time and the number of queries are small. Hence, the reduction is related to learning at a high level.
Discuss: How far does the Hard-Core Lemma extend? Suppose $f$ is a OWF, and suppose $h$ is a hard-core predicate for $f$.
- Is a OWF?
- Let , and let . Is a OWF? If so, is a hard-core predicate for ?
- Let , and let . Is a OWF? If so, is a hard-core predicate for ?
The questions are highly relevant when we want to construct PRG from any one-way function.
Min-Entropy and Leftover Hash Lemma
Definition: statistical difference
For random variables $X$ and $Y$ taking values in $\mathcal{U}$, their statistical difference (also known as variation distance) is
$$\Delta(X, Y) := \max_{T \subseteq \mathcal{U}} \big| \Pr[X \in T] - \Pr[Y \in T] \big|.$$
We say that $X$ and $Y$ are $\epsilon$-close if $\Delta(X, Y) \leq \epsilon$.
[V, Definition 6.2, p169]
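Since the maximizing $T$ is $\{z : \Pr[X = z] > \Pr[Y = z]\}$, the statistical difference equals half the $\ell_1$ distance between the probability mass functions. A small Python sketch over finite distributions (represented as dictionaries):

```python
def statistical_difference(p: dict, q: dict) -> float:
    # Delta(X, Y) = max_T |Pr[X in T] - Pr[Y in T]|; the maximum is attained
    # at T = {z : p(z) > q(z)}, which gives (1/2) * the L1 distance.
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)

# Example: a coin with Pr[1] = 3/4 is 0.25-close to a uniform bit.
assert statistical_difference({0: 0.25, 1: 0.75}, {0: 0.5, 1: 0.5}) == 0.25
```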
Fact: (Statistically close to uniform, warmup)
Let $n \in \mathbb{N}$, $\epsilon \in [0, 1]$, and let $X$ be a random variable over $\{0,1\}^n$. If $X$ is $\epsilon$-close to $U_n$, then
$$|\mathrm{Supp}(X)| \geq (1 - \epsilon) \cdot 2^n,$$
where $\mathrm{Supp}(X) := \{x : \Pr[X = x] > 0\}$ denotes the support of $X$.
Definition: Min-entropy
Let $X$ be a random variable. Then the min-entropy of $X$ is:
$$H_\infty(X) := \min_x \log \frac{1}{\Pr[X = x]},$$
where all logs are base 2.
[V, Definition 6.7, p171]
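A one-line computation over a finite distribution (represented as a dictionary) may help fix the definition:

```python
import math

def min_entropy(p: dict) -> float:
    # H_inf(X) = min_x log2(1 / Pr[X = x]) = -log2(max_x Pr[X = x]).
    return -math.log2(max(p.values()))

# A uniform n-bit string has min-entropy n; e.g., a uniform byte has 8 bits.
assert min_entropy({z: 1 / 256 for z in range(256)}) == 8.0
# A coin with Pr[1] = 3/4 has min-entropy -log2(3/4), about 0.415 bits.
```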
Example:
We say a function $f : \{0,1\}^n \to \{0,1\}^n$ is $r$-regular if it is $2^r$-to-one for every input, i.e., for all $x \in \{0,1\}^n$, it holds that $|f^{-1}(f(x))| = 2^r$. We have $H_\infty(f(U_n)) = n - r$. Let $Y := f(U_n)$, and let $k := n - r$ (so that $H_\infty(Y) = k$).
Suppose someone secretly sampled $x \gets \{0,1\}^n$ and computed $y := f(x)$. We have the following:
- Given nothing, we can only guess $x$ correctly w.p. $2^{-n}$.
- Given nothing, we can only guess $y$ correctly w.p. $2^{-k}$, by the min-entropy of $Y$.
- Given $y$, we can guess $x$ correctly w.p. $2^{-r}$ (by outputting an arbitrary preimage of $y$). Thus, $r$ is viewed as the min-entropy of $x$ given $y$ (we avoid the formalization of conditional min-entropy).
Theorem: Leftover Hash Lemma [Mazor-Pass 2023, Lemma 3.4]
Let $n, k, d \in \mathbb{N}$ such that $d \leq k \leq n$, and let $X$ be a random variable over $\{0,1\}^n$ such that $H_\infty(X) \geq k$. Let $i := k - d$, let $M \gets \{0,1\}^{n \times n}$ be a uniformly random matrix, and let
$$Y := (M \cdot X)_{1 \ldots i}.$$
Then, $(M, Y)$ and $(M, U_i)$ are $2^{-d/2}$-close,
where $M \cdot X$ denotes the matrix multiplication modulo 2, $(M \cdot X)_{1 \ldots i}$ denotes the first $i$ bits of $M \cdot X$, and $U_i$ denotes the uniform distribution over $\{0,1\}^i$.
(See [Mazor-Pass 2023] for a simple proof from Goldreich-Levin’s hard-core lemma.)
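The extractor in the LHL is just a truncated matrix-vector product over GF(2); a minimal Python sketch (rows stored as bitmasks):

```python
import secrets

def random_matrix(n: int) -> list[int]:
    # A uniformly random n-by-n matrix over GF(2), stored as n row bitmasks.
    return [secrets.randbits(n) for _ in range(n)]

def extract(M: list[int], x: int, i: int) -> int:
    # (M . x)_{1..i}: the matrix-vector product modulo 2, truncated to the
    # first i bits. By LHL, this is close to uniform when H_inf(X) >= i + d.
    y = 0
    for row in range(i):
        bit = bin(M[row] & x).count("1") % 2   # <M_row, x> mod 2
        y |= bit << row
    return y
```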
Example:
Consider a distribution $X$ over $\{0,1\}^n$ such that $p_X(x) \leq 2^{-k}$ for all $x$ and some $k \leq n$ (so that $H_\infty(X) \geq k$). Let $M$ be the random matrix sampled as in LHL, and let $i := k - d$ be defined w.r.t. a parameter $d$ as in LHL. Sample $u \gets \{0,1\}^i$ uniformly at random. Then, for any set $T$, we have that
$$\big| \Pr[(M, (M \cdot X)_{1 \ldots i}) \in T] - \Pr[(M, u) \in T] \big| \leq 2^{-d/2},$$
where $p_X$ denotes the probability mass function of $X$.
Example:
Consider the distribution $X$, the random matrix $M$, and the parameter $i$ defined as above, and let $X'$ be another random variable over $\{0,1\}^n$ whose min-entropy is at least $k$ even conditioned on any value of $X$. Then, we have that the following two distributions
- $\big(M, \; (M \cdot X)_{1 \ldots i}, \; (M \cdot X')_{1 \ldots i}\big)$ and
- $\big(M, \; u, \; u'\big)$
are $2 \cdot 2^{-d/2}$-close, where $u$ and $u'$ are two independent and uniformly random $i$-bit strings.
(The proof is a simple hybrid.)
Notice that in the above example, $X$ and $X'$ did not need to be the same distribution, and they did not even need to be independent.
PRG from any OWF
In this subsection, we begin by describing the construction of Mazor and Pass [MP23].
Construction: Mazor-Pass PRG
Let $f : \{0,1\}^n \to \{0,1\}^n$ be a function for all $n \in \mathbb{N}$. Define the following functions.
- $g(x) := f(x) \| (M \cdot x)$, where $x \in \{0,1\}^n$, and $M \in \{0,1\}^{n \times n}$ is a matrix that will be given later. Looking forward, we will compute $g$ for different $x$'s but the same $M$.
- $g^{(t)}(x_1, \ldots, x_t) := g(x_1) \| g(x_2) \| \cdots \| g(x_t)$, which simply repeatedly applies $g$ on independent strings and concatenates the outputs.
The function $G$ is defined by the following (deterministic) algorithm, where
- $t$ and $\ell$ are parameters to be determined later,
- $s \in \{0, 1, \ldots, 2n - \ell\}$ is also a parameter,
- $x_1, \ldots, x_t$, $M$, and $s$ are all inputs ($s$ is represented in an $O(\log n)$-bit binary string).
Function $G(x_1, \ldots, x_t, M, s)$:
- For all $j \in [t]$, compute $\rho_j := g(x_j) = f(x_j) \| (M \cdot x_j)$. This is called a "row," which consists of $2n$ bits.
- For each $j \in [t]$, remove from $\rho_j$ the prefix $s$ bits and the suffix $2n - s - \ell$ bits; call the resulting $\ell$-bit string $z_j$.
- Define $z[i]$ to be the $i$-th bit of $z$ for any $\ell$-bit string $z$.
- For each $i \in [\ell]$, define $w_i := z_1[i] \| z_2[i] \| \cdots \| z_t[i]$. That is, $w_i$ is the concatenation of the $i$-th bits of all $z_j$'s, and thus each $w_i$ is $t$-bit.
- Output $w_1 \| w_2 \| \cdots \| w_\ell$.
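For concreteness, the following Python sketch mirrors the pseudocode above; the bit-packing conventions and the placeholder `f` are illustrative assumptions, not the notation of [MP23]:

```python
def G(f, xs: list[int], M: list[int], s: int, n: int, ell: int) -> list[int]:
    # Sketch of the algorithm above. `f` is a placeholder OWF on n-bit
    # inputs; xs = (x_1..x_t); M is an n-by-n GF(2) matrix as row bitmasks;
    # s is the shift; ell is the trimmed row length.
    t = len(xs)
    zs = []
    for x in xs:
        Mx = 0
        for r in range(n):                        # M . x modulo 2, bit by bit
            Mx |= (bin(M[r] & x).count("1") % 2) << r
        rho = f(x) | (Mx << n)                    # row rho_j := f(x_j) || M.x_j
        zs.append((rho >> s) & ((1 << ell) - 1))  # drop s-bit prefix, keep ell
    ws = []
    for i in range(ell):                          # transpose: w_i collects the
        w = 0                                     # i-th bit of every row
        for j in range(t):
            w |= ((zs[j] >> i) & 1) << j
        ws.append(w)
    return ws
```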
We claim that if $f$ is a OWF, then the function $G$ defined as above is a PRG. Clearly, $G$ is easy to compute if $f$ is. For expansion, the input and output sizes of $G$ are:
- Input: $tn + n^2 + O(\log n)$ bits, for the $t$ strings $x_j$'s, the matrix $M$, and the shift $s$.
- Output: $t \cdot \ell$ bits. Since $\ell \geq n + 1$, we have that the output size is at least $tn + t$.
Choosing a sufficiently large $t := \mathrm{poly}(n)$, we obtain the expansion, as the difference between the output and input sizes is at least $t - n^2 - O(\log n) > 0$.
It remains to show pseudorandomness.