
Cryptography (1) One-Way Function

A one-way function is a function that is easy to compute but hard to invert. The existence of one-way functions is a necessary condition for the existence of modern cryptographic schemes.

Some Information about the Course

In this course, we mainly talk about the foundations of cryptography. We discuss the big picture of cryptography rather than cryptosystems based on specific assumptions. Basic knowledge of probability theory and computational complexity is necessary for this course.

This series of notes mainly follows Goldreich’s legendary book Foundations of Cryptography. With some basic knowledge of computational complexity, one would enjoy the book a lot. (Though it is still difficult to learn from.)

Furthermore, since Goldreich’s book was finished many years ago, newer research results are not included in it. Results found in recent years might be added to these notes by me.

Thanks to Mr. Yu and Mrs. Liu at SJTU for giving us wonderful cryptography courses.

Some Information about the Symbols

We are not going to write too many conditions in a probabilistic formula. We sometimes write conditions as distributions under the symbol $\Pr$.

As for the notation of a probability distribution like $U_n$, it has two meanings. First, it denotes the distribution itself. E.g., a random variable $X$ sampled from $U_n$ is denoted by $X\leftarrow U_n$; then for each string $x\in \lbrace 0,1\rbrace^n$ we have $$ \Pr[U_n=x]=\frac 1 {2^n} $$ In other situations, this kind of notation can represent an anonymous random variable sampled from the distribution it denotes. We must make it clear that within the same math block, the same distribution symbol $D$ always represents the same anonymous random variable sampled from $D$. For different anonymous random variables sampled from the same distribution $D$, we use superscripts to distinguish them, e.g., $D^{(1)}, D^{(2)},\cdots,D^{(m)}$.

One-Way Function

Strong One-way Function

The existence of cryptography is based on $\mathbf P\neq \mathbf{NP}$. But $\mathbf P\neq \mathbf{NP}$ is not enough; cryptography asks for a stronger precondition, namely the existence of (strong) one-way functions.

Strong One-way Function (OWF)

A function $f:\lbrace 0,1\rbrace ^\ast\to \lbrace 0,1\rbrace ^\ast$ is a strong one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\lbrace 0,1\rbrace ^n$ we have $\mathsf{EvalF}(x) = f(x)$.
  2. Hard to invert: There is no probabilistic polynomial-time algorithm that can invert the function with probability better than $\mathsf{negl}(n)$.

The second property can be restated more formally as

There is no PPT algorithm $\mathcal {A}$ such that $$ \Pr[\mathcal A(1^n,f(U_n))\in f^{-1}(f(U_n))]>\frac{1}{n^c} $$ holds for some constant $c>0$ and infinitely many $n$.

When you see a distribution being used as a random variable in a probabilistic formula, it means you first sample once from that distribution and use the result as a random variable. It’s easy to understand this: we do it just because we don’t want to give the random variable an explicit name.

! Another thing to pay attention to is that $f$ is defined for all $n$. Or, as we say, $f$ is uniform.

As shown in the definition, a strong one-way function cannot be inverted efficiently in the average case. That is, the function might be inverted by an algorithm once in a while, but it is hard to invert on average.

Theorem

If strong one-way functions exist, then $\mathbf{P}\neq \mathbf{NP}$.

Proof: Suppose $\mathbf{P}=\mathbf{NP}$ and there exists a one-way function $f:\lbrace 0,1\rbrace ^\ast\to \lbrace 0,1\rbrace ^\ast$. Consider deciding $$ L=\lbrace\langle x^\ast,y\rangle\mid \exists x:x^\ast\sqsubset x \text{ and } f(x)=y\rbrace $$ $L$ is the collection of tuples $\langle x^\ast, y\rangle$ such that $x^\ast$ is the prefix of some $x$ with $f(x)=y$. Clearly $x$ itself is a certificate, so $L\in \mathbf{NP}$. But we have $\mathbf{P}=\mathbf{NP}$, so for every $y$ we can recover $x$ bit by bit. This can be done by calling the oracle $\mathcal O_L$. To invert $y$, we first ask $\langle 0, y\rangle$. If it returns $0$ then we ask $\langle 10, y\rangle$; otherwise we ask $\langle 00, y\rangle$.
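The following is a minimal sketch of this bit-by-bit recovery in Python; `prefix_oracle` is a hypothetical helper standing in for the polynomial-time decider of $L$ that would exist under $\mathbf P=\mathbf{NP}$.

```python
# A minimal sketch, assuming a hypothetical poly-time decider
# prefix_oracle(prefix, y) for L = { <x*, y> : exists x with prefix x*
# such that f(x) = y }.

def invert(y: str, n: int, prefix_oracle) -> str:
    """Recover some x in f^{-1}(y) with |x| = n, one bit at a time."""
    x = ""
    for _ in range(n):
        # If some preimage of y extends x + "0", commit to bit 0;
        # otherwise some preimage must extend x + "1".
        x += "0" if prefix_oracle(x + "0", y) else "1"
    return x
```

The inverter makes only $n$ oracle queries, so if $L$ were decidable in polynomial time, $f(U_n)$ could be inverted with probability $1$, contradicting one-wayness.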


One-way functions seem to be a very strong assumption; we improve the situation a little by introducing a weak version of one-way functions.

Weak One-Way Function

Weak One-Way Function

A function $f:\lbrace 0,1\rbrace ^\ast\to \lbrace 0,1\rbrace ^\ast$ is a weak one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\lbrace 0,1\rbrace ^n$ we have $\mathsf{EvalF}(x) = f(x)$.

  2. Slightly hard to invert: There exists a polynomial $p(\cdot)$ such that for every PPT algorithm $\mathcal A$ and all sufficiently large $n$’s, $$ \Pr[\mathcal A(f(U_n),1^n)\notin f^{-1}(f(U_n))]>\frac 1{p(n)} $$

See, it’s not “very hard” to invert a weak one-way function. But pay attention that every algorithm shares the same $p(\cdot)$ in the definition. It means that every algorithm can invert the function correctly only with probability bounded above by $1-\frac1{p(n)}$. We can regard this as a “gap”.

It seems that weak one-way functions are useless, since there might be an adversary that almost always inverts them. However, later we’ll see an amazing result about them.

From another view, weak one-way functions are not too weak. The failure probability of a weak one-way function is essentially different from that of a PPT algorithm $\mathcal A$ solving some $L\in\mathbf{BPP}$. In the latter case, since every $x\in L$ can be decided by $\mathcal A$ with large probability, we can decrease the failure probability to negligible by repeating $\mathcal A$ with different coin tosses. But in the former case, even repeating polynomially many times cannot decrease the gap. It says there are instances that cannot be solved by any PPT algorithm.

One-Way Functions Defined for Some Lengths

We can define one-way functions for some lengths only. Let $I\subseteq \mathbb N$ be a set of natural numbers. We define the successor of $n$ in $I$ as the minimal number in $I$ that is larger than $n$, denoted by $s_I(n)$.

E.g. Let $I= \lbrace 2,3,7,9,10,12,14,\cdots\rbrace $, then $s_I(8)=9$.

$I$ is called polynomial-time-enumerable if there exists a $poly$-time Turing machine that on input $n$ outputs $1^{s_I(n)}$.

To modify the definition of strong/weak one-way functions, change “every $n$” to “every $n\in I$” to get the definition of strong/weak one-way functions for some lengths.

Fact: A one-way function for a polynomial-time-enumerable $I$ can be easily transformed into an ordinary one-way function.

Theorem

Let $f(\cdot)$ be a one-way function defined for lengths in a polynomial-time-enumerable set $I$. Let $x=x^\prime x^{\prime\prime}$, where $x^\prime$ is the longest prefix of $x$ with length in $I$. Define $$ g(x)=f(x^\prime)x^{\prime\prime} $$ Then $g(x)$ is a one-way function.

Proof: The proof is simple and omitted here, but we review its basic idea. The technique used here is reduction. Suppose there is an algorithm $\mathcal A$ that inverts $g(\cdot)$ efficiently; then we can construct an efficient algorithm $\mathcal A^\prime$ that invokes $\mathcal A$ polynomially many times and inverts $f(\cdot)$ efficiently.

! This is the basic idea for strong one-way functions. As for weak one-way functions, the basic idea is similar.
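The following is a minimal sketch of the construction of $g$, assuming a hypothetical helper `longest_prefix_len` that returns the largest $l\in I$ with $l\le n$; it runs in polynomial time because $I$ is polynomial-time-enumerable.

```python
# A sketch of g(x) = f(x') x''. `f` is the given one-way function for
# lengths in I; `longest_prefix_len(n)` (an assumed helper) returns the
# largest l in I with l <= n.

def g(x: str, f, longest_prefix_len) -> str:
    l = longest_prefix_len(len(x))    # |x'|: longest prefix with length in I
    x_prime, x_dprime = x[:l], x[l:]  # split x = x' x''
    return f(x_prime) + x_dprime      # g(x) = f(x') x''
```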

Length-Regular & Length-Preserving One-Way Function

The following two definitions apply to both strong and weak one-way functions.

Length-Regular One-Way function

A one-way function is length-regular if for every $|x_1|=|x_2|$, it holds that $|f(x_1)|=|f(x_2)|$.

Length-Preserving One-Way function

A one-way function is length-preserving if for every $x$, it holds that $|x|=|f(x)|$.

Given a one-way function, we can construct a length-preserving one-way function by the following two steps. Given a one-way function $f(\cdot)$, we first construct a length-regular function $g$ by adding the pad $10^\ast$: $$ g(x)\doteq f(x)10^{p(|x|)-|f(x)|} $$ where $p(\cdot)$ is a polynomial bounding $|f(x)|$ for $|x|=n$; then $p(|x|)+1$ is the fixed output length of $g(x)$. Then we transform $g$ into a length-preserving one-way function $g^\prime$ as follows: $$ g^\prime(x^\prime x^{\prime\prime})\doteq g(x^\prime), \quad\text{where } |x^\prime x^{\prime\prime}|=p(|x^\prime|)+1 $$ We are done. The one-wayness of $g^\prime$ can also be proved by reduction.
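Here is a minimal sketch of the two steps, assuming $p(\cdot)$ is a polynomial with $p(n)\geq|f(x)|$ for all $|x|=n$, and an assumed helper `split_len` that recovers $|x^\prime|$ from the input length $p(|x^\prime|)+1$.

```python
# A sketch of the two-step transformation on bit strings.

def g(x: str, f, p) -> str:
    """Step 1 (length-regular): pad f(x) with 1 0^k to total length p(|x|)+1."""
    fx = f(x)
    return fx + "1" + "0" * (p(len(x)) - len(fx))

def g_prime(x: str, f, p, split_len) -> str:
    """Step 2 (length-preserving): read the input as x'x'' with
    |x'x''| = p(|x'|) + 1 and apply g to x' only; split_len is an
    assumed helper recovering |x'| from the input length."""
    l = split_len(len(x))
    return g(x[:l], f, p)
```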

But the previous technique might not preserve the injectivity of a one-way function. For an injective one-way function $f$, the first step does construct an injective length-regular one-way function, but the second step breaks injectivity.

Non-uniformly One-Way Function

As we know, $\mathbf{BPP}\subseteq \mathbf{P}_{/\text{poly}}$, so a one-way function might still be inverted efficiently by a non-uniform algorithm. If we design a cryptosystem based on a one-way function, it might still be broken by non-uniform algorithms. How strong should your definition of a cryptosystem be? It’s up to you. You might even define security to mean it can’t be efficiently broken by quantum computers…

A non-uniformly one-way function is a stronger version of a one-way function: it is a one-way function with non-uniform one-wayness, which means it can’t be inverted efficiently by a non-uniform algorithm. Notice that non-uniform algorithms are deterministic when they are represented by circuit families.

Non-uniformly One-Way Function

A function $f:\lbrace 0,1\rbrace ^\ast\to \lbrace 0,1\rbrace ^\ast$ is a non-uniformly one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\lbrace 0,1\rbrace ^n$ we have $\mathsf{EvalF}(x) = f(x)$.

  2. Hard to invert: For every $poly$-size circuit family $\lbrace C_n\rbrace $, every polynomial $p(\cdot)$, and all sufficiently large $n$, it holds that $$ \Pr[C_n(f(U_n))\in f^{-1}(f(U_n))]<\frac 1 {p(n)} $$

Look: to compute the function, a $\mathbf {BPP}$ algorithm is enough, but not even a non-uniform circuit family of $poly$-size can invert it.

Weak One-Way Functions Imply Strong Ones

Theorem

Weak one-way functions exist if and only if strong one-way functions exist.

Not all weak one-way functions are strong one-way functions; refer to Goldreich’s textbook for a counterexample. The $\Leftarrow$ direction is trivial, so we focus on proving the $\Rightarrow$ direction.

Let $f$ be a weak one-way function with $p(\cdot)$ as guaranteed by the definition. Then we construct $$ g(x_1,\cdots,x_{t(n)})\doteq (f(x_1),\cdots,f(x_{t(n)})) $$

where $t(n)=n\cdot p(n)$ .
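As a sketch, the construction is just $t(n)$ independent applications of $f$:

```python
# g applies the weak one-way function f independently to t(n) = n * p(n)
# blocks; p is the polynomial from the weak one-wayness of f.

def t(n: int, p) -> int:
    return n * p(n)                 # number of parallel copies

def g(xs: list[str], f) -> list[str]:
    """g(x_1, ..., x_t) = (f(x_1), ..., f(x_t)), each x_i of length n."""
    return [f(x) for x in xs]
```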

Naive Proof

It is obvious that $g$ is easy to compute. As for inversion, one might think the success probability of any inverter is at most $(1-\frac{1}{p(n)})^{n\cdot p(n)}$, which is negligible.

However, there’s a severe mistake in this proof: one has opened up the adversary and limited its strategy to inverting every $f(x_i)$ independently. This alerts us not to open up the adversary in proofs of universal security.

To see that this kind of proof gives wrong results, let’s consider a simple example. Suppose $f,g$ are one-way functions. Is $h(x)=(f(x), g(x))$ a one-way function? By the naive argument, any PPT $\mathcal A$ inverts $f(x)$ and $g(x)$ only with negligible probabilities $\epsilon_1,\epsilon_2$ respectively; then it inverts $h$ with probability at most $1-(1-\epsilon_1)(1-\epsilon_2)=\epsilon_1+\epsilon_2-\epsilon_1\epsilon_2$, which is negligible. However, for a suitable length-preserving one-way function $f$, one can prove that $g(x)=f(x)\oplus x$ is a one-way function as well. Let $h(x)=(f(x),g(x))$; an adversary can easily invert $h$ by computing $f(x)\oplus g(x)=x$.
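A toy illustration of the leak (on integers, with bitwise XOR): whatever the one-wayness of the two components, the pair reveals $x$.

```python
# Toy illustration: even if f and g(x) = f(x) XOR x are each one-way,
# h(x) = (f(x), g(x)) is trivially invertible.

def h(x: int, f) -> tuple[int, int]:
    return f(x), f(x) ^ x   # (f(x), g(x))

def invert_h(fx: int, gx: int) -> int:
    return fx ^ gx          # f(x) XOR g(x) = x
```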

Correct Proof

The fact is that the construction is right but the proof is wrong. A detailed correct proof is given in the following. Suppose we have an algorithm $\mathcal B$ that inverts $g$ efficiently, that is, $$ \Pr[\mathcal B(g(U_m))\in g^{-1}(g(U_m))]>\frac 1{q(m)} $$ Then, for $m=n\cdot t(n)=n^2p(n)$, $$ \Pr[\mathcal B(g(U_{n^2p(n)}))\in g^{-1}(g(U_{n^2p(n)}))]>\frac{1}{q(n^2p(n))} $$

Constructing the Inversion Procedure

To complete the proof, we use the paradigm of universal-security proofs, namely reduction. Assume that $g$ is not a strong one-way function; we construct an algorithm that inverts $f$ with probability greater than $1-1/p(n)$. Let $\mathcal B$ be the PPT algorithm that inverts $g$ with non-negligible probability.

Now let’s brainstorm. The counterexample to the naive proof shows that inverting each $f(x)$ independently might not be as good as placing $f(x)$ into $g(x_1,\cdots, x_{t(n)})=(f(x_1),\cdots,f(x_{t(n)}))$ and asking $\mathcal B$ to invert it jointly. Thus, we come up with the procedure $I$, shown as follows:

Procedure $I$

Input: $y$

  1. $n\leftarrow |y|$
  2. for $i=1$ to $t(n)$ do
    1. $x_1,\cdots,x_{t(n)}\leftarrow U_n$
    2. $z_1,z_2,\cdots,z_{t(n)}\leftarrow \mathcal B(f(x_1), \cdots, f(x_{i-1}), y,f(x_{i+1}),\cdots,f(x_{t(n)}))$
    3. if $f(z_i)=y$ then return $z_i$
  3. end for
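A runnable sketch of procedure $I$, under the simplifying assumption that $f$ is length-preserving (so $n=|y|$); `B` stands for the assumed inverter of $g$.

```python
import random

# One pass of procedure I. `B` takes a list of t images and returns a
# candidate preimage tuple; t = t(n) = n * p(n).

def procedure_I(y: str, f, B, t: int):
    n = len(y)                       # assuming f is length-preserving
    for i in range(t):
        xs = ["".join(random.choice("01") for _ in range(n)) for _ in range(t)]
        ys = [f(x) for x in xs]
        ys[i] = y                    # plant y in the i-th slot
        zs = B(ys)                   # ask B to invert g jointly
        if f(zs[i]) == y:
            return zs[i]             # success: a preimage of y
    return None                      # this pass failed; the caller repeats a(n) times
```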

Why don’t we simply repeat an algorithm that inverts $f(x)$ for random $x$ with probability $1/p(n)$? One might think of the error reduction for $\mathbf{BPP}$, i.e., $\mathbf{BPP}(1/2-1/n^c)=\mathbf{BPP}(2^{-n^d})$. However, this is not the case here. $\mathbf{BPP}$ stands for those problems where every input $x$ can be solved in PPT. For a weak one-way function, there might be a small, very hard fraction of inputs $x$ on which $f(x)$ cannot be inverted in PPT. As we’ll see, procedure $I$ dilutes this hard fraction. It is obvious that if any of $(f(x_1),\cdots,f(x_{i-1}), y,f(x_{i+1}),\cdots,f(x_{t(n)}))$ lies in the very hard fraction, then $\mathcal B$ cannot invert the tuple with probability better than $1/p(n)$.

To invert $y$, we run procedure $I$ on $y$ up to $a(n)$ times, hoping to get a right $x$ with $y=f(x)$ with good probability. We say $x$ is good if $x$ satisfies $$ \Pr[I(f(x))\in f^{-1}(f(x))]> \frac{n}{a(n)} $$ Otherwise, we say $x$ is bad. For a good $x$, repeating the procedure $a(n)$ times leaves failure probability $$ \left(1-\frac{n}{a(n)}\right)^{a(n)}< \frac1{2^n} $$ which can be proved via $e^{-1}\geq (1-1/x)^{x}$. Thus, if one proves that the good $x$’s contribute a $1-\frac1{2p(n)}$ fraction of $\lbrace 0,1\rbrace^n$, one achieves the result. (This is the best interpretation I have for how the value $a$ comes up. If you have a better one, please remind me.) Moreover, we denote $$ S_n\doteq \lbrace x:\Pr[I(f(x))\in f^{-1}(f(x))]>\frac{n}{a(n)}\rbrace $$

Notice that for $x\in S_n$, no matter how large the polynomial $a(n)$ is, we can invert $f(x)$ with overwhelming probability. Though this fact also holds for any inverter that inverts $f(x)$ directly with success probability $1/p(n)$, we’ll see later that $I$ succeeds on a much larger set of inputs.

Averaging Arguments for Successful Inversions

Let us assume that $$ s(n)\doteq \Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))]>\frac{1}{q(n\cdot t(n))} $$ by the assumption that $g$ is not a strong one-way function. (Notice that for any $N>0$, such an $n>N$ can be found; i.e., this holds for infinitely many $n$.)

Now we focus on the size of $S_n$. Since $S_n$ collects all the good inputs, if the input $\left(f(x_1),\cdots,f(x_{t(n)})\right)$ to $g$ is such that all $x_i$ come from $S_n$, then it is easy to invert; otherwise, it is hard to invert. So if we upper-bound the probability that a tuple containing a bad $x_i$ is inverted by $\mathcal B$, we obtain a lower bound on the probability that a tuple containing no bad $x_i$ is inverted.

We separate $\lbrace 0,1\rbrace^{n\cdot t(n)}$ into two parts: one where one or more of the $x_i$’s are bad, and the other where none are. We bound the probability that strings in these two parts are inverted by $\mathcal B$, that is, $$ s_1(n)\doteq \Pr [\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(\exists i: U^{(i)}_n\notin S_n)] $$ and $$ s_2(n)\doteq \Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(\forall i: U^{(i)}_n\in S_n)] $$ It is clear that $s(n)=s_1(n)+s_2(n)$.

Before giving the actual bound, let’s do a mental experiment and preview what’s happening next. We want a subset $S_n$ of $\lbrace 0,1\rbrace^n$ such that for every $x\in S_n$, $f(x)$ can be inverted with overwhelming probability. If this is achieved, it would be good for $S_n$ to contribute a large fraction of $\lbrace 0,1\rbrace^n$. To make $S_n$ large, $s_1$ has to be bounded by an averaging argument.

Probability Bound for $s_1(n)$

$s_1(n)$ can be bounded by following:

$$ \begin{align} s_1(n) &=\Pr[\exists i:\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n\notin S_n)] \newline & \leq \sum^{n\cdot p(n)}_{i=1}\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n\notin S_n)] \newline &\leq \sum^{n\cdot p(n)}_{i=1}\sum_{x\notin S_n}\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n= x)] \newline &= \sum^{n\cdot p(n)}_{i=1}\sum_{x\notin S_n}\Pr[U^{(i)}_n=x]\cdot\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x] \newline &\leq \sum^{n\cdot p(n)}_{i=1} \max_{x\notin S_n}\lbrace\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x]\rbrace \end{align} $$

Next, since whenever $\mathcal B$ inverts $\left(f(x_1),\cdots,f(x_{t(n)})\right)$ with $y=f(x)$ in the $i$th place, $I$ also inverts $y=f(x)$, it holds that $$ \Pr[I(f(x))\in f^{-1}(f(x))]\geq \Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x] $$ Thus, we have $$ \begin{align} s_1(n)&\leq \sum^{n\cdot p(n)}_{i=1} \max_{x\notin S_n}\lbrace\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x]\rbrace \newline &\leq \sum^{n\cdot p(n)}_{i=1} \max_{x\notin S_n}{\Pr[I(f(x))\in f^{-1}(f(x))]} \newline &\leq n\cdot p(n)\cdot \frac{n}{a(n)} \newline &= \frac{n^2\cdot p(n)}{a(n)} \end{align} $$ Next, we are going to prove that $|S_n|>(1-\frac{1}{2p(n)})\cdot 2^n$.

If we assume $|S_n|\leq (1-\frac 1 {2p(n)})\cdot 2^n$, then $$ \begin{align} s_2(n) &\leq \Pr[\forall i:U^{(i)}_n\in S_n]\newline &\leq \left(1-\frac{1}{2p(n)}\right)^{n\cdot p(n)}\newline &< \frac 1 {2^{n/2}} < \frac{n^2\cdot p(n)}{a(n)} \end{align} $$ holds for sufficiently large $n$. This yields a contradiction:

$$ \frac 1{q(n^2p(n))}<s(n)=s_1(n)+s_2(n)<\frac {2n^2p(n)}{a(n)} $$ Letting $a(n)=2n^2p(n)\cdot q(n^2p(n))$, the right-hand side equals $\frac1{q(n^2p(n))}$, which is absurd. Hence $$ |S_n|>(1-\frac 1 {2p(n)})\cdot 2^n $$

Lower-Bounding the Success Probability of $\mathcal A$

We now lower-bound the success probability of $\mathcal A$, the algorithm that runs procedure $I$ up to $a(n)$ times. For $x\in S_n$, the failure probability can be bounded by $$ \Pr\limits_{x\leftarrow U(S_n)}[\mathcal{A}(f(x))\notin f^{-1}(f(x))]<\left(1-\frac{n}{a(n)}\right)^{a(n)}<\frac 1{2^n} $$ Now it’s easy to bound the success probability of $\mathcal A$: $$ \begin{align} &\quad \Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n))] \newline &\geq \Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n)) \wedge(U_n\in S_n)] \newline &= \Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n))\mid U_n\in S_n]\cdot\Pr[U_n\in S_n] \newline &\geq (1-2^{-n})\cdot\left(1-\frac 1 {2p(n)}\right) > 1-\frac 1{p(n)} \end{align} $$ This contradicts the guarantee that every PPT algorithm fails to invert $f$ with probability at least $1/p(n)$. We are done.

Some Comments

As we can see, the algorithm $\mathcal I$ that inverts every $f(x_i)$ independently is only one of the algorithms that invert $g$, and it is not the best. If $\mathcal I$ were the best, we wouldn’t even need $t(n)$ to be as large as $n\cdot p(n)$; it would only need to be superlogarithmically larger than $p(n)$, so that $(1-1/p(n))^{t(n)}$ is negligible.

A Naive Example

To be added.

Summary for One-Way Function

As we have learned, there is a universal method to construct a strong one-way function from a weak one-way function. We are going to use the term “one-way function” without qualification in future discussions.

Furthermore, we’ve seen that the existence of one-way functions for polynomial-time-enumerable sets implies the existence of one-way functions for all lengths, and the existence of the latter implies the existence of length-regular and length-preserving one-way functions. We may use these attributes without declaration.

Universal One-Way Function

We can define a function $$ f_{\text{uni}}(\llcorner\boldsymbol M\lrcorner,x) \doteq(\llcorner\boldsymbol M\lrcorner,\boldsymbol M(x)) $$ where $\boldsymbol M$ is a Turing machine and $\llcorner\boldsymbol M\lrcorner$ is its description. This is called a universal one-way function, and it is obviously computable on a universal TM with a step counter. It is interesting that if any one-way function $g$ exists, then $f_{\text{uni}}$ is one-way as well.

Furthermore, we can use $f_{\text{uni}}$ to construct a polynomial-time-computable strong one-way function.
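A sketch of $f_{\text{uni}}$, assuming a helper `simulate` that runs the described machine on a universal TM for a bounded number of steps; the quadratic step bound is one arbitrary choice that keeps $f_{\text{uni}}$ polynomial-time computable.

```python
# f_uni(<M>, x) = (<M>, M(x)), evaluated with a step counter.
# `simulate(M_desc, x, steps)` is an assumed helper returning M's output
# after at most `steps` steps (or a fixed string on timeout).

def f_uni(M_desc: str, x: str, simulate) -> tuple[str, str]:
    return M_desc, simulate(M_desc, x, steps=len(x) ** 2)
```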

One-Way Function Family

(Also referred to as one-way functions as a collection.)

If we put some one-way functions together, we get a collection of one-way functions. This is meaningless unless we have a good way to compute them by a single efficient evaluation function. If such a function does exist, with some extra attributes, the collection becomes quite useful. Based on this idea, we define one-way function families.

One-Way Function Family

A collection of functions $\lbrace f_i:D_i\to\lbrace 0,1\rbrace ^\ast \mid i\in I\rbrace$ is called a strong (resp., weak) one-way function family if there exist three PPT algorithms $\mathcal G,\mathcal S,\mathcal E$ satisfying

  1. The generation algorithm $\mathcal G$: given the input $1^n$, outputs a random index $i$ from $I_n=I\cap \lbrace 0,1\rbrace ^n$.
  2. The sampling algorithm $\mathcal S$: given an index $i$, outputs a random value $x$ from $D_i$.
  3. The evaluation algorithm $\mathcal E$: given an index $i$ and $x\in D_i$, outputs $f_i(x)$ correctly.

and one-wayness:

  1. For every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$’s, it holds that

$$ \Pr\limits_{i\leftarrow \mathcal G(1^n); x\leftarrow\mathcal S(i)}[\mathcal A(i,f_{i}(x))\in f^{-1}_{i}(f_{i}(x))]<\frac 1 {p(n)} $$

The definition is straightforward, but there are still a few points to pay attention to.

  • In order to efficiently sample from $I\cap \lbrace 0,1\rbrace ^n$ for all sufficiently large $n$, the size of $I$ has to be noticeable, which means $|I\cap \lbrace 0,1\rbrace ^n|\geq 2^n/\text{poly}(n)$. The generation algorithm $\mathcal G$ doesn’t have to sample uniformly at random from $I\cap \lbrace 0,1\rbrace ^n$. Likewise, the output of the sampling algorithm $\mathcal S$ on input $i$ is not necessarily distributed uniformly over $D_i$.

  • The three algorithms $\mathcal G,\mathcal S,\mathcal E$ are allowed to fail with negligible probability. This doesn’t affect their cryptographic usage.

  • Have you noticed something weird in the definition? The definition of a “one-way function family” doesn’t require every function in it to be a one-way function! It allows that, for a negligible fraction of $i\in I\cap \lbrace 0,1\rbrace^n$, $f_i$ is not a one-way function.

Here we recall error reduction for $\mathbf{RP}$ again. We can always construct a generation or sampling algorithm that succeeds with overwhelming probability by repeating $\mathcal G$ or $\mathcal S$ polynomially many times. We are going to use this in the definition of trapdoor permutation families.
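To make the $(\mathcal G,\mathcal S,\mathcal E)$ interface concrete, here is a hedged sketch instantiated with the classic discrete-logarithm candidate $f_{p,g}(x)=g^x \bmod p$; the parameters below are toy placeholders, not a proper prime/generator search.

```python
import random

# A sketch of the (G, S, E) interface for the discrete-log candidate.

def G(n: int):
    """Generation: output an index i = (p, g)."""
    return (2003, 5)        # toy placeholder; real G samples n-bit parameters

def S(i):
    """Sampling: output a random x from the domain D_i = {1, ..., p-1}."""
    p, _ = i
    return random.randrange(1, p)

def E(i, x):
    """Evaluation: f_i(x) = g^x mod p."""
    p, g = i
    return pow(g, x, p)
```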

Trapdoor One-Way Permutations

With various assumptions made, trapdoor one-way function families can be constructed. A trapdoor one-way function is a one-way function that becomes easy to invert when some extra information is given. This extra information is called the trapdoor. A trapdoor one-way function is especially useful if it is injective, since it allows us to construct public-key encryption schemes and many other cryptographic primitives that can’t be constructed from one-way functions alone.

For OWP families (one-way permutation families) with trapdoors, we are going to use the definition most convenient for cryptography.

Trapdoor Permutation Family

Let $I\subseteq \lbrace 0,1\rbrace ^\ast$ and define $I_n=I\cap\lbrace 0,1\rbrace ^n$. A trapdoor permutation family with indices in $I$ is a set of functions $\lbrace f_i:D_i\to D_i\mid i\in I \rbrace$ such that each $f_i$ is a permutation on the corresponding $D_i$. Such a collection is called a trapdoor permutation family if there exist four PPT algorithms $\mathcal G, \mathcal S, \mathcal E, \mathcal E^{-1}$ such that the following five conditions hold:

  1. Index and trapdoor generation $\mathcal G$: $$ \Pr[\mathcal G(1^n)\in I_n\times \lbrace 0,1\rbrace ^\ast]>1-2^{-n} $$

  2. Domain sampling $\mathcal S$: for every $n\in\mathbb N$ and $i\in I_n$, $$ \Pr[\mathcal S(i)\in D_i]>1-2^{-n} $$ and, conditioned on $\mathcal S(i)\in D_i$, the output is distributed uniformly at random on $D_i$, that is, $$ \Pr[\mathcal S(i)=x\mid\mathcal{S}(i)\in D_i]=\frac1{|D_i|} $$

  3. Efficient evaluation $\mathcal E$: for every $n\in\mathbb N$, $i\in I_n$, and $x\in D_i$, $$ \Pr[\mathcal E(i,x)=f_i(x)]>1-2^{-n} $$

  4. Hard to invert (without the trapdoor): for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$, $$ \Pr\limits_{(i,t)\leftarrow\mathcal G(1^n);\ x\leftarrow\mathcal S(i)}[\mathcal A(i,f_i(x))\in f_i^{-1}(f_i(x))]<\frac1{p(n)} $$

  5. Efficient inversion with the trapdoor $\mathcal E^{-1}$: for every $n\in\mathbb N$, every $(i,t)$ in the range of $\mathcal G(1^n)$, and every $x\in D_i$, $$ \Pr[\mathcal E^{-1}(t,f_i(x))=x]>1-2^{-n} $$
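To make the four-algorithm interface concrete, here is a hedged sketch using RSA, the classic candidate trapdoor permutation; the primes are toy textbook values, while a real $\mathcal G$ samples random $n$-bit primes.

```python
import random

# A sketch of (G, S, E, E^{-1}) for the RSA candidate f_{N,e}(x) = x^e mod N.

def G(n: int):
    """Index and trapdoor generation: index i = (N, e), trapdoor t = d."""
    p, q = 61, 53                      # toy primes
    N, phi = p * q, (p - 1) * (q - 1)
    e = 17                             # public exponent, gcd(e, phi) = 1
    d = pow(e, -1, phi)                # trapdoor: d = e^{-1} mod phi
    return (N, e), d

def S(i):
    """Sample uniformly from the domain D_i = Z_N."""
    N, _ = i
    return random.randrange(N)

def E(i, x):
    """Evaluate f_i(x) = x^e mod N."""
    N, e = i
    return pow(x, e, N)

def E_inv(t, i, y):
    """Invert with the trapdoor: y^d mod N."""
    N, _ = i
    return pow(y, t, N)
```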

Hard-Core Predicates

As we know, inverting a trapdoor one-way function without the trapdoor is hard. However, given a one-way function $f$ and a value $f(x)$, it may still be possible to extract some information $h(x)$ for an efficiently computable $h$; one-wayness doesn’t rule this out directly. To construct encryption schemes, we need this stronger information-hiding property to be implied by the existence of one-way functions in some sense.

Continuing our example: if $b:\lbrace 0,1\rbrace^{|x|}\to \lbrace 0,1\rbrace$ is hard to predict given $f(x)$, then we can hide one bit $m$ as $b(x)\oplus m$. Moreover, $b$ has to be efficiently computable, otherwise the encryption can’t be done efficiently. Due to the unpredictability of $b(x)$, the bit $m$ is hidden well. If all of the above holds for $b$, we call it a hard-core bit.

Hard-Core Predicate

A PPT-computable predicate $b:\lbrace 0,1\rbrace ^\ast\to\lbrace 0,1\rbrace $ is called a hard-core of a function $f$ if for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$, it holds that $$ \Pr[\mathcal A(f(U_n))=b(U_n)]<\frac{1}{2}+\frac 1 {p(n)} $$

Don’t be confused by the word “predicate”; it just means a function with range $\lbrace 0,1\rbrace $. It is obvious that a function $f:\lbrace 0,1\rbrace^n\to \lbrace 0,1\rbrace^n$ can have a hard-core bit only if it is one-way.

We can also define hard-core for a one-way function family.

Hard-Core Predicate for One-Way Function Family

For a one-way function family $(\mathcal G,\mathcal S ,\mathcal E)$, a $poly$-time algorithm $\mathcal B:\lbrace 0,1\rbrace ^\ast\times \lbrace 0,1\rbrace ^\ast\to \lbrace 0,1\rbrace $ is called a hard-core of the family if for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$’s, $$ \Pr\limits_{i\leftarrow \mathcal G(1^n); x\leftarrow \mathcal S(i)}[\mathcal A(i,f_i(x))=\mathcal B(i,x)]<\frac 12+\frac 1{p(n)} $$

Hard-Core for Any One-Way Function

In this section, we are going to prove that “where there is a one-way function, there is a hard-core bit”. The construction of the hard-core is based on a very simple idea: generalizing the XOR lemma from statistical indistinguishability to computational indistinguishability.

Goldreich-Levin Theorem

Let $f$ be an arbitrary strong one-way function, and let $g$ be defined as $g(x,r)=(f(x),r)$, where $|x|=|r|$. Then the predicate $b(x,r)=\bigoplus_{i=1}^{|x|}x_i r_i$ (the inner product of $x$ and $r$ mod $2$) is a hard-core of the function $g$.

The proof of this theorem is quite long, but its basic idea is simple: voting. Assuming there is some algorithm $\mathcal B$ that predicts $b$ with probability non-negligibly better than $1/2$, we are going to construct an algorithm $\mathcal A$ that invokes $\mathcal B$ polynomially many times and recovers $x\in f^{-1}(y)$ bit by bit. We first prove a weaker theorem in order to build up intuition.

Weaker Version: Constant Success Probability Greater Than $0.75$

We are going to show how to recover one bit by invoking a hard-core-predicting algorithm $\mathcal B$ that satisfies $$ \Pr[\mathcal B(f(x),r)=b(x,r)]>0.75+\frac{1}{2p(n)} $$ where $p(n)$ is a polynomial.

Let $e_i$ be the $n$-dimensional binary vector with $1$ in the $i$th component and $0$ elsewhere. Since $r$ is uniformly random, $r\oplus e_i$ is uniformly random. We can use $\mathcal B(f(x),r \oplus e_i)$ to predict $b(x,r\oplus e_i)$, and get the right result with probability $> 0.75+\frac 1{2p(n)}$.

If $\mathcal B$ successfully predicts both $b(x,r)$ and $b(x,r\oplus e_i)$, which by a union bound happens with probability greater than $1-2(0.25-\frac1{2p(n)})=\frac12+\frac1{p(n)}$, then $$ \begin{align} \mathcal B(f(x),r)\oplus \mathcal B(f(x),r\oplus e_i) &= b(x,r)\oplus b(x,r\oplus e_i) \newline &= b(x,e_i) \newline & = x_i \end{align} $$ So if we construct the algorithm $\mathcal A(f(x),r)=\mathcal B(f(x),r)\oplus \mathcal B(f(x),r\oplus e_i)$, it holds that $$ \Pr[\mathcal A(f(x),r)=x_i]>\frac 1 2 +\frac 1 {p(n)} $$ By the Chernoff bound, we can predict $x_i$ with overwhelming probability by repeating polynomially many times and taking the majority.
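A sketch of this bit recovery; `B(y, r)` stands for the assumed predictor, and the fixed number of trials stands in for the polynomial repetition behind the Chernoff bound.

```python
import random
from collections import Counter

def ip2(x: str, r: str) -> int:
    """The hard-core predicate b(x, r): inner product mod 2 of bit strings."""
    return sum(int(a) & int(c) for a, c in zip(x, r)) % 2

def recover_bit(y, i: int, n: int, B, trials: int = 1000) -> int:
    """Estimate x_i as the majority of B(y, r) XOR B(y, r XOR e_i)."""
    votes = Counter()
    for _ in range(trials):
        r = "".join(random.choice("01") for _ in range(n))
        r_flip = r[:i] + str(1 - int(r[i])) + r[i + 1:]  # r XOR e_i
        votes[B(y, r) ^ B(y, r_flip)] += 1
    return votes.most_common(1)[0][0]
```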

Proof of Goldreich-Levin Theorem

For a $\mathcal B$ with success probability less than $0.75$, the previous method cannot work: invoking $\mathcal B$ twice doubles its error, wiping out any advantage over $1/2$.

I think Goldreich has presented the intuition nicely, so I’m going to quote his book here, with slight modifications to the symbols.

What is required is an alternative way of using the algorithm $\mathcal B$, a way that does not double the original error probability of $\mathcal B$. The key idea is to generate the $r$’s in a way that requires applying algorithm $\mathcal B$ only once per each $r$ (and $i$), instead of twice. Specifically, we shall use algorithm $\mathcal B$ to obtain a “guess” for $b(x,r\oplus e_i)$ and obtain $b(x,r)$ in a different way. The good news is that the error probability is no longer doubled, since we use $\mathcal B$ only to get a “guess” of $b(x,r\oplus e_i)$. The bad news is that we still need to know $b(x,r)$. This would be fine if we needed $b(x,r)$ for only one $r$ (or logarithmically in $|x|$ many $r$’s), but the problem is that we need to know (and hence guess) the values of $b(x,r)$ for polynomially many $r$’s. An obvious way of guessing these $b(x,r)$’s yields an exponentially vanishing success probability. Instead, we generate these polynomially many $r$’s such that, on one hand, they are “sufficiently random,” whereas, on the other hand, we can guess all the $b(x,r)$’s with noticeable success probability. Specifically, generating the $r$’s in a particular pairwise-independent manner will satisfy both (seemingly contradictory) requirements.

Back to the proof, with some explanations.

Suppose there is an algorithm $\mathcal B$ such that $$ \Pr\limits_{x\leftarrow U_n;\,r\leftarrow U_n}[\mathcal B(f(x),r)=b(x,r)]>\frac12+\frac1{p(n)} $$ where $p(\cdot)$ is a polynomial and $|x|=n$.

We define a “good” subset $S_n$ of $\lbrace 0,1\rbrace ^n$: $$ S_n=\lbrace x:\Pr\limits_{r\leftarrow U_n}[\mathcal B(f(x),r)=b(x,r)]>\frac 12+\frac1{2p(n)}\rbrace $$ We have proved that $|S_n|>\frac1{2p(n)}\cdot 2^n$ in the article Averaging Argument.

We set $l\doteq \lceil \log_2(2n\cdot p(n)^2+1)\rceil$. We construct an algorithm $\mathcal A$ that, on input $y=f(x)$, proceeds as follows:

  1. Sample $s^1,\cdots, s^l\in\lbrace 0,1\rbrace ^n$ and $\sigma^1,\cdots, \sigma^l\in\lbrace 0,1\rbrace $ uniformly at random.

  2. For every non-empty set $J\subseteq [l]$, compute $r^J=\bigoplus_{j\in J} s^j$ and the bit $\rho^J=\bigoplus_{j\in J}\sigma^j$.

  3. For every $i\in [n]$ and every non-empty $J \subseteq [l]$, compute $$ z^J_i\leftarrow \rho^J\oplus \mathcal B(y,r^J\oplus e_i) $$

  4. For every $i\in[n]$, set $z_i$ to be the majority of the $z^J_i$ values.

  5. Output $z=z_1\cdots z_n$.
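A runnable sketch of one attempt of steps 1-5; `B(y, r)` stands for the assumed predictor, bit strings are tuples of 0/1, and the caller checks $f(z)=y$ and repeats polynomially many times.

```python
import random
from itertools import combinations

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def gl_attempt(y, n: int, l: int, B):
    # Step 1: random seeds s^1..s^l and blind guesses sigma^1..sigma^l.
    s = [tuple(random.randrange(2) for _ in range(n)) for _ in range(l)]
    sigma = [random.randrange(2) for _ in range(l)]
    # Step 2: pairwise-independent r^J's and their guessed bits rho^J.
    subsets = [J for k in range(1, l + 1) for J in combinations(range(l), k)]
    r, rho = {}, {}
    for J in subsets:
        rJ, rhoJ = (0,) * n, 0
        for j in J:
            rJ, rhoJ = xor(rJ, s[j]), rhoJ ^ sigma[j]
        r[J], rho[J] = rJ, rhoJ
    # Steps 3-4: majority vote for each bit x_i.
    z = []
    for i in range(n):
        e_i = tuple(1 if j == i else 0 for j in range(n))
        ones = sum(rho[J] ^ B(y, xor(r[J], e_i)) for J in subsets)
        z.append(1 if 2 * ones > len(subsets) else 0)
    return tuple(z)  # Step 5: candidate preimage; caller checks f(z) == y
```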

We have some comments. $\sigma^j$ is regarded as a guess of $b(x, s^j)$. It might sound weird, since $\sigma^j$ is sampled uniformly at random; it is a blind guess made by tossing a coin. Thanks to the fact that we only guess $l=O(\log n)$ bits, the probability of guessing all of them correctly is bounded by $$ \Pr[\forall j\in[l]:\sigma^j=b(x,s^j)]= 2^{-l}>\frac 1 {4n\cdot p(n)^2} $$ which is an inverse polynomial. However, logarithmically many $r$’s are far from enough; we need polynomially many, so we generate more by combining multiple $s^j$’s as in step 2, and when all the $\sigma^j$ guesses are correct we have $\rho^J=b(x,r^J)$ for every $J$. Recall that we predicted $x_i$ by $\mathcal B(f(x),r^J)\oplus \mathcal B(f(x),r^J\oplus e_i)$, but this doubles the error probability of $\mathcal B$, so that method cannot be used when the error of $\mathcal B$ is larger than $1/4$. To overcome this, we instead use $\rho^J$ as a guess of $b(x,r^J)$. For pairwise-independent $r^J$’s, Chebyshev’s inequality will be helpful. We review Chebyshev’s inequality as follows:

Theorem

If $X$ is a random variable with expectation $\mu$, then $$ \operatorname{Pr}[|X-\mu| \geq \varepsilon] \leq \frac{\operatorname{Var}[X]}{\varepsilon^{2}} $$

We define a new random variable $\zeta^J_i$: $\zeta^J_i = 1$ if and only if $r^J$ guesses $x_i$ correctly, which means $b(x,r^J)\oplus \mathcal B(f(x),r^J\oplus e_i)=x_i$. For $J\neq K$, there exists some $j$ belonging to exactly one of $J$ and $K$, so $r^J\oplus r^K$ contains a fresh uniform $s^j$; hence $r^J$ and $r^K$ are independent, and thus $\zeta^J_i$ and $\zeta^K_i$ are independent. Therefore, over all non-empty $J\subseteq [l]$, the $\zeta^J_i$ are pairwise independent.

We omit the subscript $i$ of $\zeta_i^J$ for simplicity. Since $b(x,r^J)\oplus b(x,r^J\oplus e_i)=x_i$, it follows that $\zeta^J =1$ if and only if $\mathcal B(f(x),r^J\oplus e_i)=b(x,r^J\oplus e_i)$. Let $m=2^l-1\geq 2n\cdot p(n)^2$ be the number of non-empty subsets $J$. For $x\in S_n$, we have $$ \begin{align} \Pr\left[\sum_{J} \zeta^J\leq \frac m 2\right] &\leq \Pr\left[\left|\sum_{J}\zeta^J-\left(\frac12+\frac1{2p(n)}\right)\cdot m\right|\geq \frac{m}{2p(n)}\right] \newline &\leq \frac{m\cdot \text{Var}[\zeta]}{\left(\frac{m}{2p(n)}\right)^2} \newline &\leq \frac{4p(n)^2\cdot\frac14}{2n\cdot p(n)^2} \newline &= \frac 1{2n} \end{align} $$ Notice that we have used $\operatorname{Var}[\zeta]\leq\frac14$ and $$ \begin{align} \operatorname{Var}\left[\sum_{J}\zeta^J\right] &=\operatorname{E}\left[\left(\sum_{J}\zeta^J-\left(\frac12+\frac1{2p(n)}\right)\cdot m\right)^2\right] \newline &= \sum_{J,K}\operatorname{E}\left[(\zeta^J-u)(\zeta^K-u)\right] \newline &= \sum_{J}\operatorname{E}\left[(\zeta^J-u)^2\right] \newline &= m\cdot \operatorname{Var}[\zeta] \end{align} $$ for $u=\frac12+\frac1{2p(n)}$, where the second-to-last equality relies on the pairwise independence of $\zeta^J$ and $\zeta^K$ for $J\neq K$.

Thus, if we set $z_i$ to be the majority of the $z^J_i$ values, the probability that $z_i$ is wrong is at most $1/2n$.

And now it’s time to evaluate the probability that $z_1\cdots z_n=x_1\cdots x_n$. Conditioned on $x\in S_n$, $$ \begin{align} \Pr[z_1\cdots z_n=x_1\cdots x_n\mid x\in S_n]&\geq \Pr[\forall j\in[l]:\sigma^j=b(x,s^j)]\cdot\left(1- \sum^n_{i=1}\Pr[z_i\neq x_i]\right) \newline & \geq\frac 1{4n\cdot p(n)^2}\cdot\left(1-n\cdot\frac1{2n}\right)=\frac 1{8n\cdot p(n)^2} \end{align} $$ ​ Then $\Pr[z_1\cdots z_n=x_1\cdots x_n] \geq \Pr[z_1\cdots z_n=x_1\cdots x_n\mid x\in S_n]\cdot \Pr[x\in S_n]\geq \frac1{16n\cdot p(n)^3}$, which is an inverse polynomial, and the success probability can be made overwhelming by repeating polynomially many times (Chernoff bound).
