Averaging Argument

The following example can be regarded as a generalization of Pigeonhole Principle.

Example 1

If every person likes at least $1/3$ of the books in a library, then the library has a book, which at least $1/3$ of people like.

Proof: Suppose there are $N$ people in total and $B$ books in the library. Let all the people leave a mark on the books they like, then there are at least $NB/3$ marks. Presume there are no such book that has at least $M/3$ books on it, then the total number of marks would be less than $NB/3$, contradicted.

When dealing with problems solved by the pigeonhole principle, we know the total number of pigeons and pigeonholes. For problems solved by averaging argument, we usually don’t see the number of pigeonholes or the number of pigeons or both of them. In this case, we “imagine” the number of pigeons and pigeonholes and keep them in mind.

When talking about probability, the case is always the latter, because we don’t count the exact number of pigeons and their holes.

Let’s introduce two formal discriptions of averaging argument.

Averaging Argument

Let $X$ and $Y$ be sets, and $p$ be a prediction on $X\times Y$. Let $\alpha$ be a real number in the interval $[0,1]$. If for each $x$ in $X$, there exists at least $\alpha|Y|$ in $Y$ that satisfy $p(x,y)$. Then, there exists a $y$ in $Y$ such that there exist at least $\alpha|X|$ elements $x$ in $X$ that satisfy $p(x,y)$.

Proof: We can construct a proof of the same style as the proof of Example 1.

Notice that averaging argument can be generalized to continuous sets $X$ and $Y$ as well as probabilistic situations in Averaging Argument.

Averaging Argument

Let $f$ be a function. If we have a circuit $C$ such that when $x$ is chosen uniformly at random and $y$ is chosen independently from some distribution $\mathcal Y$ over $\lbrace 0,1\rbrace^m$, we have $C(x,y)=f(x)$ holds with probability at least $\rho$. Then there exists a single string $y_0\in\lbrace0,1\rbrace^m$ such that $\Pr[C(x,y_0)]>\rho$.

Proof: Keeping the unseen pigeonhole number in mind, the proof might get simple. But we can also prove it in probabilistic style.

If there’s no such $y_0\in\lbrace0,1\rbrace^m$ s.t. $\Pr[C(x,y_0)]>\rho$, then lets count the probability of $C(x,y)=f(x)$ s.t. $x$ is chosen uniformly at random and $y$ is chosen independently from some distribution $\mathcal Y$. $$ \begin{align} \Pr_{x\gets U,y\gets \mathcal Y}[C(x,y)=f(x)] &= \sum_{y\in\lbrace0,1\rbrace^m} \Pr_{x\gets U}[C(x,y)=f(x)]\cdot \Pr[\mathcal y\gets \mathcal Y] \
&< \sum_{y\in\lbrace0,1\rbrace^m}\rho\Pr[\mathcal y\gets \mathcal Y]=\rho \end{align} $$ Proof done.

Averaging Argument is a generalization of Averaging Argument, since it makes no limitation of $y$’s distribution, other than being independent to $x$.

Averaging argument are used in the proofs of some interesting results of complexity. A simple example is the proof of $\mathbf{BPP}\in\mathbf P_{/\operatorname{poly}}$.

Theorem

$\mathbf{BPP}\in\mathbf P_{/\operatorname{poly}}$.

Proof: For any $L\in\mathbf{BPP}$, there exists a polynomial time turing machine $\mathsf {TM}$ s.t. $$ \Pr_{r\gets\lbrace0,1\rbrace^{p(n)}}[\mathsf{TM}(x,r)\neq L(x)]\leq 1/2^{n+1} $$ holds for every $x\in\lbrace0,1\rbrace^n$. We find out a $r_0$ s.t. $\mathsf{TM}(x,r_0)=L(x)$ for all $x\in\lbrace0,1\rbrace^n$. We interprete the above fact as follows:

For each $x\in\lbrace0,1\rbrace^n$, there exists at least a $1-1/2^{n+1}$ fraction of $r$’s that makes $\mathsf{TM}(x,r)=L(x)$ holds. Then by Averaging Argument, there exists a $r_0$ s.t. at least $1-1/2^{n+1}$ fraction of $x\in\lbrace0,1\rbrace^n$ satisfy $\mathsf{TM}(x,r_0)=L(x)$. Take this $r_0$ as advice, we get $L\in \mathbf P_{/\operatorname{poly}}$.

The basic idea of Averaging Argument can be used elsewhere.

The next exam is quite important, which will be used in the proof of Goldreich-Levin Theorem.

Example 2

Let $f$ be a function $f:\lbrace0,1\rbrace^n\to \lbrace0,1\rbrace^{l(n)}$ and $f^\prime :\lbrace0,1\rbrace^n \times\lbrace0,1\rbrace^n\to \lbrace0,1\rbrace^{l(n)}\times \lbrace0,1\rbrace^n$ defines as $$ f^\prime(x,r)=(f(x),r) $$ where $|x|=|r|$. Let $\mathsf{gl}(x,r)=\bigoplus_{i=1}^n x_i\cdot r_i$.

Suppose there exists a $\mathsf{PPT}$ algorithm $\mathcal A$ with running time $t$ and satisfy $$ \Pr\limits_{ x\leftarrow{0,1}^n ;; r\leftarrow{0,1}^n } [\mathcal A(f(x),r)=\mathsf{gl}(x,r)]\geq\frac 1 2+\varepsilon(n) $$ Then there exists a set $\mathcal S_n\subseteq \lbrace0,1\rbrace^n$ of size at least $\frac{\varepsilon(n)}2\cdot 2^n$ such that for every $x\in \mathcal S_n$ it holds that $$ \Pr\limits_{; r \leftarrow {0,1}^n}[\mathcal A(f(x),r)=\mathsf{gl}(x,r)]\geq \frac 1 2+\frac{\varepsilon (n)} 2 $$

Proof: Still, we keep in mind the unseen number of pigeonholes. We have “1” pigeonholes. “1” means if we choose a pigeonhole uniformly at random from all the pigeonholes then we have probability “1” that we chosen from the set of all pigeonholes… This is not verbose, since there are cases that we choose a pigeonhole from a subset with less than “1” pigeonhole.

We assume we have probability $p$ to choose $x$ from $\mathcal S_n$ and then we have probability $(1-p)$ to choose $x$ from $\overline{\mathcal S_n}=\lbrace0,1\rbrace^n-\mathcal S_n$. We want to reckon the lower bound of $\mathcal S_n$, so we make event $\Pr[\mathcal A(f(x),r)=\mathsf{gl}(x,r)]$ to happen as frequent as posible when $x\gets \mathcal S_n$. As for cases in $\overline{\mathcal S_n}$, by the assumption made it has to happen with probability as least $\frac 1 2+\frac{\varepsilon (n)} 2$, then we have $$ \frac 1 2+\varepsilon(n) \leq \Pr\limits_{x\leftarrow{0,1}^n ;; r\leftarrow{0,1}^n}[\mathcal A(f(x),r)=\mathsf{gl}(x,r)] \leq p\times 1+(1-p)\times (\frac 1 2+\frac{\varepsilon (n)} 2)\leq p+\frac 1 2+\frac{\varepsilon(n)}{2} $$ The las t inequality is obtained by drop $-p$ in $1-p$. By solving the inequality $$ p\geq \frac{\varepsilon (n)}2 $$ Then we have $$ |\mathcal S_n|\geq \frac{\varepsilon(n)}2\cdot 2^n $$