MAS473 summary
HDS
Concentration near the equator
For $c\ge 1$ and $d\ge 3$, at least a $1-\frac{2}{c}e^{-\frac{1}{2}c^2}$ fraction of the volume of the $d$-dimensional unit ball has $|x_1|\le \frac{c}{\sqrt{d-1}}$, where $\mathbf{x}=(x_1,\cdots,x_d)^T\in B_1(\mathbf{0})$.
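A minimal Monte Carlo sanity check (not from the notes), sampling uniformly from the unit ball with NumPy via the Gaussian-normalisation trick; the dimension $d$, the sample size and $c$ below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, c = 100, 100_000, 2.0  # arbitrary: dimension, sample size, slab parameter

# Sample uniformly from the unit ball: normalise Gaussians, then rescale by U^(1/d).
g = rng.standard_normal((n, d))
points = g / np.linalg.norm(g, axis=1, keepdims=True)
points *= rng.uniform(size=(n, 1)) ** (1 / d)

frac = np.mean(np.abs(points[:, 0]) <= c / np.sqrt(d - 1))
bound = 1 - (2 / c) * np.exp(-c**2 / 2)
print(f"empirical fraction: {frac:.4f}  >=  guaranteed bound: {bound:.4f}")
```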
Near orthogonality
Consider drawing $n$ points $\mathbf{x}_1,\cdots,\mathbf{x}_n$ in $\mathbb{R}^d$ at random from the unit ball $B_1(\mathbf{0})$. With probability $1-O(\frac{1}{n})$,
1. $|\mathbf{x}_i|\ge 1-\frac{2\ln n}{d}$ for all $i$;
2. $|\mathbf{x}_i\cdot \mathbf{x}_j|\le \frac{\sqrt{6\ln n}}{\sqrt{d-1}}$ for all $i\ne j$.
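A similar sketch checking both bounds empirically; $d$ and $n$ are arbitrary, and since the guarantee only holds with probability $1-O(\frac{1}{n})$, occasional violations are expected.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 2000, 100  # arbitrary: dimension much larger than ln(n)

# Draw n points uniformly at random from the unit ball in R^d.
g = rng.standard_normal((n, d))
x = g / np.linalg.norm(g, axis=1, keepdims=True)
x *= rng.uniform(size=(n, 1)) ** (1 / d)

norms = np.linalg.norm(x, axis=1)
dots = np.abs(x @ x.T)[~np.eye(n, dtype=bool)]  # off-diagonal inner products

print("min |x_i|       :", norms.min(), ">=", 1 - 2 * np.log(n) / d)
print("max |x_i . x_j| :", dots.max(), "<=", np.sqrt(6 * np.log(n) / (d - 1)))
```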
Gaussian Integral / Distribution
$$\int^\infty_{-\infty}e^{-a(x+b)^2}dx=\sqrt{\frac{\pi}{a}}$$
$$\int_{\mathbb{R}^n}\exp\left( -\frac{1}{2}x^TAx+b^Tx+c\right) d^nx=\sqrt{\det(2\pi A^{-1})}e^{\frac{1}{2}b^TA^{-1}b+c}$$
$$f(\mathbf{x})=\frac{1}{(2\pi)^{d/2}\sigma^d}\exp \left(-\frac{|\mathbf{x}|^2}{2\sigma^2}\right)$$
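A numerical check of the second formula above (the multivariate Gaussian integral) in the $n=2$ case, assuming SciPy is available; the positive definite $A$, the vector $b$ and the constant $c$ are arbitrary choices.

```python
import numpy as np
from scipy.integrate import dblquad  # assumes SciPy is available

# Arbitrarily chosen positive definite A, vector b and constant c.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b = np.array([0.3, -0.2])
c = 0.1
A_inv = np.linalg.inv(A)

def integrand(y, x):  # dblquad integrates over y first, then x
    v = np.array([x, y])
    return np.exp(-0.5 * v @ A @ v + b @ v + c)

numeric, _ = dblquad(integrand, -10, 10, -10, 10)
closed = np.sqrt(np.linalg.det(2 * np.pi * A_inv)) * np.exp(0.5 * b @ A_inv @ b + c)
print(numeric, closed)  # the two values should agree to several decimal places
```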
Concentration of Gaussian Distribution
For a $d$-dimensional spherical Gaussian with unit variance in each direction, for any $\beta\le \sqrt{d}$, at least $1-3e^{-c\beta^2}$ of the probability mass concentrates within the annulus $$\{ \mathbf{x}\in \mathbb{R}^d: \sqrt{d}-\beta\le |\mathbf{x}|\le \sqrt{d}+\beta\} = \{ \mathbf{x}\in\mathbb{R}^d: ||\mathbf{x}|-\sqrt{d}|\le \beta\}$$ where $c$ is a fixed positive constant which is at least $\frac{1}{24}$.
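A quick empirical check of the annulus statement with NumPy, taking $c=\frac{1}{24}$ in the bound; the dimension, the sample size and $\beta$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, beta = 500, 20_000, 8.0  # arbitrary; beta <= sqrt(d) is required

radii = np.linalg.norm(rng.standard_normal((n, d)), axis=1)
mass = np.mean(np.abs(radii - np.sqrt(d)) <= beta)
bound = 1 - 3 * np.exp(-beta**2 / 24)  # using c = 1/24 from the statement
print(f"mass in annulus: {mass:.4f}  >=  bound: {bound:.4f}")
```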
Random Projection Theorem
Let $\mathbf{v}$ be a fixed vector in $\mathbb{R}^d$ and let $f(\mathbf{v})=(\mathbf{u}_1\cdot\mathbf{v},\cdots,\mathbf{u}_m\cdot\mathbf{v})$, where $\mathbf{u}_1,\cdots,\mathbf{u}_m$ are independent $d$-dimensional spherical Gaussian vectors with unit variance in each coordinate. There exists a constant $c>0$ such that for $\epsilon\in(0,1)$, $$\mathbb{P}\left( \left|\left| \frac{1}{\sqrt{m}}f(\mathbf{v})\right| - |\mathbf{v}|\right| \ge \epsilon |\mathbf{v}|\right)\le 3e^{-cm\epsilon^2},$$ where the probability is taken over the random draws of the vectors $\mathbf{u}_i$ used to construct $f$.
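A Monte Carlo sketch of the failure probability, with arbitrary choices of $d$, $m$, $\epsilon$ and the number of trials.

```python
import numpy as np

rng = np.random.default_rng(3)
d, m, eps, trials = 300, 100, 0.2, 2000  # arbitrary parameters

v = rng.standard_normal(d)  # the fixed vector
failures = 0
for _ in range(trials):
    U = rng.standard_normal((m, d))  # rows u_i define f(v) = (u_1.v, ..., u_m.v)
    err = abs(np.linalg.norm(U @ v) / np.sqrt(m) - np.linalg.norm(v))
    failures += err >= eps * np.linalg.norm(v)

# The theorem says this is at most 3 e^{-c m eps^2}, so it should be small.
print("empirical failure probability:", failures / trials)
```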
CI
Markov Inequality
Let $X$ be a non-negative random variable. Then for any $a>0$, $$\mathbb{P}(X\ge a)\le \frac{\mathbb{E}[X]}{a}.$$
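A one-line empirical check using an exponential random variable (an arbitrary non-negative example with $\mathbb{E}[X]=1$).

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(scale=1.0, size=1_000_000)  # non-negative, E[X] = 1
for a in (1.0, 2.0, 5.0):
    print(a, np.mean(x >= a), "<=", x.mean() / a)  # empirical tail vs Markov bound
```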
Chebyshev's Inequality
Let $X$ be a random variable. Then for any $a>0$, $$\mathbb{P}(|X-\mathbb{E}[X]|\ge a)=\mathbb{P}((X-\mathbb{E}[X])^2\ge a^2)\le \frac{\mathbb{V}ar(X)}{a^2}.$$
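The same kind of check for Chebyshev's inequality, again with an exponential variable (mean and variance both equal to 1).

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=1_000_000)  # mean 1, variance 1
for a in (1.0, 2.0, 4.0):
    print(a, np.mean(np.abs(x - x.mean()) >= a), "<=", x.var() / a**2)
```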
Master Tail Bound Theorem
Let $S_n=X_1+\cdots+X_n$ where $X_1,\cdots,X_n$ are mutually independent random variables (possibly not identically distributed) with zero mean and variance at most $\sigma^2$. Let $0<a\le \sqrt{2}n\sigma^2$. Assume that $|\mathbb{E}[X^r_i]|\le r! \sigma^2$ for $r=3,4,\cdots,\lfloor\frac{a^2}{4n\sigma^2}\rfloor$. Then $$\mathbb{P}(|S_n|\ge a)\le 3e^{-\frac{a^2}{12n\sigma^2}}.$$
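A sketch of the bound for Rademacher summands (an arbitrary example satisfying the moment condition), computing the tail exactly through the binomial distribution; assumes SciPy is available.

```python
import numpy as np
from scipy.stats import binom  # assumes SciPy is available

# Rademacher variables: zero mean, variance sigma^2 = 1, and |E[X^r]| <= 1 <= r! sigma^2,
# so the moment condition holds for every r.  S_n = 2B - n with B ~ Binomial(n, 1/2),
# so the tail probability can be computed exactly rather than sampled.
n, sigma2 = 400, 1.0
for a in (40, 80, 120):  # even values with a <= sqrt(2) * n * sigma^2
    exact = 2 * binom.sf((n + a) // 2 - 1, n, 0.5)  # P(|S_n| >= a) by symmetry
    print(a, exact, "<=", 3 * np.exp(-a**2 / (12 * n * sigma2)))
```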
Chernoff's Inequality
The cumulant generating function is $$\psi_X(\theta):=\ln \mathbb{E}[e^{\theta X}].$$
Setting $$\psi^*(t)=\sup_{\theta>0}\{ t\theta - \psi(\theta)\},$$ the Chernoff bound is $$\mathbb{P}(X\ge t)\le e^{-\psi^*(t)}.$$
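A worked numerical example (not from the notes) for $X\sim\text{Exp}(1)$, where $\psi(\theta)=-\ln(1-\theta)$ for $\theta<1$; the supremum is taken on a grid.

```python
import numpy as np

# X ~ Exp(1): psi(theta) = -ln(1 - theta) for theta < 1, true tail P(X >= t) = e^{-t}.
theta = np.linspace(1e-4, 0.999, 10_000)
psi = -np.log(1 - theta)

for t in (2.0, 4.0, 8.0):
    psi_star = np.max(t * theta - psi)  # sup over theta > 0, approximated on the grid
    print(t, "true tail:", np.exp(-t), " Chernoff bound:", np.exp(-psi_star))
```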
Gaussian Random Variables
Let $X\sim N(0,\sigma^2)$. Then $$\psi(\theta)=\ln \mathbb{E}[e^{\theta X}]=\frac{1}{2}\theta^2\sigma^2,$$ which follows from computing the Gaussian integral above.
Its dual is $\psi^*(t)=\sup_\theta\{t\theta-\frac{1}{2}\theta^2\sigma^2\}=\frac{t^2}{2\sigma^2}$, so the Chernoff bound for a Gaussian is $$\mathbb{P}(X\ge t)\le e^{-\frac{t^2}{2\sigma^2}}$$
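A quick check of this bound against the exact Gaussian tail, assuming SciPy; $\sigma$ is an arbitrary choice.

```python
import numpy as np
from scipy.stats import norm  # assumes SciPy is available

sigma = 2.0  # arbitrary
for t in (1.0, 3.0, 6.0):
    # exact tail P(X >= t) vs the Chernoff bound e^{-t^2 / (2 sigma^2)}
    print(t, norm.sf(t, scale=sigma), "<=", np.exp(-t**2 / (2 * sigma**2)))
```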
Sub-Gaussian Random Variables
A random variable $X$ is called sub-Gaussian with parameter $\sigma^2$ if, writing $X'=X-\mathbb{E}[X]$, $$e^{\psi_{X'}(\theta)}=\mathbb{E}[e^{\theta(X-\mathbb{E}[X])}]\le e^{\frac{\theta^2\sigma^2}{2}}\quad \text{ or equivalently }\quad \psi_{X'}(\theta)\le \frac{1}{2}\sigma^2\theta^2\quad \forall\theta \in \mathbb{R}.$$
Its Fenchel-Legendre dual then satisfies $$\psi^*_{X'}(t)=\sup_\theta \{t\theta-\psi_{X'}(\theta)\}\ge \sup_\theta \{t\theta -\frac{1}{2}\sigma^2\theta^2\}=\frac{t^2}{2\sigma^2},$$ so the Chernoff bound is $$\mathbb{P}(X-\mathbb{E}[X]> t)=\mathbb{P}(X'>t)\le e^{-\psi^*_{X'}(t)}\le e^{-\frac{t^2}{2\sigma^2}}$$
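A small sketch with a Rademacher variable, which is sub-Gaussian with $\sigma^2=1$ since $\mathbb{E}[e^{\theta X}]=\cosh\theta\le e^{\theta^2/2}$; the sum size and trial count below are arbitrary.

```python
import numpy as np

# Rademacher variable (+-1 with probability 1/2): mean zero and cosh(theta) <= e^{theta^2/2}.
theta = np.linspace(-5, 5, 1001)
print(np.all(np.cosh(theta) <= np.exp(theta**2 / 2)))  # True

# The normalised sum S_n / sqrt(n) is again sub-Gaussian with sigma^2 = 1,
# so its upper tail is bounded by e^{-t^2 / 2}.
rng = np.random.default_rng(7)
n, trials = 100, 500_000
s = (2 * rng.binomial(n, 0.5, size=trials) - n) / np.sqrt(n)
for t in (1.0, 2.0, 3.0):
    print(t, np.mean(s > t), "<=", np.exp(-t**2 / 2))
```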
Karush-Kuhn-Tucker (KKT) Condition
For the problem $$\max \{ f(\mathbf{x}):g_1(\mathbf{x})=0,\cdots,g_m(\mathbf{x})=0,\ h_1(\mathbf{x})\le 0,\cdots,h_n(\mathbf{x})\le 0\},$$ the KKT conditions are
$$\nabla f(\mathbf{x})=\sum^m_{i=1}\lambda_i\nabla g_i(\mathbf{x})+\sum^n_{j=1}\mu_j\nabla h_j(\mathbf{x})$$
$$\begin{gathered}\lambda_1,\cdots,\lambda_m\in\mathbb{R},\quad \mu_1\ge 0,\cdots, \mu_n\ge 0\\ \mu_1h_1(\mathbf{x})=0,\cdots,\mu_nh_n(\mathbf{x})=0\\ g_1(\mathbf{x})=0,\cdots, g_m(\mathbf{x})=0\\ h_1(\mathbf{x})\le 0,\cdots, h_n(\mathbf{x})\le 0\end{gathered}$$
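A tiny worked example (not from the notes): maximise $f(\mathbf{x})=x_1+x_2$ subject to $x_1^2+x_2^2-1\le 0$, checking the KKT conditions at $\mathbf{x}^*=(1,1)/\sqrt{2}$ with multiplier $\mu=1/\sqrt{2}$ and no equality constraints.

```python
import numpy as np

x = np.array([1.0, 1.0]) / np.sqrt(2)  # candidate maximiser
mu = 1 / np.sqrt(2)                    # multiplier for the single inequality constraint

grad_f = np.array([1.0, 1.0])          # gradient of f(x) = x1 + x2
grad_h = 2 * x                         # gradient of h(x) = x1^2 + x2^2 - 1
h = x @ x - 1

print("stationarity           :", np.allclose(grad_f, mu * grad_h))  # grad f = mu * grad h
print("dual feasibility       :", mu >= 0)
print("complementary slackness:", np.isclose(mu * h, 0))
print("primal feasibility     :", h <= 1e-12)
```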
IE
Cross Entropy and Kullback-Leibler Divergence (Relative Entropy)
$$H(\mathbb{P},\mathbb{Q})=\sum_i p_i\ln\frac{1}{q_i}$$
$$D(\mathbb{P}||\mathbb{Q})=H(\mathbb{P},\mathbb{Q})-H(\mathbb{P})=\sum_i p_i\ln\frac{1}{q_i}-\sum_i p_i\ln\frac{1}{p_i}=\sum_i p_i\ln\frac{p_i}{q_i}$$
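A direct NumPy computation for two arbitrary discrete distributions, checking $D(\mathbb{P}||\mathbb{Q})=H(\mathbb{P},\mathbb{Q})-H(\mathbb{P})\ge 0$.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])  # arbitrary distribution P
q = np.array([0.4, 0.4, 0.2])  # arbitrary distribution Q

H_p  = np.sum(p * np.log(1 / p))   # entropy H(P)
H_pq = np.sum(p * np.log(1 / q))   # cross entropy H(P, Q)
D_pq = np.sum(p * np.log(p / q))   # KL divergence D(P || Q)

print(np.isclose(D_pq, H_pq - H_p))  # D(P||Q) = H(P,Q) - H(P)
print(D_pq >= 0)                     # relative entropy is non-negative
```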
Gaussian case, for $\mathbb{P}=N(\mu_1,\sigma_1^2)$ and $\mathbb{Q}=N(\mu_2,\sigma_2^2)$: $D(\mathbb{P}||\mathbb{Q})=\frac{(\mu_1-\mu_2)^2+\sigma^2_1-\sigma^2_2}{2\sigma^2_2}+\ln\frac{\sigma_2}{\sigma_1}$
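A Monte Carlo check of the Gaussian formula against the definition $D(\mathbb{P}||\mathbb{Q})=\mathbb{E}_\mathbb{P}\left[\ln\frac{d\mathbb{P}}{d\mathbb{Q}}\right]$; the parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(8)
mu1, s1, mu2, s2 = 0.0, 1.0, 1.0, 2.0  # arbitrary parameters for P and Q

closed = ((mu1 - mu2)**2 + s1**2 - s2**2) / (2 * s2**2) + np.log(s2 / s1)

# Monte Carlo estimate of E_P[ln(dP/dQ)]; the sqrt(2 pi) factors cancel in the log ratio.
x = rng.normal(mu1, s1, size=1_000_000)
log_p = -0.5 * ((x - mu1) / s1)**2 - np.log(s1)
log_q = -0.5 * ((x - mu2) / s2)**2 - np.log(s2)
print(closed, np.mean(log_p - log_q))  # the two values should agree closely
```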
Distance Between Probability Measures
$$\frac{1}{2}||\mathbb{P}-\mathbb{Q}||_1=\sup_{A\in \mathcal{F}}(\mathbb{P}(A)-\mathbb{Q}(A))=\int_{\{\frac{d\mathbb{P}}{d\mathbb{Q}}\ge 1\}} \left(\frac{d\mathbb{P}}{d\mathbb{Q}}-1\right)d\mathbb{Q}=\frac{1}{2}\int\left| \frac{d\mathbb{P}}{d\mathbb{Q}}-1\right| d\mathbb{Q}$$
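A sketch verifying that the four expressions coincide for two arbitrary discrete distributions, where $\frac{d\mathbb{P}}{d\mathbb{Q}}$ is simply the ratio of the probability vectors.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])    # arbitrary distribution P
q = np.array([0.25, 0.25, 0.5])  # arbitrary distribution Q

half_l1  = 0.5 * np.sum(np.abs(p - q))
sup_A    = np.sum(np.clip(p - q, 0, None))  # supremum attained at A = {i : p_i >= q_i}
ratio    = np.sum((p - q)[p / q >= 1])      # integral over the set {dP/dQ >= 1}
half_int = 0.5 * np.sum(np.abs(p / q - 1) * q)

print(half_l1, sup_A, ratio, half_int)      # all four values coincide
```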