## Lecture #22: Random Sampling for Fast SVDs

Some clarifications from Monday’s lecture (and HW #5):

• First, Akshay had a question about the Matrix Chernoff bound I mentioned in lecture today: it seemed to have a weird scaling. The bound I stated was that for ${\beta > 0}$,

$\displaystyle \Pr[ \lambda_1(A_n - U) \geq \beta ] \leq d\cdot\exp\left\{- \frac{ \beta^2 n }{4R} \right\}.$

The worry was that if we scaled each little random variable by some ${K \geq 1}$, we would get

$\displaystyle \Pr[ K \lambda_1(A_n - U) \geq K \beta ] \leq d\cdot\exp\left\{- \frac{ (K\beta)^2 n }{4KR} \right\}.$

So the numerator of the exponent would increase in magnitude from ${\beta^2 n}$ to ${(K\beta)^2 n}$, whereas the denominator would go only from ${4R}$ to ${4KR}$. As ${K}$ gets large, we would get better and better bounds for free. Super-fishy.

I went over it again, and the bound is almost correct: the fix is that the bound holds only for deviations ${\beta \in (0,1)}$. You will prove it in HW#5. (I updated HW #5 to reflect this missing upper bound on ${\beta}$; the upper bound on ${\beta}$ subtly but crucially comes into play in part 5(c).) This upper bound on ${\beta}$ means one can’t get arbitrarily better bounds for free by just scaling up the random variables.

However, you can get a little bit better: if you do try to play this scaling game, I think you can get a slightly better bound of

$\displaystyle d\cdot\exp\left\{ - \frac{\beta n}{4R} \right\}.$

(Follow the argument above but set ${K = 1/\beta}$ so that the deviation you want is about ${1}$, etc.) The resulting bound would improve the sampling-based SVD result from today’s lecture to ${s = O(r \log n/ \varepsilon^2)}$ sampled rows from the weaker bound of ${s = O(r \log n/ \varepsilon^4)}$ rows I claimed.
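To make the sampling-based SVD concrete, here is a minimal numpy sketch of length-squared row sampling: sample ${s}$ rows of ${A}$ with probabilities proportional to their squared norms, rescale so that ${\mathbb{E}[S^\top S] = A^\top A}$, and compare singular values. The matrix dimensions, the noise level, and the choice ${s = 800}$ are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# A low-rank-plus-noise matrix: m x n with approximate rank r (illustrative sizes).
m, n, r = 2000, 100, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n)) \
    + 0.01 * rng.standard_normal((m, n))

# Sample s rows with probability p_i proportional to the squared row norms,
# rescaling the sampled row i by 1/sqrt(s * p_i) so that E[S^T S] = A^T A.
s = 800
row_norms_sq = np.sum(A**2, axis=1)
p = row_norms_sq / row_norms_sq.sum()
idx = rng.choice(m, size=s, p=p)          # i.i.d. sampling with replacement
S = A[idx] / np.sqrt(s * p[idx])[:, None]

# Since S^T S is close to A^T A in spectral norm, the top squared singular
# values of the small s x n matrix S track those of the big matrix A.
sigma_A = np.linalg.svd(A, compute_uv=False)[:r]
sigma_S = np.linalg.svd(S, compute_uv=False)[:r]
rel_err = np.max(np.abs(sigma_A**2 - sigma_S**2)) / sigma_A[0]**2
print(rel_err)
```

The printed relative error is small for this instance; the point of the lecture is that ${s}$ rows suffice for this, independent of the number of rows ${m}$ up to the ${\log}$ factor.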

• Secondly, for any real symmetric matrix ${A}$ assume the eigenvalues are ordered

$\displaystyle \lambda_1(A) \geq \lambda_2(A) \geq \cdots \geq \lambda_n(A).$

The statement I wanted to prove was the following: given ${A, B}$ real symmetric, we have that for all ${i}$,

$\displaystyle | \lambda_i(A) - \lambda_i(B) | \leq \| A - B \|_2.$

Akshay pointed out that it follows from the Weyl inequalities. These say that for all integers ${i,j}$ such that ${i+j-1 \leq n}$,

$\displaystyle \lambda_{i+j-1}(X+Y) \leq \lambda_i(X) + \lambda_j(Y).$

Hence setting ${j = 1}$, and setting ${X+Y = A}$ and ${X = B}$,

$\displaystyle \lambda_{i}(A) - \lambda_i(B) \leq \lambda_1(A-B).$

Similarly setting ${X+Y = B}$ and ${X = A}$, we get

$\displaystyle \lambda_{i}(B) - \lambda_i(A) \leq \lambda_1(B-A).$

Hence,

$\displaystyle | \lambda_{i}(A) - \lambda_i(B)| \leq \max\big(\lambda_1(A-B), \lambda_1(B-A)\big)$

$\displaystyle = \max\big(\lambda_1(A-B), -\lambda_n(A-B)\big) = \| A - B \|_2.$
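This perturbation inequality is easy to sanity-check numerically. A small numpy check (the matrix size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Random real symmetric matrices A and B.
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2

# eigvalsh returns eigenvalues in ascending order; reverse to match
# the convention lambda_1 >= ... >= lambda_n used above.
lam_A = np.linalg.eigvalsh(A)[::-1]
lam_B = np.linalg.eigvalsh(B)[::-1]

# max_i |lambda_i(A) - lambda_i(B)| versus the spectral norm ||A - B||_2
# (for a matrix argument, ord=2 gives the largest singular value).
gap = np.max(np.abs(lam_A - lam_B))
norm = np.linalg.norm(A - B, 2)
print(gap <= norm + 1e-10)
```

Running this with any symmetric pair should print `True`, matching the Weyl-inequality argument.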

• John found Rajendra Bhatia's book *Matrix Analysis* quite readable.
• A clarification asked after class: all matrices in this class are indeed matrices over the reals.
• There are minor fixes to the HW, in particular to problem #5. Please look at the latest file online; the changes are marked in red.