Lecture #22: Random Sampling for Fast SVDs

Some clarifications from Monday’s lecture (and HW #5):

  • Firstly, Akshay had a question about the matrix Chernoff bound I mentioned in lecture today: it seemed to have a weird scaling. The bound I stated was that for {\beta > 0},

    \displaystyle  \Pr[ \lambda_1(A_n - U) \geq \beta ] \leq d\cdot\exp\left\{- \frac{ \beta^2 n }{4R} \right\}.

    The worry was that if we scaled each little random variable by some {K \geq 1}, we would get

    \displaystyle  \Pr[ K \lambda_1(A_n - U) \geq K \beta ] \leq d\cdot\exp\left\{- \frac{ (K\beta)^2 n }{4KR} \right\}.

    So the numerator of the exponent would grow from {\beta^2 n} to {(K\beta)^2 n}, whereas the denominator would only go from {4R} to {4KR}. As {K} gets large, we would get better and better bounds for free. Super-fishy.

    I went over it again, and the bound is almost correct: the fix is that it holds only for deviations {\beta \in (0,1)}. You will prove it in HW #5. (I updated HW #5 to reflect this missing upper bound on {\beta}; it subtly but crucially comes into play in part 5(c).) This restriction means one can’t get arbitrarily better bounds for free just by scaling up the random variables.

    However, you can do a little better: if you do play this scaling game, I think you can get the slightly stronger bound

    \displaystyle  d\cdot\exp\left\{ - \frac{\beta n}{4R} \right\}.

    (Follow the argument above but set {K = 1/\beta} so that the deviation you want is about {1}; the substitution is spelled out below.) The resulting bound would improve the sampling-based SVD result from today’s lecture from the weaker {s = O(r \log n/ \varepsilon^4)} rows I claimed to {s = O(r \log n/ \varepsilon^2)} sampled rows.
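
    Spelling that substitution out (just the argument above made explicit, assuming the rescaled deviation {K\beta} still lies just inside the allowed range {(0,1)}): scaling every summand by {K = 1/\beta} turns {R} into {R/\beta} and the target deviation into {K\beta \approx 1}, so

    \displaystyle  \Pr[ \lambda_1(A_n - U) \geq \beta ] = \Pr[ K\lambda_1(A_n - U) \geq K\beta ] \leq d\cdot\exp\left\{ - \frac{ (K\beta)^2 n }{4KR} \right\} = d\cdot\exp\left\{ - \frac{ \beta n }{4R} \right\}.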

  • Secondly, for any real symmetric matrix {A}, assume its eigenvalues are ordered

    \displaystyle  \lambda_1(A) \geq \lambda_2(A) \geq \cdots \geq \lambda_n(A).

    The statement I wanted to prove was the following: given {A, B} real symmetric, we have that for all {i},

    \displaystyle  | \lambda_i(A) - \lambda_i(B) | \leq \| A - B \|_2.

    Akshay pointed out that it follows from the Weyl inequalities. These say that for all integers {i,j} such that {i+j-1 \leq n},

    \displaystyle  \lambda_{i+j-1}(X+Y) \leq \lambda_i(X) + \lambda_j(Y).

    Hence, setting {j = 1} with {X = B} and {Y = A - B} (so that {X+Y = A}), we get {\lambda_i(A) \leq \lambda_i(B) + \lambda_1(A-B)}, i.e.,

    \displaystyle  \lambda_{i}(A) - \lambda_i(B) \leq \lambda_1(A-B).

    Similarly setting {X+Y = B} and {X = A}, we get

    \displaystyle  \lambda_{i}(B) - \lambda_i(A) \leq \lambda_1(B-A).

    Hence,

    \displaystyle    | \lambda_{i}(A) - \lambda_i(B)| \leq \max\big(\lambda_1(A-B),   \lambda_1(B-A)\big)

    \displaystyle  = \max\big(\lambda_1(A-B), -\lambda_n(A-B)\big)   = \| A - B \|_2.
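
    As a quick illustration (not from the lecture), here is a small numpy check of this inequality on random symmetric matrices; the matrix size {n = 50} and the random seed are arbitrary choices:

      import numpy as np

      # Sanity check: max_i |lambda_i(A) - lambda_i(B)| <= ||A - B||_2 for random symmetric A, B.
      rng = np.random.default_rng(0)
      n = 50
      X = rng.standard_normal((n, n)); A = (X + X.T) / 2   # random real symmetric A
      Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2   # random real symmetric B

      eig_A = np.linalg.eigvalsh(A)[::-1]    # eigenvalues of A in decreasing order
      eig_B = np.linalg.eigvalsh(B)[::-1]    # eigenvalues of B in decreasing order
      lhs = np.max(np.abs(eig_A - eig_B))    # max_i |lambda_i(A) - lambda_i(B)|
      rhs = np.linalg.norm(A - B, 2)         # spectral norm ||A - B||_2
      print(lhs, rhs, lhs <= rhs)            # the last entry is expected to print True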

  • John found Rajendra Bhatia’s book Matrix Analysis quite readable.
  • A clarification asked after class: all matrices in this course are indeed over the reals.
  • There are minor fixes to the HW, in particular to problem #5. Please look at the latest file online; the changes are marked in red.