Parallelizing SGD Symbolically

For simplicity, we write $w$ in place of $\Delta w$ here, and we use $r$ rather
than $k$ for the size of the projected space, since $k$ is used heavily as a
summation index here. We want to estimate $v = M w$ with $\frac{1}{r} M A A^T w$,
where $A$ is an $f \times r$ matrix whose entries $a_{ij}$ are random variables
with the following properties.