Partial Least Squares NIPALS Algorithm Question: How is w chosen to maximize cov(Xw, Y)

Question

Recently I found a nice slideshow that explains PLS and the idea behind it pretty well. I think I understand the majority of the slides, but I'm a bit confused by the first step of the NIPALS algorithm. Here the author of the slides describes the choice of w as the unit vector that maximizes cov(Xw, Y). My question is: how does cov(Xw, Y) = w'X'Y? Or, how does the author reason that we can just maximize w'X'Y? I understand the rest of this first step, but I'm a bit confused by this one line.

Thank you!!

This is called the *multilinear* (or *bilinear*) property of covariance. It follows directly from your favorite definition of covariance. Some definitions, along with a brief discussion of multilinearity, appear at https://stats.stackexchange.com/a/222091/919. Search our site on [cov* bilinear*](https://stats.stackexchange.com/search?q=cov*+bilinear*) for details. — whuber, Nov 02 '21 at 12:53
I do understand that but I guess I'm more unsure how cov(X,Y) (after pulling out the vector w) is equal to X'Y — Steven Turnbull, Nov 02 '21 at 21:59
It's not. However, when $X$ and $Y$ are vector-valued random variables with expected values of zero and you take expectations, you are just looking at the definition of covariance. — whuber, Nov 03 '21 at 14:37
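
The point in the comments can be checked numerically. The sketch below is only an illustration under stated assumptions: a single response vector `y`, synthetic data, and column-centered `X` and `y` (all variable names here are made up for the example and do not come from the slides). With centered data, the sample covariance of the scores `Xw` with `y` is $w'X'y/(n-1)$, so it equals $w'X'y$ only up to a positive constant. That constant does not depend on $w$, which is why maximizing $w'X'y$ over unit vectors gives the same $w$ as maximizing cov(Xw, Y).

```python
import numpy as np

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=n)

# Center the columns of X and center y, so that sample means are zero.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# An arbitrary unit vector w.
w = rng.normal(size=p)
w /= np.linalg.norm(w)

# Sample covariance of the scores Xc @ w with yc (np.cov divides by n - 1).
cov_direct = np.cov(Xc @ w, yc)[0, 1]

# The same quantity written as w' X' y, scaled by 1 / (n - 1).
cov_matrix = w @ Xc.T @ yc / (n - 1)

# By the Cauchy-Schwarz inequality, the unit vector maximizing w' X' y
# is X' y normalized to length one.
v = Xc.T @ yc
w_star = v / np.linalg.norm(v)
cov_star = w_star @ v / (n - 1)

print(np.isclose(cov_direct, cov_matrix))
print(cov_star >= abs(cov_matrix))
```

If the data are not centered, the two quantities generally differ, which matches the remark above that cov(X, Y) is not X'Y in general.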

0 Answers