
Say we have an $N \times q$ matrix $Y$ with $N>q$. Also, we have an $N \times p$ data matrix $X$.

We are interested in the model $Y = XW + \epsilon$, where $W$ is a $p \times q$ matrix with $q<p$.

1) Is this possible and how would you do it? 
2) Can you modify PCA to some kind of supervised/guided PCA that takes $Y$ into account?

EDIT: It has come to my attention (thanks to @whuber) that you can estimate (1) by vector regression, keeping the linear formulation.

More generally, given the setup, is it possible to estimate a function $Y = f(X)$, where $f:\mathbb{R}^{N\times p}\to\mathbb{R}^{N\times q}$? I'm happy to hear about machine learning approaches.

adam
  • What is $M$? – amoeba Sep 21 '18 at 08:59
  • Take a look at my answer here: https://stats.stackexchange.com/questions/152517. Does it answer your question? – amoeba Sep 21 '18 at 10:49
  • Could you explain what you mean by (2)? What exactly is intended by the phrase "takes $Y$ into account"? As far as (1) goes, this is ordinary least squares (which you can see by writing out the model in detail). For instance, the least squares fitting function `lm` in `R` automatically models vector-valued responses in this fashion (see the sketch after these comments). – whuber Sep 21 '18 at 12:11
  • @amoeba: $M$ is just the dimension, it could be anything. In my particular application, it's a complicated matrix that arises from multiple inversions of other matrices. But in the end, it's a real-valued matrix. – adam Sep 21 '18 at 15:49
  • @whuber: (2) was just an idea, because $Y$ just has fewer dimensions than $X$, so maybe you could apply some dimensionality reduction method but subject to the fact that the resulting matrix should look similar to $Y$. Is it really a simple OLS when $Y$ and your coefficients are not vectors, but matrices? Can you provide me with a link? Thank you! – adam Sep 21 '18 at 15:52
  • I don't follow that at all because you haven't yet stated what you would be applying PCA to! Would it be $X$, $Y$, or the block matrix $(X\mid Y)$? The reason @amoeba asked about $M$ is that without explaining what it means, its appearance is superfluous, signifying nothing. Did you perhaps intend it to equal $q$? – whuber Sep 21 '18 at 15:57
  • $M$ is a dimension of what?? You said $Y$ is $N \times q$. – amoeba Sep 21 '18 at 16:03
  • @whuber, amoeba: Sorry! I used $M$ to begin with, but I mean $q$. This is a mistake, and I'll update the question right now - apologies! – adam Sep 21 '18 at 16:09
  • @whuber: I'm sorry for not being more precise, it was more a brainstorming idea. Originally, I wanted to apply PCA to $X$, but instead of only maximizing the variance, it should also be subject to something like $Y\approx XW$. But it seems to be what vector regression does, so I'm happy about that! – adam Sep 21 '18 at 16:11
  • You might like to know that your formulation is really just a set of $q$ models, one for each column of $Y.$ This *multivariate* model comes to the fore when the errors $\epsilon$ are not independent: that is, the error in one column may be associated with errors in other columns. (Otherwise, you can just fit each column separately.) Thus, the approach you take, regardless of the form of the regression function $f,$ depends strongly on what you assume about the multivariate distribution of $\epsilon.$ Do you have a particular problem in mind that might help narrow the possibilities? – whuber Sep 21 '18 at 16:24
  • Yes, I have. It's a bit complicated, but I will try. I have access to an $N$-dim vector $Y$, an $N \times p$ matrix $X$ and, finally, a $q$-dim vector $Z$. Ideally, I'd like to learn $f$ and $g$ such that this holds: $\underset{N\times1}{\underbrace{Y}}=\underset{N\times K}{\underbrace{f\left(X\right)}}\underset{K\times1}{\underbrace{g\left(Z\right)}}$ – adam Sep 21 '18 at 16:33
  • My idea was a two-step iterative approach. First, assuming knowledge of $f(X)$, I could rewrite the model as $\underset{K\times1}{\underbrace{\left(f\left(X\right)'f\left(X\right)\right)^{-1}f\left(X\right)'Y}}=\underset{K\times1}{\underbrace{g\left(Z\right)}}$ and then try to learn $g$ from the $q$-vector $Z$. Second, assuming knowledge of $g(Z)$, I would rewrite the model to get $Yg\left(Z\right)'\left(g\left(Z\right)g\left(Z\right)'\right)^{-1}=f\left(X\right)$, where I would learn $f$ from the $N \times p$ matrix $X$. – adam Sep 21 '18 at 16:36
  • @whuber: I've posted this question as a separate question: https://stats.stackexchange.com/questions/368055/decomposition-of-vector-into-product-of-a-function-on-a-matrix-and-a-function-on. I would appreciate inputs! – adam Sep 21 '18 at 16:52
  • @amoeba: I found my answer to the original question in your post - closing this question! – adam Sep 21 '18 at 16:54
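
To make @whuber's point about `lm` concrete, here is a minimal sketch with simulated data (the dimensions and noise level are made up for illustration). With a matrix response, `lm` fits all $q$ column-wise regressions at once, and the result matches the closed-form OLS solution $\hat{W} = (X'X)^{-1}X'Y$:

```r
## Minimal sketch (simulated data; dimensions chosen for illustration).
## With a matrix response, lm() fits one OLS regression per column of Y.
set.seed(1)
N <- 100; p <- 5; q <- 2
X <- matrix(rnorm(N * p), N, p)
W <- matrix(rnorm(p * q), p, q)                     # true coefficients
Y <- X %*% W + matrix(rnorm(N * q, sd = 0.1), N, q) # Y = XW + noise

fit   <- lm(Y ~ X - 1)  # "- 1" drops the intercept to match Y = XW exactly
W_hat <- coef(fit)      # p x q matrix: one coefficient column per response

## The same estimate in closed form: (X'X)^{-1} X'Y
W_ols <- solve(crossprod(X), crossprod(X, Y))
all.equal(unname(W_hat), unname(W_ols))             # TRUE
```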

1 Answer


All credit to @amoeba and @whuber for their helpful comments! I slightly altered the question and posted it as a new question: Decomposition of vector into product of a function on a matrix and a function on a vector - Possible?

The original regression can be solved by vector regression, also known as multivariate multiple regression (thanks to @whuber for pointing it out in the comments). More "machine-learning"-style approaches include reduced-rank regression; for a nice brief overview, see @amoeba's post: What is "reduced-rank regression" all about?
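
If a concrete starting point helps, below is a rough sketch of reduced-rank regression along the lines of the linked post: fit ordinary multivariate OLS, then project the coefficients onto the top-$r$ right singular vectors of the fitted values. This is a toy implementation of the classical rank-constrained least-squares solution, and the function name `reduced_rank_fit` is mine, not from any package:

```r
## Rough sketch of reduced-rank regression (toy implementation):
## multivariate OLS followed by a rank-r truncation of the coefficient
## matrix via the SVD of the fitted values X B_ols.
reduced_rank_fit <- function(X, Y, r) {
  B_ols <- solve(crossprod(X), crossprod(X, Y))  # p x q OLS coefficients
  s     <- svd(X %*% B_ols)                      # SVD of fitted values
  V_r   <- s$v[, 1:r, drop = FALSE]              # top-r right singular vectors
  B_ols %*% V_r %*% t(V_r)                       # rank-r coefficient matrix
}

## Usage, e.g. with the simulated X and Y from the sketch in the comments:
## B_rrr <- reduced_rank_fit(X, Y, r = 1)
```

For $r = q$ this reduces to ordinary multivariate OLS; smaller $r$ forces the $q$ response columns to share a common low-dimensional projection of $X$, which is close to the "supervised/guided PCA" flavour asked about in (2).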

adam