
In grad school, I was always taught the general linear model $$\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\epsilon\tag{1}$$ where $\mathbf{y}$ is a vector, $\mathbf{X}$ is some matrix, $\boldsymbol\beta$ is a parameter vector, and $\boldsymbol\epsilon$ is a vector of error terms satisfying $\mathbb{E}[\boldsymbol\epsilon] = \mathbf0$.

However, in Izenman's Modern Multivariate Statistical Techniques, the following model is used instead to explain multivariate linear regression, as well as the reduced-rank regression model: $$\mathbf{y} = \boldsymbol\mu + \mathbf{C}\mathbf{x}+\boldsymbol\epsilon\tag{2}$$ where $\mathbf{y}, \boldsymbol\mu, \boldsymbol\epsilon \in \mathbb{R}^s$, $\mathbf{C}$ is an $s \times r$ matrix, and $\mathbf{x} \in \mathbb{R}^r$.

At first, this seems confusing, but I think I understand now why Izenman does this: it's so that the covariance matrix of $\begin{bmatrix}\mathbf{x} \\ \mathbf{y}\end{bmatrix}$ is well-defined. Note that in this case, $\mathbf{C}$ is taken to be a matrix of parameters and $\mathbf{x}$ is the design matrix (vector?).

Is there a way to rewrite $(2)$ in the form of $(1)$? I imagine $(2)$ could be reformulated to look like $(1)$ somehow, but I can't get it to work.

Clarinetist
    It's more likely because Izenman's definition of $x$ does not include a constant term, so he needs to subtract the mean from $y$ in order to have the mean of $\epsilon = 0$. The way to change it into the first form is to add a column of $1$s to $x$ and expand the coefficient matrix to include one more term. – jbowman Oct 18 '17 at 13:44
  • @jbowman Yes, but it's not only this: see my answer. – amoeba Dec 12 '17 at 10:10
  • In Izenman (1975) the matrices $X$ and $Y$ are turned sideways. GLM would have $X$ as $N \times r$, one row per case, but the original RRR paper models $Y$ as $r \times N$, and similarly for $X$. So rather than $X\beta$, RRR has $CX$. For me, that's confusing. It all works out, but the notation is difficult when comparing documents. – pauljohn32 Feb 13 '19 at 10:56

1 Answer


In ordinary multiple regression the response variable $y$ is one-dimensional, so for each sample we can write an equation $$y = \boldsymbol \beta^\top \mathbf x + \epsilon,\tag{1a}$$ where $\mathbf x$ is an $r$-dimensional vector of predictors and $\boldsymbol\beta$ is the vector of regression coefficients. If the sample size is $n$, we can combine $n$ such equations into one, stacking all $y_i$ into one vector $\mathbf y$ and all $\mathbf x_i$, as rows, into one data matrix $\mathbf X$. This yields the form that you gave as Equation 1: $$\mathbf y = \mathbf X\boldsymbol\beta + \boldsymbol\epsilon.\tag{1b}$$
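The equivalence of the per-sample form (1a) and the stacked form (1b) is easy to check numerically. Here is a small numpy sketch (not from the original answer; the sample size, dimensions, and coefficient values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 100, 3                       # sample size, number of predictors

X = rng.normal(size=(n, r))         # data matrix: row i is x_i^T
beta = np.array([1.5, -2.0, 0.5])   # coefficient vector
eps = rng.normal(scale=0.1, size=n)

# Per-sample form (1a): y_i = beta^T x_i + eps_i, one equation per sample
y_rowwise = np.array([beta @ X[i] for i in range(n)]) + eps

# Stacked form (1b): y = X beta + eps, all n equations at once
y_stacked = X @ beta + eps

assert np.allclose(y_rowwise, y_stacked)
```

Stacking the $\mathbf x_i$ as rows (rather than columns) is what makes the matrix product $\mathbf X\boldsymbol\beta$ reproduce each inner product $\boldsymbol\beta^\top\mathbf x_i$.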

Your Equation 2 describes a single sample of multivariate regression, where the response variable $\mathbf y$ is a vector. Using similar notation as above, we could write for each sample $$\mathbf y = \mathbf B^\top\mathbf x + \boldsymbol\epsilon,\tag{2a}$$ with the only difference to your Equation 2 being that here the intercept is included in $\mathbf x$, i.e. the first element of $\mathbf x$ is always equal to $1$, and so $\mathbf B^\top = [\boldsymbol \mu, \mathbf C]$, with $\boldsymbol\mu$ and $\mathbf C$ stacked side by side. Here we can also combine $n$ such equations together by using data matrices: $$\mathbf Y = \mathbf {XB} + \mathbf E.\tag{2b}$$
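The absorption of the intercept into the coefficient matrix can also be verified directly: prepending a constant $1$ to $\mathbf x$ and forming $\mathbf B^\top = [\boldsymbol\mu, \mathbf C]$ reproduces Izenman's $\boldsymbol\mu + \mathbf C\mathbf x$ exactly. A quick numpy sketch (dimensions and values are arbitrary; the error term is omitted since it is identical on both sides):

```python
import numpy as np

rng = np.random.default_rng(1)
s, r = 2, 3                         # y in R^s, x in R^r

mu = rng.normal(size=s)             # intercept vector mu
C = rng.normal(size=(s, r))         # s x r coefficient matrix
x = rng.normal(size=r)

# Izenman's form (2): mu + C x
y_izenman = mu + C @ x

# GLM-style form (2a): B^T x_aug, with the constant 1 absorbed into x
x_aug = np.concatenate(([1.0], x))  # augmented predictor, first element 1
B_T = np.column_stack([mu, C])      # B^T = [mu, C], shape s x (r+1)
y_glm = B_T @ x_aug

assert np.allclose(y_izenman, y_glm)
```

This is exactly the trick jbowman describes in the comments: add a column of $1$s to the predictors and expand the coefficient matrix by one column.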

The key point, and probably the source of your confusion, is that $\mathbf y$ in Equations (1b) and (2a) above denotes two very different things! In (1b) it is an $n$-dimensional data vector comprising $n$ one-dimensional sample points, while in (2a) it is an $s$-dimensional response vector, i.e. one single $s$-dimensional sample point.

amoeba