
Edited:

I would like to work out the above relationship, more precisely:

Let $(Y_{1}, ..., Y_{m})$ be a zero-mean vector with covariance matrix $\Sigma$, and let $S \subset \{1, ..., m\}.$ The best linear predictor of $Y_{S}$ upon $Y_{\setminus{S}}$, defined as $\tilde{\beta}_{S} = \Sigma^{-1}_{\setminus{S}, \setminus{S}} \Sigma_{\setminus{S}, S}$, is a matrix multiple of $\Sigma^{-1}_{\setminus{S}, S}$.

It is assumed that $(Y_{1}, ..., Y_{m})$ is jointly normally distributed.

The above statement seems to imply that there is a sequence of steps leading from

$\tilde{\beta} = [X^{T}X]_{\setminus{S}, \setminus{S}}^{-1} [X^{T}Y]_{\setminus{S}, S}$

to

$[X^{T}X]_{\setminus{S}, S}^{-1}\, c$, where $c$ is some scalar.

I would be interested in those steps.
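
For illustration, here is a small numerical check of the proportionality I would like to prove (a sketch in numpy with an arbitrary positive-definite $\Sigma$ and $S$ containing a single index; the names `S` and `notS` are just my shorthand):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary positive-definite covariance matrix for (Y_1, ..., Y_m), m = 4
A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 4 * np.eye(4)

S = [0]                      # S: a single index; notS: its complement
notS = [1, 2, 3]

# Best linear predictor coefficients of Y_S on Y_{\S}
beta = np.linalg.solve(Sigma[np.ix_(notS, notS)], Sigma[np.ix_(notS, S)])

# Corresponding off-diagonal block of the precision matrix K = Sigma^{-1}
K = np.linalg.inv(Sigma)
K_notS_S = K[np.ix_(notS, S)]

# The elementwise ratio is constant: the two vectors are proportional
print(beta / K_notS_S)
```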

Thanks!

jmb
  • Your question seems to indicate $X^\prime Y$ is a scalar--but it never is unless $\beta$ has just a single component (that is, can be considered itself a scalar). Although it's not completely clear what "proof" you are referring to, perhaps you will find your questions answered simply by counting dimensions. – whuber Sep 18 '14 at 20:55
  • Assuming all $X$ columns are centered, $X'Y$ is the covariance of $Y$ (or $X_{1}$) with $X$ (or all remaining $X_{\setminus s}$) - sorry for using two different notations. The dimensions do match, but I was looking for a way to explicitly work out this relationship for $p$ dimensions. – jmb Sep 18 '14 at 21:02
  • That is done in many threads on this site. Although they may be hard to find--one can't search for the $\TeX$ expressions that would appear--I recall providing one answer at http://stats.stackexchange.com/questions/54943. I am sure others can be found; consider searching on keywords such as "multiple", "regression," and "formula." Ah... here is a [formal Calculus-based derivation](http://stats.stackexchange.com/a/46171/919). – whuber Sep 18 '14 at 21:06
  • Maybe I am missing the point, but I was not asking for the derivation of $\beta$ but how to show that the above mentioned submatrix of the inverse covariance is a scalar multiple of the limit of the linear regression vector for $X_{s}$ on all other variables. Also I would be interested in the meaning/interpretation of this scalar. – jmb Sep 18 '14 at 21:22
  • I think I begin to see what you want, but it is not at all evident what you mean by a "limit of the linear regression vector." I still cannot even find an explicit, unambiguous statement of the relationship you seek to prove. Perhaps the malformed $\TeX$ expressions prevent your intentions from being understood. If you don't mind, please edit the question to reflect the content of your comment, explain your notation, and make it clearer how this differs from the other questions. – whuber Sep 18 '14 at 21:57
  • Thanks. I restated it more precisely, now with correct TeX expressions and coherent notation. The statement of the relationship I would like to prove is in the quote. – jmb Sep 19 '14 at 06:19
  • Thank you! I take it that "$\setminus S$" refers to the complement of $S$ in $\{1,2,\ldots, m\}$. For the connection between covariance matrices and the traditional least squares formula for $\tilde{\beta}$, please see http://stats.stackexchange.com/a/108862. That might give you some clues about how to proceed here. Additional clues come from the one-variable-at-a-time characterization of multiple regression, as explained http://stats.stackexchange.com/a/113207/919 and http://stats.stackexchange.com/a/46508/919. – whuber Sep 19 '14 at 14:08

1 Answer


Here is the solution:

Starting with the regression equation:

$y = Xb$

$X^{T} y = X^{T} X b$

$[X^{T}X]^{-1} X^{T} y = [X^{T}X]^{-1} [X^{T} X] b $

$[X^{T}X]^{-1} X^{T} y = Ib$

$[X^{T}X]^{-1} X^{T} y = b$
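
As a quick sanity check of these steps, here is a small simulation (a sketch; the data-generating coefficients are arbitrary) comparing the normal-equations solution with numpy's least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3

X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=n)

# Normal equations: b = (X^T X)^{-1} X^T y
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from numpy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(b_normal, b_lstsq))   # True
```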

Partitioning the covariance matrix:

Let $X_{1}, \dots, X_{p}$ be zero-mean random variables. Divide them into two subsets $X_{s}$ and $X_{\setminus{s}}$, with $X_{s} \in \mathbb{R}$ and $X_{\setminus{s}} \in \mathbb{R}^{p-1}$.

$\Sigma_{11}$ is the $1\times1$ matrix containing the variance of $X_{s}$, $\Sigma_{22}$ is the $(p-1)\times (p-1)$ covariance matrix of $X_{\setminus{s}}$, and $\Sigma_{21} = \Sigma_{12}^{T}$ is the $(p-1)\times1$ vector of covariances between $X_{\setminus{s}}$ and $X_{s}$.

$\Sigma = \left[\begin{array}{cc} \Sigma_{11}&\Sigma_{12}\\ \Sigma_{21}&\Sigma_{22} \end{array}\right] $

Getting the inverse covariance matrix by blockwise inversion:

using: http://en.wikipedia.org/wiki/Schur_complement

Inverse covariance matrix:

$\Sigma^{-1} = \left[\begin{array}{cc} I&0\\ -\Sigma^{-1}_{22}\Sigma_{21}&I \end{array}\right] \left[\begin{array}{cc} \Sigma^{-1}_{11.2}&0\\ 0&\Sigma^{-1}_{22} \end{array}\right] \left[\begin{array}{cc} I&-\Sigma_{12}\Sigma^{-1}_{22}\\ 0&I\\ \end{array}\right] $

$= \left[\begin{array}{cc} \Sigma^{-1}_{11.2}&-\Sigma^{-1}_{11.2} \Sigma_{12}\Sigma^{-1}_{22}\\ -\Sigma^{-1}_{22} \Sigma_{21}\Sigma^{-1}_{11.2}&\Sigma^{-1}_{22}+\Sigma^{-1}_{22}\Sigma_{21}\Sigma^{-1}_{11.2}\Sigma_{12}\Sigma^{-1}_{22}\\ \end{array}\right] $

where $\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12} \Sigma^{-1}_{22} \Sigma_{21} $
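
This block-inverse formula can be checked numerically (a sketch with an arbitrary positive-definite $\Sigma$, taking the first variable as block 1 and the remaining $p-1$ variables as block 2):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 5

A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)          # positive-definite covariance

# Block 1 = first variable, block 2 = remaining p - 1 variables
S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

S22_inv = np.linalg.inv(S22)
S11_2 = S11 - S12 @ S22_inv @ S21        # Schur complement of Sigma_22
S11_2_inv = np.linalg.inv(S11_2)

K_blocks = np.block([
    [S11_2_inv, -S11_2_inv @ S12 @ S22_inv],
    [-S22_inv @ S21 @ S11_2_inv,
     S22_inv + S22_inv @ S21 @ S11_2_inv @ S12 @ S22_inv],
])

print(np.allclose(K_blocks, np.linalg.inv(Sigma)))   # True
```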

Showing the equivalence, up to a scalar multiple, of the inverse covariance and the regression coefficients:

As all variables have mean zero, $X^{T}X$ is (up to a factor $n$) the covariance matrix of $X_{\setminus{s}}$ and $X^{T}y$ is (up to the same factor) the vector of covariances of $X_{\setminus{s}}$ with $X_{s}$; since this common factor cancels in $b$, we can identify

$[X^{T}X]^{-1} = \Sigma^{-1}_{22}$ and $X^{T} y = \Sigma_{21}$ (sorry for defining $X$ in two different ways in the regression setup and in the definition of the random variables, but I think it is still clear)

... and we can write: $b = \Sigma^{-1}_{22} \Sigma_{21}$.

Let $K = \Sigma^{-1}$.

Then $K_{21} = -\Sigma^{-1}_{22} \Sigma_{21} \Sigma^{-1}_{11.2}$

As $\Sigma_{11}$ is a $1\times 1$ matrix, $\Sigma_{11.2}$ is a scalar, and so is $\Sigma^{-1}_{11.2}$.

Therefore we can rearrange $K_{21} = [- \Sigma^{-1}_{11.2}] \Sigma^{-1}_{22} \Sigma_{21}$

We can now write $K_{21} = [- \Sigma^{-1}_{11.2}] b$.

As $[- \Sigma^{-1}_{11.2}]$ is a scalar, this is exactly the relationship claimed in the question. The scalar also has a direct interpretation: from the block inverse above, $K_{11} = \Sigma^{-1}_{11.2}$, so $b = -K_{21}/K_{11}$, and $\Sigma_{11.2}$ is the residual (conditional) variance of $X_{s}$ given $X_{\setminus{s}}$.
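
Finally, the whole chain can be verified numerically (a sketch: build an arbitrary covariance matrix, compute the population regression coefficients $b = \Sigma^{-1}_{22}\Sigma_{21}$, and compare with the corresponding block of $K = \Sigma^{-1}$):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4

A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)          # covariance of (X_s, X_{\s})

# Population regression coefficients of X_s on X_{\s}: b = Sigma_22^{-1} Sigma_21
b = np.linalg.solve(Sigma[1:, 1:], Sigma[1:, 0])

K = np.linalg.inv(Sigma)                 # precision matrix
K21 = K[1:, 0]
K11 = K[0, 0]                            # equals Sigma_{11.2}^{-1}

# Claimed relationship: K21 = [-Sigma_{11.2}^{-1}] b, i.e. b = -K21 / K11
print(np.allclose(K21, -K11 * b))        # True
```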

jmb