
The matrix formulation of multiple regression for $n$ observations is $$ \mathbf{Y} = \mathbf{X} \beta + \varepsilon, $$ where the error $\varepsilon$ has mean zero and finite variance $\sigma^2$. Let $\mathbf{b}$ be the vector of estimated coefficients obtained by solving the multiple regression problem with least squares.

In Theorem 4.3 of the book Econometric Analysis by William H. Greene, it is stated that $\mathbf{b}$ is asymptotically distributed as $$ \mathbf{b} \sim \mathcal{N}\bigg(\beta,\frac{\sigma^2}{n}Q^{-1}\bigg), $$ where $Q$ is the positive definite matrix defined in equation 4.19 as \begin{align*} Q := \text{plim}_{n \to \infty} \frac{\mathbf{X}^T \mathbf{X}}{n}. \end{align*} Suppose we write the covariance matrix of the $k$ regressors in $\mathbf{X}$ as $$ \text{Cov}[\mathbf{X},\mathbf{X}] = \begin{bmatrix} \sigma_{X_1 X_1} & \sigma_{X_1 X_2} & \dots & \sigma_{X_1 X_k} \\ \sigma_{X_2 X_1} & \sigma_{X_2 X_2} & \dots & \sigma_{X_2 X_k} \\ \vdots & & \ddots & \vdots \\ \sigma_{X_k X_1} & \sigma_{X_k X_2} & \dots & \sigma_{X_k X_k} \end{bmatrix}, $$ where $$ \sigma_{X_i X_j} = E[(X_i-E[X_i])(X_j-E[X_j])]. $$

I want to know how the matrix $(\mathbf{X}^T \mathbf{X})^{-1}$ relates to the covariance matrix $\text{Cov}[\mathbf{X},\mathbf{X}]$.

For example, if we modify the elements of $\mathbf{X}$ to make its columns more collinear, the off-diagonal elements of $\text{Cov}[\mathbf{X},\mathbf{X}]$ will increase in magnitude and $\mathbf{X}^T \mathbf{X}$ will become ill-conditioned, i.e. 'harder' to invert. But I am looking for a precise mathematical expression that relates the two matrices.
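A quick numerical illustration of this collinearity effect (my own sketch, assuming numpy; the variable names are made up): as the two columns of $\mathbf{X}$ become more nearly collinear, the condition number of $\mathbf{X}^T \mathbf{X}$ blows up.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

# As eps shrinks, x2 hugs x1 and X^T X approaches singularity.
conds = []
for eps in (1.0, 0.1, 0.01):
    x2 = x1 + eps * noise
    X = np.column_stack([x1, x2])
    conds.append(np.linalg.cond(X.T @ X))
    print(f"eps={eps:5.2f}  cond(X^T X)={conds[-1]:.3e}")
```

The printed condition numbers grow by roughly two orders of magnitude per decade of `eps`, since $\text{cond}(\mathbf{X}^T \mathbf{X}) = \text{cond}(\mathbf{X})^2$.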

Can $(\mathbf{X}^T \mathbf{X})^{-1}$ be expressed in terms of $\text{Cov}[\mathbf{X},\mathbf{X}]$ somehow? E.g., can the elements of $(\mathbf{X}^T \mathbf{X})^{-1}$ be written in terms of the elements of $\text{Cov}[\mathbf{X},\mathbf{X}]$?

Bertus101
  • This question is not the same as the linked one. That other question talks about the relationship between $(\mathbf{X}^T \mathbf{X})^{-1}$ and $\text{Cov}[\mathbf{X},\mathbf{y}]$ but it doesn't specify the relationship between $(\mathbf{X}^T \mathbf{X})^{-1}$ and $\text{Cov}[\mathbf{X},\mathbf{X}]$. – Bertus101 Dec 01 '20 at 16:25
  • On the contrary, the duplicate is *exactly* the same. It specifies the relationship in great detail (as a row reduction of one matrix to produce the other). Note that the expression "$(X^\prime X)^{-1}$" is not generally well-defined as a matrix, but when it is understood as an instruction to solve a system of linear equations, it *is* well-defined. – whuber Dec 01 '20 at 17:27
  • For $X \in \mathbb{R}^{n \times d}$ with row mean $\mu \in \mathbb{R}^d$, the covariance matrix can be written as $C = \frac{1}{n} X^T X - \mu \mu^T$. Therefore, $X^T X = n (C + \mu \mu^T)$, and taking the inverse gives $(X^T X)^{-1} = \frac{1}{n} (C + \mu \mu^T)^{-1}$ – user20160 Dec 01 '20 at 20:43
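user20160's identity can be checked numerically (a sketch assuming numpy; note it uses the $\frac{1}{n}$-normalised covariance, not the unbiased $\frac{1}{n-1}$ version):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 3
X = rng.normal(size=(n, d)) + 2.0    # nonzero column means, so the mu term matters

mu = X.mean(axis=0)                  # row mean of X, a vector in R^d
C = X.T @ X / n - np.outer(mu, mu)   # (1/n)-normalised covariance matrix

lhs = np.linalg.inv(X.T @ X)
rhs = np.linalg.inv(C + np.outer(mu, mu)) / n

ok = np.allclose(lhs, rhs)           # the two sides agree up to rounding
print(ok)
```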

0 Answers