
I am currently working through understanding the mechanics of OLS estimates and the hat matrix. One thing I have been searching for without luck is how we know that the term $X'X$ is invertible, where $X'$ denotes the transpose of $X$.

I understand that $X'X$ is a symmetric matrix, but I also know that being symmetric alone does not guarantee nonsingularity.

For reference I am referring to this equation:

$$ H = X(X'X)^{-1}X' $$

Any help with this is greatly appreciated.


Edit: Through the helpful answers below and a few other Google searches, I think I have found an answer to my question (at least for most cases).

When performing OLS, we organize our data into a matrix $X$ with $n$ observations (rows) and $p$ parameters (columns), and in almost every case $n > p$. Having $n > p$ is necessary (though not sufficient) for the columns of $X$ to be linearly independent, and in practice they usually are unless there is exact collinearity. When the columns of $X$ are linearly independent, $X'X$ is a $p \times p$ matrix whose columns are also linearly independent. Because $X'X$ is a square matrix (rows equal columns), its rows must then be linearly independent as well (i.e. $\text{rank}(X'X) = p$, aka "full rank"). A full rank square matrix is always invertible.

Please correct me if I am wrong here, but I think the logic follows.
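As a quick numerical sanity check of this reasoning (a minimal sketch using numpy, with made-up data):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 3                       # n > p, the usual OLS setup
X = rng.normal(size=(n, p))         # random columns are linearly independent (with probability 1)

XtX = X.T @ X
print(np.linalg.matrix_rank(X))     # p: X has full column rank
print(np.linalg.matrix_rank(XtX))   # p: X'X is full rank, hence invertible

# The hat matrix from the question
H = X @ np.linalg.inv(XtX) @ X.T
print(np.allclose(H @ X, X))        # True: H projects onto the column space of X
```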

I used these questions as resources:

https://math.stackexchange.com/questions/2430179/if-x-is-linearly-independent-prove-xtx-is-positive-definite

https://math.stackexchange.com/questions/691812/proof-of-when-is-a-xtx-invertible

https://math.stackexchange.com/questions/214542/linear-independent-sets-of-non-square-matricies

samvoit4
  • It isn't necessarily nonsingular. For those concerned about the singular case, you have to understand the inverse as a *generalized inverse.* – whuber Sep 07 '18 at 20:53
  • Am I understanding you correctly that the inverse of the $X'X$ matrix can be thought of as a generalized inverse in this case? Analogous to $(n)1/n$? That goes against what I thought was a fundamental property of matrices (not all are invertible). – samvoit4 Sep 07 '18 at 21:36
  • See https://en.wikipedia.org/wiki/Generalized_inverse. I believe your question may have answers at https://stats.stackexchange.com/questions/63143/ and https://stats.stackexchange.com/questions/84036. – whuber Sep 07 '18 at 21:40
  • To answer the title question, all you need to do is to calculate the determinant of the matrix. If the determinant is zero, it is singular; if not, it is non-singular. – Ben Sep 07 '18 at 23:57
  • This is the second of two posts about matrix intuition that you may find useful: https://blog.stata.com/2011/03/09/understanding-matrices-intuitively-part-2/ – dimitriy Sep 08 '18 at 02:57
  • @Ben It seems like this means that OLS regression is impossible when the matrix resulting from $(X'X)$ has a zero determinant. Is there a reason that $det(X'X) = 0$ could never be true? It seems that any chance of $det(X'X) = 0$ implies that the formula for OLS regression cannot be said to be generally true which is rather troubling. – samvoit4 Sep 08 '18 at 14:57
  • That conclusion does not follow. The Normal equations can still be solved even when the determinant of $X^{\prime} X$ is zero. That $X^\prime X$ *can* be singular is evident: simply repeat one column in $X$, for instance. This sort of thing happens so often that all general-purpose OLS software will automatically handle it (typically by dropping the smallest number of columns needed to make the design matrix of full rank). – whuber Sep 08 '18 at 15:56
  • @whuber It sounds like what you are saying is that $X'X$ can be singular and we just have software handle those cases. It sounds like I am right in thinking that the equation for minimizing $\beta$ cannot be said to be generally true, but it can be coerced into being true. – samvoit4 Sep 08 '18 at 19:09
  • I don't see how you can read my comments as stating that. At the outset I pointed out how the formula is understood as meaning the generalized inverse in case $X^\prime X$ is not invertible. That makes it thoroughly true in every circumstance. The formula for minimizing $\beta$ is known as the *normal equations:* it is always correct; there's no "coercion" of any sort. – whuber Sep 08 '18 at 21:42

1 Answer


It's a property of the $\text{rank}$ operator when it is applied to real matrices $\mathbf{A}$: $$ \text{rank}(\mathbf{A}) = \text{rank}(\mathbf{A}') = \text{rank}(\mathbf{A}'\mathbf{A}) = \text{rank}(\mathbf{A}\mathbf{A}'). $$
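A quick numerical illustration of this rank identity (a sketch with an arbitrary real matrix, deliberately built to be rank-deficient):

```python
import numpy as np

rng = np.random.default_rng(1)
# A 6x4 matrix of rank 2, so the identity is checked on a
# rank-deficient case, not just the full-rank one.
A = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))

r = np.linalg.matrix_rank
print(r(A), r(A.T), r(A.T @ A), r(A @ A.T))  # prints: 2 2 2 2
```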

In your case, the data matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$ is usually tall and skinny ($n > p$), so the rank of everything is the number of linearly independent columns/predictors/covariates/independent variables. If the columns are all linearly independent, then $\text{rank}(\mathbf{X}) = p$, and so $\mathbf{X}'\mathbf{X}$ is invertible. If you have collinearity, i.e. columns that can be written as linear combinations of others, then $\text{rank}(\mathbf{X}) < p$, and you cannot find a unique inverse for $\mathbf{X}'\mathbf{X}$ (you can, however, find generalized inverses for it).
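Here is a small sketch of the collinear case (numpy, with a deliberately duplicated column as in whuber's comment above; the data is made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x = rng.normal(size=(n, 1))
X = np.hstack([np.ones((n, 1)), x, x])   # third column repeats the second

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))        # 2 < p = 3, so X'X is singular
# np.linalg.inv(XtX) would raise LinAlgError here, but the
# Moore-Penrose pseudoinverse always exists:
y = 1.0 + 2.0 * x[:, 0] + rng.normal(scale=0.1, size=n)
beta = np.linalg.pinv(XtX) @ X.T @ y     # one solution of the normal equations

# beta is only one of infinitely many solutions, but the fitted
# values X @ beta are unique -- they match np.linalg.lstsq:
print(np.allclose(X @ beta, X @ np.linalg.lstsq(X, y, rcond=None)[0]))  # True
```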

Taylor