
I'm implementing OLS linear regression in Matlab via the normal equation, without using the built-in functions:

$$\hat{\beta} = \left(\mathbf{X}_{ext}\mathbf{X}_{ext}^{T}\right)^{-1}\mathbf{X}_{ext}\mathbf{y}$$

I know this is probably very basic, but I want to double-check: the input $X$ yields a unique solution, right? I figured there could only be one solution because 1) it's a straight line, 2) the term $\mathbf X_{ext}\mathbf X_{ext}^{T}$ doesn't give a singular-matrix warning when I call inv() on it in Matlab, and 3) rank() on it in Matlab returns 2.
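For reference, here's a minimal sketch of what I'm doing (the data and variable names are made up for illustration):

```matlab
% Minimal sketch (toy data for illustration).
x = [1 2 3 4 5];                    % n = 5 scalar inputs
y = [2.1 3.9 6.2 8.1 9.8];          % responses
X_ext = [ones(1, numel(x)); x];     % 2 x n: row of ones on top of x

% Normal equation in this (rows-as-features) notation:
beta = inv(X_ext * X_ext') * (X_ext * y');   % 2 x 1: [intercept; slope]

rank(X_ext * X_ext')    % returns 2 here, so X_ext*X_ext' is invertible
```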

Is this correct? Is there a more fundamental reason, and is there a generalization of this that I should remember?

Also, a small side question: should I always subtract the mean to center my data?

Austin
    About your side question: the interpretation of parameters in models that include interaction effects becomes easier with centering. In particular, a covariate fixed at 0 can then be interpreted as being fixed at its mean. – Carol Eisen Apr 07 '17 at 01:32
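To make the comment concrete, here is a minimal Matlab sketch (toy `x` and `y`, made up for illustration) showing that centering leaves the slope unchanged while the intercept becomes the mean fitted response at $x = \bar x$:

```matlab
% Toy illustration of centering: slope unchanged, intercept reinterpreted.
x = (1:5)';
y = [2.1; 3.9; 6.2; 8.1; 9.8];

X  = [ones(size(x)), x];             % raw design matrix
Xc = [ones(size(x)), x - mean(x)];   % centered predictor

b  = (X'  * X)  \ (X'  * y);         % [intercept; slope]
bc = (Xc' * Xc) \ (Xc' * y);

% bc(2) equals b(2): centering does not change the slope.
% bc(1) equals mean(y): the intercept is now the fitted value at x = mean(x).
```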

1 Answer


I'm going to transpose your notation so that it is consistent with how these things are usually written. You have a predictor matrix $X \in \mathbb R^{n \times p}$ (with the first column taken to be all $1$s) and a random response $Y \in \mathbb R^n$. We are going to fit the model $E(Y|X) = X\beta$. Your case corresponds to $p=2$.

In general you know that $$\hat \beta = \textrm{argmin}_{b \in \mathbb R^p} ||Y - Xb||^2.$$ If $X$ has full column rank then this problem is strictly convex and therefore the minimizer is unique.

Now suppose that $X$ is not full rank. Then the problem is no longer strictly convex and there are infinitely many solutions. In your case, this happens when $x_1 = \dots = x_n = c$ for some constant $c \in \mathbb R$.

Note that $X$ has full column rank $\iff$ $X^TX$ is invertible.
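As a quick numerical sanity check of this equivalence (a sketch with made-up designs, in the $n \times p$ orientation used here):

```matlab
% Full-rank case: distinct x values.
X = [ones(5,1), (1:5)'];
rank(X)             % 2, full column rank
rank(X' * X)        % 2, so X'X is invertible

% Rank-deficient case: a constant predictor duplicates the intercept column.
Xbad = [ones(5,1), 3 * ones(5,1)];
rank(Xbad)          % 1
rank(Xbad' * Xbad)  % 1, so Xbad'*Xbad is singular
```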

We could also use a more direct linear algebra argument rather than a convexity argument. We know that a solution $\hat \beta$ to $\min ||Y - Xb||^2$ is a solution to the normal equations $$ X^TY = X^TX \hat \beta, $$ which come from setting the gradient of $||Y - Xb||^2$ with respect to $b$ to zero.

Suppose $\hat \beta$ and $\tilde \beta$ both satisfy the normal equations so that $$ (X^TX)\hat\beta = (X^TX)\tilde\beta $$

If $X$ is full rank, or equivalently $X^TX$ is invertible, then $X^TX\hat\beta = X^TX\tilde\beta \implies \hat\beta = \tilde\beta$. In terms of linear transformations, $X^TX$ being invertible means it's a bijection (and therefore 1-1) so this is just saying $f(a) = f(b) \implies a = b$ for a bijection $f$.

Now if $X^TX$ is singular then its null space $N(X^TX)$ is at least 1-dimensional, so take a nonzero $v \in N(X^TX)$. This gives us $$ X^TX\hat\beta = X^TX(\hat \beta + v) $$ so $\hat\beta$ and $\hat\beta + v$ are distinct solutions to the normal equations: solutions need not be unique.
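Here is a small numerical illustration of this non-uniqueness (a sketch; the data are made up, with a constant predictor so that $X$ is rank-deficient):

```matlab
% x_1 = ... = x_n = 3, so the second column is 3 times the first.
X = [ones(5,1), 3 * ones(5,1)];
y = (1:5)';

A    = X' * X;        % singular 2 x 2 matrix
bhat = pinv(X) * y;   % one least-squares solution (the minimum-norm one)
v    = null(A);       % nonzero vector in N(X'X)

% Both bhat and bhat + v satisfy the normal equations:
A * bhat       - X' * y   % ~ [0; 0] up to rounding
A * (bhat + v) - X' * y   % ~ [0; 0] as well
```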

jld