Adding a Constant to Every Column of X (OLS)

Question

In OLS, if I have design matrix X (an NxK matrix of full column rank) and I add a constant, such as 2, to every entry of X, how does that change my estimators?

Let's denote $\tilde{X} = X + 2$.

I can't compute the OLS estimator $\beta_{OLS} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'y $ because $\tilde{X}$ doesn't have full column rank (or does it? If so, I cannot prove it).

I suspect that my intercept term will change while the other coefficients will not, but I'm having trouble proving it.

To be precise, are you asking about (i) adding a constant 2 to every entry of matrix $X$, (ii) appending a row to $X$ where every entry in the new row is 2, or (iii) appending a column to $X$ where every entry in the new column is 2? — Matthew Gunn, May 16 '18 at 22:00
There's no reason why adding 2 to every element of a full-rank $X$ should, in general, make $\tilde X$ be less than full rank. — shadowtalker, May 16 '18 at 22:12
If the design matrix contains a column for an intercept term then this will cancel out the added constant. — Sextus Empiricus, May 16 '18 at 22:26
@shadowtalker That's not true for all possible matrices. Consider $X=-2$. $\operatorname{rank}(X)=1$ but $\operatorname{rank}(X+2) = 0$. Consider $A = \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix}$ then $\operatorname{rank}(A)=2$ but $\operatorname{rank}(A+2) = 1$. — Matthew Gunn, May 16 '18 at 22:31
I meant (i) adding a constant 2 to every entry of matrix $X$. — FWL, May 16 '18 at 23:22

Sextus Empiricus · Accepted Answer · 2018-05-16T23:32:05.377

Rank

When one of the columns is constant (an intercept term) then you can use: https://math.stackexchange.com/questions/676333/prove-that-if-ranka-n-then-rankab-rankb

For $X_{m \times n}$ and $Z_{n \times k}$, if $Z$ has rank $n$, then

$$\operatorname{rank}(XZ) = \operatorname{rank}(X)$$

Adding the constant can be expressed as multiplying $X$ by an $n \times n$ matrix $Z$ of rank $n$. To build $Z$, take the identity matrix and add the constant $x$ (for example $x=2$; $x$ cannot be $-1$, since $\det Z = 1 + x$) to every entry of the row $i$ that corresponds to the intercept column: $$Z = I + C, \qquad \text{with $c_{jk}=x$ if $j=i$ and $c_{jk}=0$ otherwise }$$

For instance:

$$\small\begin{bmatrix}1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \\ 1 & 5 & 25 & 125 \\ 1 & 6 & 36 & 216 \\ \end{bmatrix} \times \begin{bmatrix}3 & 2 & 2 & 2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix}1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \\ 1 & 5 & 25 & 125 \\ 1 & 6 & 36 & 216 \\ \end{bmatrix} + \begin{bmatrix}2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ \end{bmatrix}$$
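The identity above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`X_tilde`, `same_matrix`, and so on) are chosen here for illustration and are not part of the original answer.

```python
import numpy as np

# The 6 x 4 design matrix from the example: columns 1, t, t^2, t^3 for t = 1..6.
t = np.arange(1, 7)
X = np.vander(t, N=4, increasing=True).astype(float)

x = 2.0            # constant added to every entry of X
Z = np.eye(4)
Z[0, :] += x       # add x along the row that matches the intercept column

X_tilde = X + x    # broadcasting adds x to every entry

# X @ Z should reproduce X + x, and the rank should be preserved.
same_matrix = np.allclose(X @ Z, X_tilde)
rank_X = np.linalg.matrix_rank(X)
rank_X_tilde = np.linalg.matrix_rank(X_tilde)
print(same_matrix, rank_X, rank_X_tilde)
```

Because the first row of `Z` is `[3, 2, 2, 2]` and the remaining rows are those of the identity, `Z` is invertible, which is what keeps the rank unchanged.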

Estimators change

You can view OLS as the projection of the observations $y$ onto the span of the columns of $X$. Adding the constant does not change that span (provided $X$ contains an intercept term), so the fitted values are unchanged: $\tilde{y}_{OLS}=y_{OLS}$

You can use the same matrix $Z$ to show how the coefficients change: since $\tilde X = XZ$, equal fitted values $\tilde X \tilde\beta_{OLS} = X\beta_{OLS}$ give $Z \tilde\beta_{OLS} = \beta_{OLS}$, so all the coefficients stay the same except the one related to the intercept.
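As a rough numerical illustration of the relation $Z \tilde\beta_{OLS} = \beta_{OLS}$, the sketch below fits both regressions with `np.linalg.lstsq`. The response `y` is random noise generated only for this check, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same polynomial design matrix as in the example, with an intercept column.
t = np.arange(1, 7)
X = np.vander(t, N=4, increasing=True).astype(float)
y = rng.normal(size=6)   # arbitrary response, for illustration only

x = 2.0
Z = np.eye(4)
Z[0, :] += x             # Z = I + C, with the constant in the intercept row
X_tilde = X + x          # equals X @ Z

# Least-squares fits on the original and the shifted design matrix.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_tilde, *_ = np.linalg.lstsq(X_tilde, y, rcond=None)

coef_relation_holds = np.allclose(Z @ beta_tilde, beta)
slopes_unchanged = np.allclose(beta_tilde[1:], beta[1:])
fits_unchanged = np.allclose(X_tilde @ beta_tilde, X @ beta)
print(coef_relation_holds, slopes_unchanged, fits_unchanged)
```

Only the first row of `Z` differs from the identity, so the relation leaves every non-intercept coefficient untouched and absorbs the shift into the intercept.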

The dimension of $Z$ should be $KxK$ in this case, but otherwise, clear answer. Great idea of making $\tilde{X}$ with a linear transformation rather than adding 2's everywhere like I tried. I didn't think of utilizing the first column of ones. — FWL, May 17 '18 at 02:36
@FWL I may have switched some letters. I used an m x n design matrix instead of n x k. I made this switch to match the notation of the linked question. — Sextus Empiricus, May 17 '18 at 06:20

jld · Answer 2 · 2018-05-17T01:51:49.350

$\newcommand{\one}{\mathbf 1}$Others have discussed the effect on the estimator (and +1 to Martijn) but I want to more carefully address the effect of adding a constant to $X$ on the rank of $\tilde X$. For the rank of $\tilde X$, it's not the presence of an intercept by itself that matters but whether the constant column is in the column space of $X$.

Let $\one_k$ be the column vector of $k$ $1$s. Then adding a constant $c$ to every element of $X$ can be done by $$ \tilde X = X + c\one_n\one_p^T $$ so this is a rank 1 update to $X$. It is indeed possible for this to result in $\tilde X$ becoming reduced rank. For instance, if $c=2$ and the first column of $X$ is all $-2$ then we'll get a column of $0$s in $\tilde X$ which means the rank will be at most $p-1$. I'll let $\mathcal C(X)$ denote the column space of $X$ and I'll assume throughout that $c \neq 0$.
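The rank-1 update and the rank drop in this example can be reproduced with a small made-up matrix. This is a sketch assuming NumPy; the particular entries of `X` are invented for illustration, with only the first column of $-2$s taken from the text.

```python
import numpy as np

# Hypothetical 3 x 2 full-column-rank matrix whose first column is all -2.
X = np.array([[-2.0, 1.0],
              [-2.0, 2.0],
              [-2.0, 4.0]])
n, p = X.shape
c = 2.0

# Adding c to every entry, written explicitly as a rank-1 update.
X_tilde = X + c * np.ones((n, 1)) @ np.ones((1, p))

rank_before = np.linalg.matrix_rank(X)
rank_after = np.linalg.matrix_rank(X_tilde)

# A different constant leaves the rank alone in this example.
rank_other_c = np.linalg.matrix_rank(X + 1.0)
print(rank_before, rank_after, rank_other_c)
```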

Result 1: If $\one \notin \mathcal C(X)$ then $\tilde X$ is always full rank, or in other words $\one \in \mathcal C(X)$ is a necessary condition for $\tilde X$ to be reduced rank.

Pf: (by contrapositive) we will suppose $\tilde X$ is reduced rank and will show $\one \in \mathcal C(X)$. So if $\tilde X$ is reduced rank there must be some nonzero $\alpha \in \mathbb R^p$ such that $$ 0 = \tilde X\alpha = X\alpha + c(\one_p^T\alpha)\one_n. $$ Note that if $\alpha^T\one_p = 0$ then we have $X\alpha = 0 \implies \alpha=0$ by $X$ being full column rank, but that's a contradiction, so we must have $\alpha^T\one_p \neq 0$. This means $$ X\alpha = -c(\one_p^T\alpha)\one_n \implies X\left(\frac{-\alpha}{c\alpha^T\one_p}\right) = \one_n $$ so there exists a vector $\gamma \in \mathbb R^p$ such that $X\gamma = \one_n$, i.e. $\one \in \mathcal C(X)$.

$\square$

Result 2: if $\one \in \mathcal C(X)$ then there is at most one $c$ such that $\tilde X$ is reduced rank.

Pf: if $\one_n \in \mathcal C(X)$ then there is some non-zero $\alpha \in \mathbb R^p$ with $X\alpha = \one_n$. By $X$ being full rank this $\alpha$ is unique.

Case I: $\alpha^T\one_p \neq 0$. This lets us do $$ X\alpha - \one = X\alpha + \left(\frac{-1}{\alpha^T \one_p}\right)\one_n \one_p^T\alpha= (X + c\one_n\one_p^T)\alpha = 0 $$ for $c = \frac{-1}{\alpha^T \one_p}$.

Now for uniqueness: if we are to have any chance of making $\tilde X$ reduced rank, we need a nonzero $\gamma$ with $X\gamma \propto \one$, otherwise the rank-1 term can't cancel $X\gamma$. We can produce a $\gamma$ such that $X\gamma = d\one$ for any $d \in \mathbb R$ (we take $d\neq 0$, since $d = 0$ forces $\gamma=0$). If we do this, then the corresponding calculation for $c$ is $$ X\gamma - d\one = X\gamma + \left(\frac{-d}{\gamma^T\one}\right)\one_n\one_p^T\gamma = 0 $$ so $c =\frac{-d}{\gamma^T\one}$. But $X\gamma = d\one=d(X\alpha) \implies \gamma = d\alpha$ by full column rank, so $c = \frac{-d}{d\,\alpha^T\one} = \frac{-1}{\alpha^T\one}$ and there is just a single $c$ that works. Thus if $\one \in \mathcal C(X)$ and $\alpha^T\one \neq 0$ we can find a $c$ that makes $\tilde X$ low rank, but there's just one such $c$, so a "random" $c$ is very unlikely to make this happen.

Case II: $\alpha^T\one_p = 0$. Again we'll try to find a nonzero $\gamma$ with $\tilde X\gamma=0$, so as before we'll have to take $\gamma = d\alpha$ for some $d \neq 0$. Assuming we have such a $\gamma$ then $$ \tilde X\gamma = X\gamma + c\one_n\one_p^T\gamma = dX\alpha + cd\,\one_n\one_p^T\alpha = d\one_n \neq 0 $$ (using $X\alpha = \one_n$ and $\one_p^T\alpha = 0$), so in this special case there is no way to make $\tilde X$ reduced rank.

$\square$
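Result 2 (Case I) can be illustrated numerically: solve $X\alpha = \one_n$, form $c = -1/(\alpha^T\one_p)$, and compare the rank at that value with the rank at a few other constants. This is only a sketch, assuming NumPy; the matrix `X`, the test constants, and the explicit rank tolerance are choices made for this example.

```python
import numpy as np

# Hypothetical full-column-rank matrix with 1 in its column space
# (the two columns sum to the all-ones vector) but no constant column.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0]])
n, p = X.shape
ones_n = np.ones(n)

# Solve X alpha = 1_n; the solution is unique because X has full column rank.
alpha, *_ = np.linalg.lstsq(X, ones_n, rcond=None)
in_column_space = np.allclose(X @ alpha, ones_n)

# Case I: alpha^T 1_p != 0, so the single rank-reducing constant is -1 / (alpha^T 1_p).
c_star = -1.0 / alpha.sum()

# An explicit tolerance keeps the rank decision robust to rounding in c_star.
tol = 1e-10
rank_at_c_star = np.linalg.matrix_rank(X + c_star, tol=tol)
ranks_elsewhere = [np.linalg.matrix_rank(X + c, tol=tol)
                   for c in (-2.0, -1.0, 0.5, 2.0)]
print(c_star, rank_at_c_star, ranks_elsewhere)
```

Here $\alpha = (1, 1)^T$, so the critical constant is $-1/2$; every other constant tried leaves the rank at 2.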

So ultimately it's all about the column space rather than the individual vectors in $X$. If $\one \in \mathcal C(X)$ it's possible to get $\tilde X$ reduced rank, like in my example at the beginning with $c=2$, but in that case this is in fact the only such $c$ that works, so if $c$ is not carefully chosen we probably don't need to worry.

Here's an example where there's no such $c$: take $$ X = \left(\begin{array}{cc} 1&0 \\ 1&0 \\ 0&-1 \\ 0&-1\end{array}\right) $$ and note that $\one \in \mathcal C(X)$, and the way to get it is $X\alpha$ with $\alpha = {1\choose -1}$. Thus $\alpha^T\one = 0$. There's no way to make this matrix low rank by adding a constant to it. If we add $-1$ then we eliminate the top half of the first column, but we also change its lower half, and the rank is preserved. The same happens for any other constant.
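A quick scan over many constants is consistent with this claim, although it is not a proof. The sketch assumes NumPy; the grid of constants and the rank tolerance are arbitrary choices for illustration.

```python
import numpy as np

# The 4 x 2 matrix from the example above.
X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, -1.0],
              [0.0, -1.0]])

# Coordinates of the all-ones vector in the column space of X.
alpha, *_ = np.linalg.lstsq(X, np.ones(4), rcond=None)
alpha_sum = alpha.sum()   # alpha^T 1_p, expected to be (numerically) zero

# Rank of X + c over a grid of constants, including c = -1 and c = 0.
cs = np.linspace(-5.0, 5.0, 101)
ranks = [np.linalg.matrix_rank(X + c, tol=1e-10) for c in cs]
print(alpha, alpha_sum, set(int(r) for r in ranks))
```

One way to see why the scan never finds a rank drop: the $2 \times 2$ submatrix formed by rows 1 and 3 of $X + c$ has determinant $(1+c)(c-1) - c^2 = -1$ for every $c$.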

Thanks for adding to the discussion. So in the case of full rank $X$ with the column of 1's (and thus $\mathbf{1} \in \mathcal{C}(X)$), there is only one $c$ that would make the transformation $X + c$ reduced rank? $c= -1$, in particular. — FWL, May 17 '18 at 02:52
@FWL yeah (unless I made a mistake, although i went through that proof a couple of times so hopefully not), if the first column for example is all ones then $\alpha=e_1$ is the coordinate of $\mathbf 1$ in $\mathcal C(X)$ so $c=-1/e_1^T\mathbf 1 = -1$ is the only way to drop a rank — jld, May 17 '18 at 02:57
@MartijnWeterings i think if $\mathbf 1$ is a column then we'd have $\alpha=e_1$ so $c$ would have to be negative since $\alpha^T\mathbf 1 > 0$, so i think even though $\one \in \mathcal C(X)$ you are still correct that adding a positive constant can't decrease the rank in this case. — jld, May 18 '18 at 01:48
