Replace a variable in a linear model with a new variable with the same covariance that yields the same least-squares parameter estimate

Question

Consider the following linear model, which explains the relation between a $d$-dimensional set of explanatory variables $\{\mathbf{X},D \}$ and a 1-dimensional effect variable $Y$ ($\{\mathbf{X},D \}$ is an $n \times d$ matrix that contains $n$ observations of $d-1$ variables in $\mathbf{X}$ and $n$ observations of one additional variable $D$):

$$Y = \{\mathbf{X},D \}\beta + \varepsilon \hspace{50pt}[1]$$

The $d$-dimensional least-squares estimated parameter vector is called $\hat\beta$.

I want to draw a random variable $D_s$ as a replacement for $D$ which fulfills the following two conditions:

1. $Cov(\mathbf{X},D_s) = Cov(\mathbf{X},D)$
2. If I replace $D$ with $D_s$ in model [1] above, $$Y = \{\mathbf{X},D_s \}\gamma + \varepsilon, \hspace{50pt}[2]$$ the least-squares estimate $\hat\gamma$ should be equal to $\hat\beta$, i.e. $$\hat\gamma = \hat\beta.$$ In particular, the element of the least-squares parameter estimate $\hat\gamma$ corresponding to variable $D_s$ should be equal to the element of $\hat\beta$ corresponding to $D$ in model [1] above.

The first condition relates to @whuber's algorithm for drawing a random variable that has a given covariance with a given set of random variables: https://stats.stackexchange.com/a/313138/3277

However, this algorithm does not ensure that the second condition is fulfilled. Is there a way to update that algorithm (or an altogether different way) to draw a random variable that fulfills conditions 1. and 2. above, i.e. $\beta_D = \gamma_{D_s}$?
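
To make condition 2 concrete, it can be written in terms of the normal equations. This is only a sketch: it assumes $\{\mathbf{X},D_s\}$ has full column rank, and the split $\hat\beta = (\hat\beta_X, \hat\beta_D)$ is notation introduced here for the coefficients on $\mathbf{X}$ and on $D$. Under that assumption, $\hat\gamma = \hat\beta$ holds exactly when $\hat\beta$ solves the normal equations of model [2]:

$$\mathbf{X}^\top\mathbf{X}\,\hat\beta_X + \mathbf{X}^\top D_s\,\hat\beta_D = \mathbf{X}^\top Y, \qquad D_s^\top\mathbf{X}\,\hat\beta_X + D_s^\top D_s\,\hat\beta_D = D_s^\top Y.$$

Since $\hat\beta$ already solves the same equations with $D$ in place of $D_s$, a sufficient (not necessary) set of requirements is

$$\mathbf{X}^\top D_s = \mathbf{X}^\top D, \qquad D_s^\top Y = D^\top Y, \qquad D_s^\top D_s = D^\top D.$$

If $\mathbf{X}$ contains a constant column, the first of these also implies that the sample covariances in condition 1 agree.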

According to the formulas, it looks like the vectors of errors are the same in both equations. Correct me if I am wrong. — javierazcoiti, Sep 18 '20 at 11:34
You can always add a vector to $D$ that is orthogonal to all the columns of $X$ and to $Y$ and then rescale the result appropriately. — whuber, Sep 18 '20 at 11:58
@javierazcoiti ideally the vectors $\varepsilon$ should be the same, yes — pf11, Oct 07 '20 at 15:25
@whuber, can you please be more specific about your proposal? Adding a vector to $D$ that is orthogonal to all columns in $\mathbf{X}$ and $Y$ (call that vector $D'$) wouldn't change anything about Step 1, and therefore would not change the element in $\gamma$ corresponding to $D_s$ (it would still be zero, since $D_s$ is not correlated with $Y$ by construction). It seems that I would need to add a vector to $D_s$ that is correlated with $Y$ conditional on $\mathbf{X}$ but not correlated with $\mathbf{X}$, and then rescale such that $\beta_D = \gamma_{D_s}$. Frankly, I don't see how to do that. — pf11, Oct 08 '20 at 11:04
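
One possible reading of the comment above, sketched in NumPy: start from the observed $D$ itself (not from a fresh draw that is uncorrelated with $Y$), add a random vector `e` that is orthogonal to every column of $\mathbf{X}$ and to $Y$, and pick the scale `c` so that $D_s = D + c\,e$ has the same squared length as $D$. This makes $D_s$ a reflection of $D$, so its cross-products with $\mathbf{X}$, with $Y$ and with itself are unchanged. The function name `replace_regressor`, the simulated data and the intercept column in `X` are assumptions of this sketch, not part of the original question.

```python
import numpy as np

def replace_regressor(X, D, Y, rng):
    """Return D_s = D + c*e with the same cross-products with X, Y and itself as D.

    Assumption of this sketch: X already contains an intercept column, so that
    equal cross-products with X also mean equal sample covariances.
    """
    n = X.shape[0]
    # Orthonormal basis of span([X, Y]) from a reduced QR decomposition.
    Q, _ = np.linalg.qr(np.column_stack([X, Y]))
    # Random direction with its component inside span([X, Y]) removed,
    # so that X'e = 0 and Y'e = 0 (up to floating-point error).
    z = rng.standard_normal(n)
    e = z - Q @ (Q.T @ z)
    # Choose c so that (D + c*e)'(D + c*e) = D'D, i.e. c*e'e + 2*e'D = 0.
    c = -2.0 * (e @ D) / (e @ e)
    return D + c * e

# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
D = 0.5 * X[:, 1] + rng.standard_normal(n)
Y = X @ np.array([1.0, 2.0, -1.0]) + 0.7 * D + rng.standard_normal(n)

D_s = replace_regressor(X, D, Y, rng)

# Least-squares fits of model [1] (with D) and model [2] (with D_s).
beta_hat, *_ = np.linalg.lstsq(np.column_stack([X, D]), Y, rcond=None)
gamma_hat, *_ = np.linalg.lstsq(np.column_stack([X, D_s]), Y, rcond=None)

print(np.allclose(beta_hat, gamma_hat))   # coefficient vectors agree
print(np.allclose(X.T @ D, X.T @ D_s))    # cross-products with X agree
```

In this construction the residual vectors of the two fits are not identical (they differ by $\hat\beta_D\, c\, e$), although their sums of squares coincide. The match of $\hat\gamma$ and $\hat\beta$ is an in-sample property of the realized vectors, not a statement about the population distribution of $D_s$.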


0 Answers