5

Suppose you had a method for estimating the population covariance of a vector-valued random variable given observations of that random variable, say $f(Z) \rightarrow C$, where the rows of $Z$ are observations of the random variable. Can one abuse this process to perform a least squares regression $y = x^T\beta + \epsilon$, for $n$-dimensional vector $x$? The idea: given a vector of observations of $y$, call it $Y$, and a matrix of paired observations of $x$, call it $X$, form $Z = [X\; Y]$ (concatenate $Y$ to $X$ as an extra column), compute $C = f(Z)$, and then let $\hat{\beta} = C_{1:n,1:n}^{-1} C_{1:n,n+1}$.

A few questions:

  1. Will this work under optimistic conditions? (A simple simulation in Matlab suggests it does: the precision is not great, but the results agree to about 4 significant figures.)
  2. Is this a known trick? If so, does it have a name I can search for, or is it so trivial that it doesn't require a name?
  3. Most importantly, if $f$ can deal with input where some values are missing (say, MCAR: missing completely at random), under what conditions will this technique behave reasonably for regression with missing values?

**Edit:** I am assuming that $x$ is drawn from a zero-mean process and that the regression has no intercept term.
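For concreteness, here is a minimal sketch of the trick in NumPy (standing in for the Matlab simulation), taking $f$ to be the ordinary sample covariance `np.cov`. The coefficients and dimensions below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5000, 3                          # m observations, n-dimensional x
beta = np.array([1.5, -2.0, 0.5])       # arbitrary true coefficients

X = rng.standard_normal((m, n))         # zero-mean x, no intercept term
Y = X @ beta + 0.1 * rng.standard_normal(m)

Z = np.column_stack([X, Y])             # concatenate Y as an extra column
C = np.cov(Z, rowvar=False)             # stand-in for f(Z)

# beta_hat = C_{1:n,1:n}^{-1} C_{1:n,n+1}
beta_hat = np.linalg.solve(C[:n, :n], C[:n, n])
```

With this setup `beta_hat` recovers `beta` to several significant figures, consistent with what the Matlab experiment showed.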

shabbychef
  • 10,388
  • 7
  • 50
  • 93

1 Answer

3

Your "trick" is the solution to the so-called normal equations for multiple regression, which is the usual least-squares answer in multiple regression.
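To make the connection explicit (assuming $f$ is the usual sample covariance, with $m$ observations and the columns of $Z$ mean-centered, which your zero-mean edit grants):

$$C = \frac{1}{m-1} Z^\top Z = \frac{1}{m-1}\begin{bmatrix} X^\top X & X^\top Y \\ Y^\top X & Y^\top Y \end{bmatrix},$$

so that

$$\hat{\beta} = C_{1:n,1:n}^{-1}\, C_{1:n,n+1} = (X^\top X)^{-1} X^\top Y,$$

since the $\frac{1}{m-1}$ factors cancel; that is exactly the solution of the normal equations $X^\top X \hat{\beta} = X^\top Y$.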

As for missing data: what $f$ do you have in mind that knows how to estimate $C$ in that case?

There are methods, such as imputation, for filling in missing values. Perhaps Little and Rubin (*Statistical Analysis with Missing Data*) can give further information on the issues involved.
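One concrete choice of $f$ that tolerates MCAR data is a pairwise-complete sample covariance: each entry $(i, j)$ is computed from only the rows where both columns $i$ and $j$ are observed. This is a sketch, not a recommendation; note that a pairwise-complete matrix need not be positive semidefinite, which can make the inversion step misbehave when missingness is heavy:

```python
import numpy as np

def pairwise_cov(Z):
    """Pairwise-complete covariance: entry (i, j) uses only rows where
    both column i and column j are observed (non-NaN)."""
    _, k = Z.shape
    C = np.empty((k, k))
    for i in range(k):
        for j in range(i, k):
            ok = ~np.isnan(Z[:, i]) & ~np.isnan(Z[:, j])
            zi = Z[ok, i] - Z[ok, i].mean()
            zj = Z[ok, j] - Z[ok, j].mean()
            C[i, j] = C[j, i] = zi @ zj / (ok.sum() - 1)
    return C

# Hypothetical MCAR experiment: knock out 10% of entries at random.
rng = np.random.default_rng(1)
m, n = 20000, 2
beta = np.array([1.0, -0.5])
X = rng.standard_normal((m, n))
Y = X @ beta + 0.1 * rng.standard_normal(m)
Z = np.column_stack([X, Y])
Z[rng.random(Z.shape) < 0.1] = np.nan   # MCAR missingness

C = pairwise_cov(Z)
beta_hat = np.linalg.solve(C[:n, :n], C[:n, n])
```

Under MCAR with plenty of data, each pairwise entry is a consistent estimate of the corresponding population covariance, so the plug-in $\hat{\beta}$ remains consistent; under MAR or MNAR missingness that guarantee breaks down.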

ronaf
  • 371
  • 2
  • 6