Does anyone have a reference to an explicit formulation of multiple regression in which the bias term is taken out and treated separately? I would be especially interested if ridge regression and/or PRESS residuals are also treated.
- Please tell us what you mean by the "bias term." That's not a standard part of any kind of least squares procedure, as far as I know, but maybe it goes by another name. – whuber Aug 28 '18 at 19:26
- @whuber In simple least squares, $y_i = \alpha + \beta x_i + \epsilon_i$, where $\alpha$ and $\beta$ are parameters, $y_i$ and $x_i$ are observations, and $\epsilon_i$ is an error term. Here $\alpha$ would be the bias term, in some perspective. If I understand correctly, in the typical treatment of linear regression with multiple variables, the constant bias term is effectively folded into the feature matrix as a feature. – Yi Liu Aug 28 '18 at 20:03
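A minimal NumPy sketch (not part of the thread, added for illustration) of the two equivalent treatments of $\alpha$ described above: folding it into the design matrix as a column of ones versus estimating the slope from centered data and recovering the intercept from the sample means:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=50)  # true alpha=2, beta=3

# Approach 1: fold the intercept into the design matrix as a column of ones.
X = np.column_stack([np.ones_like(x), x])
alpha1, beta1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Approach 2: treat the intercept separately. Center x and y, estimate the
# slope from the centered data, then recover the intercept from the means.
xc, yc = x - x.mean(), y - y.mean()
beta2 = np.dot(xc, yc) / np.dot(xc, xc)
alpha2 = y.mean() - beta2 * x.mean()

# Both routes give identical estimates.
assert np.allclose([alpha1, beta1], [alpha2, beta2])
```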
- The usual names for $\alpha$ are "constant" or "intercept." Do you really mean "multivariate" least squares, where the response is a *vector*-valued variable, or are you referring to *multiple regression* in which there may be more than one explanatory ("independent") variable? Your question says one thing but your tags say another. – whuber Aug 28 '18 at 20:11
- @whuber Firstly, thank you for the correction on the terminology -- I was confused about the distinction. I have updated the tags to be more inclusive. Ideally, I would like to see the case where the data-by-feature matrix $X$ has multiple rows and multiple columns, where the intercept for each response variable is treated separately from the other features, and with multiple response variables -- so the intercepts form a vector. However, I will also settle for _multiple regression_ if no one has ever bothered writing out the multivariate case. – Yi Liu Aug 28 '18 at 21:05
- @whuber Also, I apologize for probably sounding stubborn, a result of my own ignorance. I probably don't fully appreciate what I am asking for -- I really am rather ignorant in this domain, and could use more of your help in rephrasing and reframing the question into something clearer. – Yi Liu Aug 28 '18 at 21:23
- The multivariate case is well studied. It is characterized by the need to specify the covariance matrix of the conditional response vector. (After all, if there are no associations among the responses, you might just as well carry out a set of multiple regressions, one per response.) But if you're going to engage in a multivariate regression, you will find the issue of including an intercept or not to be trivial, because it is subsumed as a special case: the intercept is merely another explanatory variable. Could you then please edit your post to describe the specific problem you face? – whuber Aug 28 '18 at 22:14
- Re the new phrasing: this is called "regression through the origin." See https://stats.stackexchange.com/search?q=regression+origin. The comments at the end of the answer at https://stats.stackexchange.com/a/130114/919 also illuminate this issue. – whuber Aug 29 '18 at 16:42
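On the ridge-regression part of the question: the standard way to keep the intercept out of the penalty is to center the data, solve the penalized problem without an intercept (i.e., through the origin), and then recover the intercept from the sample means. A hedged NumPy sketch of that treatment (illustrative, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 80, 4
X = rng.normal(size=(n, p))
beta = np.array([1.0, -1.0, 0.5, 2.0])
y = 5.0 + X @ beta + 0.1 * rng.normal(size=n)  # true intercept = 5

lam = 0.5
Xc = X - X.mean(axis=0)   # center features: removes the intercept from the fit
yc = y - y.mean()         # center response

# Ridge solution on centered data: (Xc'Xc + lam*I)^{-1} Xc'yc.
# Only the slopes are shrunk; the intercept is not penalized.
beta_hat = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

# Recover the (unpenalized) intercept from the means.
alpha_hat = y.mean() - X.mean(axis=0) @ beta_hat
```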