
I am estimating a linear regression model with dependent variable $y$ and $k$ explanatory variables $x_1, \ldots, x_k$: $$y=\beta_0+x_1\beta_1+ \ldots+ x_k\beta_k + \epsilon.$$ In the estimation I want to impose the following type of constraint: $$\beta_i \geq 0, \quad i=1, \ldots, k, \quad \text{and} \quad \exists \ell \text{ such that } \beta_\ell>0.$$

If I run OLS under the constraints $\beta_i \geq 0, \quad i=1, \ldots, k$, which is easy to do in any software, then depending on how I simulate the data I can easily end up with the estimates $\widehat{\beta}_1=\ldots = \widehat{\beta}_k=0$, which is not what I am looking for. I am looking for a good fit in which at least one of the coefficients is strictly positive.
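
For concreteness, here is a minimal sketch of the non-negativity-constrained fit on simulated data, using `scipy.optimize.nnls`; the data-generating coefficients and the demeaning step (so that the intercept stays unconstrained) are illustrative assumptions, not part of the setup above:

```python
# Minimal sketch: OLS under beta_i >= 0 via nonnegative least squares.
# The simulated data and true coefficients are illustrative only.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=n)

# Demean so the intercept beta_0 is absorbed and left unconstrained;
# nnls forces every coefficient of its design matrix to be >= 0.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

beta_hat, resid_norm = nnls(Xc, yc)
print(beta_hat)  # can come out identically zero for unfavorable data
```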

I was thinking about introducing some very small $\delta>0$ (using $\delta$ rather than $\epsilon$, which already denotes the error term) and instead considering the constraints $$\beta_i \geq \delta, \quad i=1, \ldots, k,$$ but this seems too restrictive, as it requires all the coefficients to be strictly positive.
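
That variant is easy to try with a bounded least-squares solver; the sketch below uses SciPy's `lsq_linear` with an arbitrary $\delta$ (both choices are illustrative assumptions):

```python
# Sketch: every coefficient bounded below by a small delta > 0.
# delta, the data, and the true coefficients are illustrative.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=200)

delta = 1e-6
res = lsq_linear(X, y, bounds=(delta, np.inf))  # beta_i >= delta for all i
print(res.x)
```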

I can impose $\beta_i \geq \delta$ for one $i$ only, with non-negativity constraints on all the others, but then I either have to pick which $i$ to single out, or do this for every $i$ and pick the best fit among all $i$ (sketched below).
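
A sketch of that enumeration, again assuming `lsq_linear` and an illustrative $\delta$:

```python
# Sketch: for each i, fit with beta_i >= delta and beta_j >= 0 (j != i),
# then keep the best-fitting solution among the k candidates.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
k = 3
X = rng.normal(size=(200, k))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=200)

delta = 1e-6
best = None
for i in range(k):
    lb = np.zeros(k)
    lb[i] = delta  # force beta_i >= delta; the rest only need to be >= 0
    res = lsq_linear(X, y, bounds=(lb, np.inf))
    if best is None or res.cost < best.cost:
        best = res

print(best.x)  # at least one coefficient is >= delta by construction
```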

It seems to me (and I may be wrong, of course) that there could be a better approach to this.

  • But what if the best fit for the given data set actually is at $\hat{\beta}_1 = \ldots = \hat{\beta}_k = 0$? – jbowman Nov 25 '17 at 23:27
  • @jbowman I didn't say "best" fit as it is not well defined with strict inequality constraints. I said "good" fit. I only mention "best" fit when all the constraints are non-strict. – Alik Nov 25 '17 at 23:28
  • 1
    Yes, but all but an arbitrary one of your constraints are not strict inequalities. If you change the single $>$ constraint to be a $\geq$ constraint, and the optimal solution is actually at $\hat{\beta}_1 \dots = 0$, why would you want to move to a worse solution just so you can have one nonzero coefficient? This is not a rhetorical question. It is hard to see why you are asking the question, given that you can, as you may have realized, simply set $\beta_i \geq \epsilon$ and set $\epsilon = 10^{-99999}$, thereby satisfying your constraint and effectively having $\hat{\beta}_1 \dots = 0$. – jbowman Nov 25 '17 at 23:36
  • @jbowman Suppose I was interested in testing the null hypothesis that, in the regression function, the effect of each individual regressor on the dependent variable is non-negative and that the conditional mean $E[y|x_1, \ldots, x_k]$ (I impose $E[\epsilon|x_1, \ldots, x_k]=0$) does in fact vary with $x=(x_1, \ldots, x_k)$? – Alik Nov 25 '17 at 23:42
  • 2
    You might want to look at https://stats.stackexchange.com/questions/90143/should-h-0-be-specified-as-an-equality-or-an-inequality-for-one-sided-tests to see why "non-negative" won't work as a null hypothesis; you'll have to extrapolate a little from the text. I'm sure there's a better explanation out there somewhere, though! – jbowman Nov 26 '17 at 00:01
  • @jbowman Do you mean that there is no way to test the hypothesis that I mentioned? – Alik Nov 26 '17 at 00:31
  • 1
    Look at this as well: https://stats.stackexchange.com/questions/140658/null-hypothesis-with-a-strict-inequality. There the test is of the form $H_o: \beta > 0$ vs $H_A: \beta \leq 0$, but in your case, due to the nonnegativity constraint, it's more like $H_o: \beta > 0$ vs $H_A: \beta = 0$. If you read through the question and answer in this light you may see why the test would never reject the null: basically, the least favorable value of $\beta = 0$, which also happens to be the value of $\beta$ under the alternative. – jbowman Nov 26 '17 at 02:21

0 Answers