
I am estimating a linear regression model with dependent variable $y$ and $k$ explanatory variables $x_1, \ldots, x_k$: $$y=\beta_0+x_1\beta_1+ \ldots+ x_k\beta_k + \epsilon.$$ In the estimation I want to impose the following type of constraint: $$\beta_i \geq 0, \quad i=1, \ldots, k, \quad \text{and} \quad \exists \ell \text{ such that } \beta_\ell>0.$$

If I run OLS under the constraints $\beta_i \geq 0, \quad i=1, \ldots, k$, which is easy to do in any software, then depending on how I simulate the data I can easily end up with the estimates $\widehat{\beta}_1=\ldots = \widehat{\beta}_k=0$, which is not what I am looking for. I am looking for a good fit in which at least one of the coefficients is strictly positive.
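
For concreteness, here is a minimal sketch of the non-negativity-constrained fit on simulated data, using `scipy.optimize.nnls`; the data-generating coefficients and the demeaning step (so that the intercept stays unconstrained) are illustrative assumptions, not part of the setup above:

```python
# Minimal sketch: OLS under beta_i >= 0 via nonnegative least squares.
# The simulated data and true coefficients are illustrative only.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=n)

# Demean so the intercept beta_0 is absorbed and left unconstrained;
# nnls forces every coefficient of its design matrix to be >= 0.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

beta_hat, resid_norm = nnls(Xc, yc)
print(beta_hat)  # can come out identically zero for unfavorable data
```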

I was thinking about introducing some very small $\delta>0$ (using $\delta$ rather than $\epsilon$, which already denotes the error term) and instead considering the constraints $$\beta_i \geq \delta, \quad i=1, \ldots, k,$$ but this seems too restrictive, as it requires all the coefficients to be strictly positive.
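
That variant is easy to try with a bounded least-squares solver; the sketch below uses SciPy's `lsq_linear` with an arbitrary $\delta$ (both choices are illustrative assumptions):

```python
# Sketch: every coefficient bounded below by a small delta > 0.
# delta, the data, and the true coefficients are illustrative.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=200)

delta = 1e-6
res = lsq_linear(X, y, bounds=(delta, np.inf))  # beta_i >= delta for all i
print(res.x)
```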

I can impose $\beta_i \geq \delta$ for one $i$ only, with non-negativity constraints on all the others, but then I either have to pick which $i$ to single out, or do this for every $i$ and pick the best fit among all $i$ (sketched below).
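
A sketch of that enumeration, again assuming `lsq_linear` and an illustrative $\delta$:

```python
# Sketch: for each i, fit with beta_i >= delta and beta_j >= 0 (j != i),
# then keep the best-fitting solution among the k candidates.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
k = 3
X = rng.normal(size=(200, k))
y = X @ np.array([0.5, 0.0, 1.2]) + rng.normal(size=200)

delta = 1e-6
best = None
for i in range(k):
    lb = np.zeros(k)
    lb[i] = delta  # force beta_i >= delta; the rest only need to be >= 0
    res = lsq_linear(X, y, bounds=(lb, np.inf))
    if best is None or res.cost < best.cost:
        best = res

print(best.x)  # at least one coefficient is >= delta by construction
```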

It seems to me (and I may be wrong, of course) that there could be a better approach to this.

  • But what if the best fit for the given data set actually is at $\hat{\beta}_1 = \ldots = \hat{\beta}_k = 0$? – jbowman Nov 25 '17 at 23:27
  • @jbowman I didn't say "best" fit as it is not well defined with strict inequality constraints. I said "good" fit. I only mention "best" fit when all the constraints are non-strict. – Alik Nov 25 '17 at 23:28
  • 1
    Yes, but all but an arbitrary one of your constraints are not strict inequalities. If you change the single $>$ constraint to be a $\geq$ constraint, and the optimal solution is actually at $\hat{\beta}_1 \dots = 0$, why would you want to move to a worse solution just so you can have one nonzero coefficient? This is not a rhetorical question. It is hard to see why you are asking the question, given that you can, as you may have realized, simply set $\beta_i \geq \epsilon$ and set $\epsilon = 10^{-99999}$, thereby satisfying your constraint and effectively having $\hat{\beta}_1 \dots = 0$. – jbowman Nov 25 '17 at 23:36
  • @jbowman Suppose I was interested in testing the null hypothesis that, in the regression function, the effect of each individual regressor on the dependent variable is non-negative and that the conditional mean $E[y|x_1, \ldots, x_k]$ (I impose $E[\epsilon|x_1, \ldots, x_k]=0$) does in fact vary with $x=(x_1, \ldots, x_k)$? – Alik Nov 25 '17 at 23:42
  • 2
    You might want to look at https://stats.stackexchange.com/questions/90143/should-h-0-be-specified-as-an-equality-or-an-inequality-for-one-sided-tests to see why "non-negative" won't work as a null hypothesis; you'll have to extrapolate a little from the text. I'm sure there's a better explanation out there somewhere, though! – jbowman Nov 26 '17 at 00:01
  • @jbowman Do you mean that there is no way to test the hypothesis that I mentioned? – Alik Nov 26 '17 at 00:31
  • 1
    Look at this as well: https://stats.stackexchange.com/questions/140658/null-hypothesis-with-a-strict-inequality. There the test is of the form $H_o: \beta > 0$ vs $H_A: \beta \leq 0$, but in your case, due to the nonnegativity constraint, it's more like $H_o: \beta > 0$ vs $H_A: \beta = 0$. If you read through the question and answer in this light you may see why the test would never reject the null: basically, the least favorable value of $\beta = 0$, which also happens to be the value of $\beta$ under the alternative. – jbowman Nov 26 '17 at 02:21

0 Answers