
How to fit a linear model (such as lm() in R) of the form:

$$y = a_1 x_1+ a_2 x_2+ a_3 x_3 + a_4 x_4 + a_5 x_5 + \epsilon$$

with the constraint that $a_2 a_5 = a_3 a_4$? There are no other conditions on the coefficients.

kjetil b halvorsen
ahr1729
  • Can you explain to us (as an edit to the original post) where that constraint comes from? As for fitting it, I would try (in R, for instance) formulating it as a constrained optimization problem; how to do that could depend on whether there are other restrictions (like positivity) on (some of) the coefficients. If the coefficients are positive, for instance, solve to get $a_5=a_3a_4/a_2$ and use `nls()`. – kjetil b halvorsen Nov 30 '16 at 08:49
  • @kjetil: Many thanks. There are no other constraints on them; the coefficients can be either positive or negative. Can you please give a further hint on using nls() with the constraint? – ahr1729 Nov 30 '16 at 14:40
  • The reformulation above cannot be used if $a_2=0$, so just treat that as a special case! – kjetil b halvorsen Nov 30 '16 at 15:35
  • See http://stats.stackexchange.com/search?q=regression+constraint+coefficient for many related posts. – whuber Nov 30 '16 at 18:58

1 Answer


You could just use a solver for constrained minimization; several are available in R. Alternatively, we can solve out the restriction: from $a_2 a_5 = a_3 a_4$, assuming $a_2 \not= 0$, we get $a_5=\frac{a_3 a_4}{a_2}$. We can incorporate that into the model and use nonlinear least squares. First we simulate some data:

a_1 <- 1.0
a_2 <- 1.0
a_3 <- 1.0
a_4 <- 1.0
a_5 <- 1.0
set.seed(12345)
N <- 100
x_1 <- runif(N)
x_2 <- runif(N)
x_3 <- runif(N)
x_4 <- runif(N)
x_5 <- runif(N)
y <- a_1*x_1 + a_2*x_2 + a_3*x_3 + a_4*x_4 + a_5*x_5 + rnorm(N,0,sd=2)

mydata <- data.frame(x_1,x_2,x_3,x_4,x_5,y)
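As an aside, the general-solver route mentioned at the top can be sketched with base `optim()`, minimizing the residual sum of squares plus a quadratic penalty on the constraint violation. (This is an illustration only, not part of the original answer; the penalty weight `1e6` is an ad-hoc choice, and a serious application would increase it iteratively or use a dedicated constrained solver.)

```r
## Sketch: constrained minimization via a quadratic penalty pushing
## a_2*a_5 - a_3*a_4 towards zero.  The weight 1e6 is an ad-hoc choice.
penalized_rss <- function(a, d) {
  fit <- a[1]*d$x_1 + a[2]*d$x_2 + a[3]*d$x_3 + a[4]*d$x_4 + a[5]*d$x_5
  sum((d$y - fit)^2) + 1e6 * (a[2]*a[5] - a[3]*a[4])^2
}
opt <- optim(rep(0.5, 5), penalized_rss, d = mydata, method = "BFGS",
             control = list(maxit = 1000))
opt$par                                        # five coefficient estimates
opt$par[2]*opt$par[5] - opt$par[3]*opt$par[4]  # should be close to zero
```

Unlike the `nls()` reparametrization below, this handles the $a_2=0$ case without special treatment, at the price of only approximately satisfying the constraint.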

Then we estimate with nls():

mod_1  <-  nls(y ~ a_1*x_1+a_2*x_2+a_3*x_3+a_4*x_4+(a_3*a_4/a_2)*x_5, data=mydata,
               start=list(a_1=0.5,a_2=0.5,a_3=0.5,a_4=0.5), trace=TRUE)
mod_1

co <-  coef(mod_1)
co[3]*co[4]/co[2]
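As a quick sanity check (not in the original answer): the constrained fit can never beat the unrestricted least-squares fit on the same data, so its residual sum of squares should be at least as large.

```r
## Unrestricted fit (no intercept, matching the model above) for comparison;
## the constrained RSS from mod_1 must be >= the unrestricted RSS.
mod_free <- lm(y ~ 0 + x_1 + x_2 + x_3 + x_4 + x_5, data = mydata)
sum(resid(mod_free)^2)   # unrestricted RSS
sum(resid(mod_1)^2)      # constrained RSS, at least as large
```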

Then, for the case $a_2=0$, we must fit separate models and compare residual sums of squares (or perhaps AIC) to choose the best (sub)model:

### We need separate models for the case a_2 = 0.  The constraint then
### forces a_3*a_4 = 0, so this is really three sub-cases:
##  a_2 = a_3 = 0
##  a_2 = a_4 = 0
##  a_2 = a_3 = a_4 = 0
### which give three different linear models (in the same order):
mod_01 <- lm(y ~ 0 + x_1 + x_4 + x_5, data = mydata)
mod_02 <- lm(y ~ 0 + x_1 + x_3 + x_5, data = mydata)
mod_03 <- lm(y ~ 0 + x_1 + x_5, data = mydata)
mod_01
mod_02
mod_03

with output:

 mod_01

Call:
lm(formula = y ~ 0 + x_1 + x_4 + x_5, data = mydata)

Coefficients:
  x_1    x_4    x_5  
1.474  1.779  2.227  

 mod_02

Call:
lm(formula = y ~ 0 + x_1 + x_3 + x_5, data = mydata)

Coefficients:
  x_1    x_3    x_5  
1.884  1.113  2.437  

 mod_03

Call:
lm(formula = y ~ 0 + x_1 + x_5, data = mydata)

Coefficients:
  x_1    x_5  
2.421  2.907  

I will leave it to you to find the best submodel, and then take the parameter estimates from that one.
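The comparison left to the reader can be sketched as follows. For `lm` fits, `deviance()` returns the residual sum of squares, and `AIC()` works for both `lm` and `nls` objects; since the $a_2=0$ submodels are not nested inside the `nls` parametrization, an information criterion is the safer basis for comparison.

```r
## Residual sums of squares and AIC for the constrained nls fit (mod_1)
## and the three a_2 = 0 submodels; smaller AIC is preferred.
rss <- c(mod_1  = sum(resid(mod_1)^2),
         mod_01 = deviance(mod_01),
         mod_02 = deviance(mod_02),
         mod_03 = deviance(mod_03))
aic <- c(mod_1  = AIC(mod_1),
         mod_01 = AIC(mod_01),
         mod_02 = AIC(mod_02),
         mod_03 = AIC(mod_03))
rss
aic
names(which.min(aic))   # model preferred by AIC
```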

kjetil b halvorsen
  • Thank you so much, kjetil, this is very helpful. There has to be a non-zero constraint on all the coefficients in the problem I am dealing with. Thank you once again :) – ahr1729 Dec 01 '16 at 05:29