How to fit a linear model (such as lm()
in R) of the form:
$$y = a_1 x_1+ a_2 x_2+ a_3 x_3 + a_4 x_4 + a_5 x_5 + \epsilon$$
with the constraint that : $a_2a_5 = a_3a_4$. There are no other conditions on the coefficients.
How to fit a linear model (such as lm()
in R) of the form:
$$y = a_1 x_1+ a_2 x_2+ a_3 x_3 + a_4 x_4 + a_5 x_5 + \epsilon$$
with the constraint that : $a_2a_5 = a_3a_4$. There are no other conditions on the coefficients.
You could just use a solver for constrained minimization, there are multiple ones available in R. Or we could solve for the restrictions. You have $a_2 a_5= a_3 a_4$, assuming $a_2\not = 0$ we find that $a_5=\frac{a_3 a_4}{a_2}$. we can incorporate that into the model and use nonlinear least squares. First we simulate some data:
a_1 <- 1.0
a_2 <- 1.0
a_3 <- 1.0
a_4 <- 1.0
a_5 <- 1.0
set.seed(12345)
N <- 100
x_1 <- runif(N)
x_2 <- runif(N)
x_3 <- runif(N)
x_4 <- runif(N)
x_5 <- runif(N)
y <- a_1*x_1 + a_2*x_2 + a_3*x_3 + a_4*x_4 + a_5*x_5 + rnorm(N,0,sd=2)
mydata <- data.frame(x_1,x_2,x_3,x_4,x_5,y)
Then we estimate with nls():
mod_1 <- nls(y ~ a_1*x_1+a_2*x_2+a_3*x_3+a_4*x_4+(a_3*a_4/a_2)*x_5, data=mydata,
start=list(a_1=0.5,a_2=0.5,a_3=0.5,a_4=0.5), trace=TRUE)
mod_1
co <- coef(mod_1)
co[3]*co[4]/co[2]
Then, in the case $a_2=0$, we must solve separately, and then compare residual sum squares (or maybe AIC) to choose the best (sub)model:
### Then we need a separate model for the case a_2=0: This is in reality three cases:
## a__2=a__4=0
## a_2=a_3=0
## a_2=a_3=a_4=0
### which gives three different linear models:
mod_01 <- lm(y~0+x_1+x_4+x_5,data=mydata)
mod_02 <- lm(y~0+x_1+x_3+x_5,data=mydata)
mod_03 <- lm(y ~0+x_1+x_5, data=mydata)
mod_01
mod_02
mod_03
with output:
mod_01 <- lm(y~0+x_1+x_4+x_5,data=mydata)
mod_02 <- lm(y~0+x_1+x_3+x_5,data=mydata)
mod_03 <- lm(y ~0+x_1+x_5, data=mydata)
mod_01
Call:
lm(formula = y ~ 0 + x_1 + x_4 + x_5, data = mydata)
Coefficients:
x_1 x_4 x_5
1.474 1.779 2.227
mod_02
Call:
lm(formula = y ~ 0 + x_1 + x_3 + x_5, data = mydata)
Coefficients:
x_1 x_3 x_5
1.884 1.113 2.437
mod_03
Call:
lm(formula = y ~ 0 + x_1 + x_5, data = mydata)
Coefficients:
x_1 x_5
2.421 2.907
I will leave it for you to find the best submodel, and then choose the parameter estimates from that one.