
I am converting a high-dimensional model to a lower-dimensional one by fitting a linear (parametric) model to a sliding window of the data and looking at how the parameter values evolve over time. I'm going from 6.3 million points to about 2500 values of 6 parameters.

Physics says the intercept should be a constant value, but when I use `lm` it moves around. I think the motion is caused by noise, and that it distorts the other parameter estimates. I would like to set the intercept to a known constant value.

How do I make a linear model in R that has a prescribed (non-zero) intercept?

Current code:

for (i in 1:(n-k)){

  fit <- lm(y ~ x1 + x2 + I(x3^2) + x4 + x5 + x6 , data=data[i:(i+k),])

  #STORE PARAMETERS INTO VARIABLES
...  #truncated for brevity
}

Code that doesn't do the job:

  fit <- lm(y ~ I(9.81) + x1 + x2 + I(x3^2) + x4 + x5 + x6 , data=data[i:(i+k),])
  fit <- lm(y ~ 9.81 + x1 + x2 + I(x3^2) + x4 + x5 + x6 , data=data[i:(i+k),])

Question:

  • How do I prescribe the constant?
  • I tried searching for this on both Google and CV. Is there vocabulary that I am missing?
  • Can you comment on how something like AIC or $R^2$ is affected by fixing the intercept? I prefer to use AIC or BIC, and I think that, as model selection criteria, they should account for the number of parameters, but I suspect $R^2$ changes in a fundamental way between the two models.
  • I searched CV for an answer to this question but did not find one. One alternative solution was suggested, but its form is substantially different from what was requested: it massages the inputs rather than changing the command while leaving the data untouched. The answer that I liked (and found most useful) is about the form of the formula entered, not about creating new variables.

As usual, comments and suggestions are solicited.

EngrStudent
  • Does it matter if the intercept moves around a bit? I presume that the estimate of this constant term is so uncertain as to have a confidence interval on its estimate that includes 0? Just because physics dictates that the intercept be a certain value, doesn't mean you have to force *exactly* that value. By forcing that value exactly, you may induce bias in the estimation of other parameters of the model. The point is that you have noise in your measurements and you can account for that or at least look at the effect of that on the estimates of the constant term. – Gavin Simpson Nov 18 '14 at 14:43
  • It turns out that it does. I'm dealing with noisy data, and the noise introduces a phenomenon that, in this case, is non-physical. Physics says no, so I'm imposing that on the intercept to get more physics-consistent parameter estimates. – EngrStudent Nov 18 '14 at 15:53
  • I found a duplicate by searching on "regression constant point". – whuber Nov 18 '14 at 15:57
  • I had a fun read through the first 10 entries of a CV search on that topic. Thanks. – EngrStudent Nov 18 '14 at 16:19

1 Answer

Something like this should do it:

fit <- lm( I(y-9.81) ~ 0 + x1 + x2 + I(x3^2) + x4 + x5 + x6 , data=data[i:(i+k),])

Something similar should be possible in many packages.
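
Dropped into a sliding-window loop like the one in the question, the shifted-response fit might look like the sketch below. The simulated data frame, the two-regressor formula, the window length `k`, and the `params` matrix are all assumptions made for illustration; the original storage code was truncated, so this is only one plausible way to collect the coefficients.

```r
# Hypothetical data: true intercept 9.81, two regressors, small noise.
set.seed(42)
n  <- 200      # number of rows (assumed)
k  <- 30       # window spans rows i..(i + k), as in the question
b0 <- 9.81     # prescribed intercept

data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
data$y <- b0 + 1.5 * data$x1 - 0.7 * data$x2 + rnorm(n, sd = 0.2)

# One row of coefficients per window (illustrative storage choice).
n_win  <- n - k
params <- matrix(NA_real_, nrow = n_win, ncol = 2,
                 dimnames = list(NULL, c("x1", "x2")))

for (i in seq_len(n_win)) {
  win <- data[i:(i + k), ]
  # Shift the response by the known constant and suppress the free intercept.
  fit <- lm(I(y - b0) ~ 0 + x1 + x2, data = win)
  params[i, ] <- coef(fit)
}
```

Each row of `params` then holds the slope estimates for one window, with the intercept held at `b0` throughout.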

An alternative:

interc <- rep(9.81,k+1)
fit <- lm(y ~ 0 + x1 + x2 + I(x3^2) + x4 + x5 + x6 + offset(interc),data=data[i:(i+k),])

While the coefficients and standard errors should be the same, one advantage of the second form is that it gives a model for y itself rather than for a shifted y, so the fitted values are on the original scale. In some cases that may be useful.
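
A quick check of that equivalence on simulated data, as a sketch: the data frame `d`, its coefficients, and the `interc` column are made up for illustration (here the offset is stored as a column rather than a separate vector).

```r
# Hypothetical data with a true intercept of 9.81.
set.seed(1)
n  <- 50
b0 <- 9.81
d  <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- b0 + 2 * d$x1 - 1 * d$x2 + 0.5 * d$x3^2 + rnorm(n, sd = 0.1)
d$interc <- b0   # constant column used as the offset

# Form 1: shift the response, no free intercept.
fit_shift  <- lm(I(y - b0) ~ 0 + x1 + x2 + I(x3^2), data = d)
# Form 2: keep y as the response, fix the intercept through an offset.
fit_offset <- lm(y ~ 0 + x1 + x2 + I(x3^2) + offset(interc), data = d)

same_coef <- all.equal(coef(fit_shift), coef(fit_offset))
same_se   <- all.equal(coef(summary(fit_shift))[, "Std. Error"],
                       coef(summary(fit_offset))[, "Std. Error"])
# Fitted values differ only by the constant: form 2 is on the scale of y.
same_fit  <- all.equal(fitted(fit_offset), fitted(fit_shift) + b0)
```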

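(If you want to test the intercept value, remove the "0+": the fitted intercept then estimates the deviation from 9.81, and its t-test checks whether that deviation is zero.)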
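
As a sketch of that test, again on made-up data (`d`, `interc`, and the coefficient values are assumptions): with the offset kept and the "0+" removed, the `(Intercept)` row of the summary refers to the departure from 9.81 rather than to the intercept itself.

```r
# Hypothetical data with a true intercept of 9.81.
set.seed(1)
n  <- 50
b0 <- 9.81
d  <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- b0 + 2 * d$x1 - 1 * d$x2 + 0.5 * d$x3^2 + rnorm(n, sd = 0.1)
d$interc <- b0

# Offset plus a free intercept: the intercept now measures the deviation from b0.
fit_test <- lm(y ~ x1 + x2 + I(x3^2) + offset(interc), data = d)
shift_test <- coef(summary(fit_test))["(Intercept)", ]
# shift_test["Estimate"] is the estimated deviation;
# shift_test["Pr(>|t|)"] is the p-value for H0: deviation = 0.

# For comparison, an ordinary fit with a free intercept.
fit_free <- lm(y ~ x1 + x2 + I(x3^2), data = d)
```

The deviation estimate equals the free-intercept estimate minus 9.81, so the two fits carry the same information about the intercept; only the null hypothesis of the reported t-test changes.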

--

AIC should be fine working this way.
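
A small illustration of comparing the fixed-intercept and free-intercept fits by AIC, under assumed simulated data (`d` and its coefficients are hypothetical). The fixed-intercept fit is charged for one fewer estimated parameter, and the shifted-response and offset forms give the same AIC because they have the same residuals.

```r
# Hypothetical data with a true intercept of 9.81.
set.seed(1)
n  <- 50
b0 <- 9.81
d  <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- b0 + 2 * d$x1 - 1 * d$x2 + 0.5 * d$x3^2 + rnorm(n, sd = 0.1)
d$interc <- b0

fit_free   <- lm(y ~ x1 + x2 + I(x3^2), data = d)                        # free intercept
fit_shift  <- lm(I(y - b0) ~ 0 + x1 + x2 + I(x3^2), data = d)            # fixed, shifted y
fit_offset <- lm(y ~ 0 + x1 + x2 + I(x3^2) + offset(interc), data = d)   # fixed, offset

# Parameter counts used by AIC (coefficients plus the error variance).
df_free  <- attr(logLik(fit_free), "df")
df_fixed <- attr(logLik(fit_offset), "df")

aic_table <- c(free   = AIC(fit_free),
               shift  = AIC(fit_shift),
               offset = AIC(fit_offset))
```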

$R^2$ won't really work - at least not without some thought, and even then, probably not the way you'd like. Its meaning will change from a model with an intercept, since a pre-specified intercept is effectively a no-intercept model (in fact it is, for a shifted y).

Depending on the exact form of calculation of $R^2$, you might get values outside $[0,1]$, for example, and formulas that are equivalent when there is a free intercept may no longer agree. Not having a free intercept renders the comparison with an intercept-only model tricky.

If you need an $R^2$ you need to think carefully about which properties of $R^2$ you most need to preserve, because you're going to have to give some up.
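
A tiny deterministic example of how two common $R^2$ formulas can diverge. The three data points and the deliberately wrong prescribed intercept of 0 are contrived for illustration, not taken from the question.

```r
# Contrived data: y = 10 + x exactly, but the intercept is (wrongly) fixed at 0.
x  <- c(-1, 0, 1)
y  <- c(9, 10, 11)
b0 <- 0

fit <- lm(I(y - b0) ~ 0 + x)
rss <- sum(resid(fit)^2)          # residuals are all 10, so rss = 300

# "Centered" form: compares against an intercept-only (mean) model.
r2_centered   <- 1 - rss / sum((y - mean(y))^2)   # 1 - 300/2 = -149
# "Uncentered" form: compares against the shifted response around zero.
r2_uncentered <- 1 - rss / sum((y - b0)^2)        # 1 - 300/302

# What summary() reports for a no-intercept lm is the uncentered version.
r2_summary <- summary(fit)$r.squared
```

Here the centered form is far below zero while the uncentered form stays in $[0,1]$, which is one concrete case of having to choose which property of $R^2$ to keep.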

Glen_b
  • nice. simple solutions can be the best. clever. – EngrStudent Nov 18 '14 at 14:04
  • Can you comment on how something like AIC or $R^2$ are impacted by this model? It is more than the original question asked, but it would help. – EngrStudent Nov 18 '14 at 14:21
  • Actually, add it to your question, and I'll also move my response there - it will make it a better question for this site. AIC should be okay. $R^2$ won't make sense (its meaning will change from a model with an intercept) since a pre-specified intercept is effectively a no-intercept model. In fact it is, for a shifted y. If you need an $R^2$ you need to think carefully about which properties of $R^2$ you most need to preserve, because you're going to have to give some up. – Glen_b Nov 18 '14 at 14:29
  • You can add `offset` in `lm()` too. Was there a reason to switch to `glm()` here? – Gavin Simpson Nov 18 '14 at 14:45
  • @GavinSimpson only that it didn't work in `lm` when I tried it. I may perhaps have mistyped or something. – Glen_b Nov 18 '14 at 14:47
  • @GavinSimpson Well, it's working for me now. I'll edit. When it didn't work the first time, two things in the help on `offset` led me to think it didn't work in `lm` (which seemed odd, admittedly). – Glen_b Nov 18 '14 at 14:48