
Suppose I collected crop yield data at one location over multiple years and constructed a model of the form

lm(yield ~ drought_index + solar_radiation + heat_stress)

Here drought_index is on a scale from 0 to 1, where 0 means no water at all for the crop and 1 means a full water supply. heat_stress also goes from 0 to 1, but in the opposite direction: 0 means no heat stress and 1 means complete heat stress. solar_radiation ranges from 20 to 30; I did not observe zero solar radiation, for obvious reasons.

From theory, if solar_radiation and drought_index were zero and heat_stress was 1, you would expect zero yield. So I wonder whether this is a case where I can fit a model without the intercept, i.e.

lm(yield ~ drought_index + solar_radiation + heat_stress + 0)

Does this seem correct?

The question marked as a duplicate only gives general guidance on when to fit an intercept and when not to. What I am looking for is how to apply that guidance to my specific case.

89_Simple
  • From help "when fitting a linear model y ~ x - 1 specifies a line through the origin. A model with no intercept can be also specified as y ~ x + 0 or y ~ 0 + x" – Dave2e Mar 18 '19 at 19:08
  • hmm. I know what it means but wondered in my case, does it seem correct? – 89_Simple Mar 18 '19 at 19:11
  • You have to fit the data and then compare the prediction with the data and/or the expected results. – Dave2e Mar 18 '19 at 19:25
  • Possible duplicate of [When is it ok to remove the intercept in a linear regression model?](https://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-a-linear-regression-model) – kjetil b halvorsen Mar 18 '19 at 20:13

1 Answer

While forcing the intercept to be zero sounds reasonable in theory, what you are really saying is that the intercept should be zero given a particular form of the model (a linear model in your case). In reality, the true form of the model is usually not known, and the form we use may not be entirely correct. In that situation, forcing the fitted model through a fixed value can lead to erroneous predictions. Instead, you can fit the model without constraining the intercept. If the estimated intercept is significantly different from 0, that may be telling you that the form of the model is not correct.
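As a rough sketch of that check, you could fit both versions and look at the intercept directly. The data below are simulated purely for illustration: the data frame `dat`, the coefficients, and the noise level are all assumptions, not values from the question.

```r
# Simulated data only: coefficients and noise level are made up for illustration.
set.seed(1)
n <- 40
dat <- data.frame(
  drought_index   = runif(n, 0, 1),    # 0 = no water, 1 = full water
  solar_radiation = runif(n, 20, 30),  # observed range only, never near 0
  heat_stress     = runif(n, 0, 1)     # 0 = no stress, 1 = complete stress
)
dat$yield <- 1 + 3 * dat$drought_index + 0.2 * dat$solar_radiation -
  2 * dat$heat_stress + rnorm(n, sd = 0.5)

# Same predictors, with and without an intercept
fit_int   <- lm(yield ~ drought_index + solar_radiation + heat_stress, data = dat)
fit_noint <- lm(yield ~ drought_index + solar_radiation + heat_stress + 0, data = dat)

# Estimate, standard error, and t test of the intercept against zero
summary(fit_int)$coefficients["(Intercept)", ]

# F test comparing the two nested models
anova(fit_noint, fit_int)

# Residual standard errors of the two fits
c(with_intercept = sigma(fit_int), without_intercept = sigma(fit_noint))
```

Two things to keep in mind when reading the output. First, do not compare R-squared between the two fits, because `summary.lm` computes it differently when the intercept is removed. Second, `+ 0` forces the prediction to be zero when all three predictors are 0, which includes heat_stress = 0, so it is worth checking that this matches the condition under which you expect zero yield.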

Here is a good discussion on this topic from the Dynamic Ecology blog.

Chao Song