I have an experiment where we measure the energy used by a building and want to regress this energy linearly against so-called degree-days, calculated with two different methods. The data looks like this:
A regression line, forced through the origin, has been added to each group.
I want to compute the slope of each of these lines (with its standard error), but I'm not sure of the right way to do it. Here is a sample of the data:
> alvDegreeDays[sample(nrow(alvDegreeDays), 4),]
          Energy  BaseTemp DegreeDays
Feb 2014   984.7 Estimated   365.9771
Mar 2014   864.7 Estimated   307.2246
Apr 20151  512.8       SIA    50.0000
Sep 2015   239.2 Estimated    95.4787
I've tried this first:
lm(Energy ~ DegreeDays * BaseTemp + 0, alvDegreeDays)
Call:
lm(formula = Energy ~ DegreeDays * BaseTemp + 0, data = alvDegreeDays)
Coefficients:
DegreeDays BaseTempEstimated BaseTempSIA
2.436 23.094 174.390
DegreeDays:BaseTempSIA
1.181
But this yields BaseTempEstimated and BaseTempSIA terms which are, in effect, intercept terms.
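To see where those terms come from, one can look at the design-matrix columns R builds for that formula. This is only a minimal sketch on a made-up data frame (`toy` and its values are hypothetical stand-ins for alvDegreeDays, not my measurements):

```r
# Hypothetical stand-in for alvDegreeDays; the values are made up.
toy <- data.frame(
  DegreeDays = c(10, 20, 30, 10, 20, 30),
  BaseTemp   = factor(rep(c("Estimated", "SIA"), each = 3))
)

# Design-matrix columns for the right-hand side of the first formula.
cols <- colnames(model.matrix(~ DegreeDays * BaseTemp + 0, toy))
print(cols)
```

With the global intercept removed but the BaseTemp main effect still in the model, the factor gets one indicator column per level, and each of those columns acts as a per-group intercept.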
Next I tried the following:
(foo <- lm(Energy ~ DegreeDays + DegreeDays:BaseTemp + 0, alvDegreeDays))
Call:
lm(formula = Energy ~ DegreeDays + DegreeDays:BaseTemp + 0, data = alvDegreeDays)
Coefficients:
DegreeDays DegreeDays:BaseTempEstimated DegreeDays:BaseTempSIA
4.401 -1.897 NA
This looks better, but one coefficient is NA, and when I call predict on this model I get a warning:
> predict(foo, list(DegreeDays = 1, BaseTemp = "Estimated"))
1
2.504507
Warning message:
In predict.lm(foo, list(DegreeDays = 1, BaseTemp = "Estimated")) :
prediction from a rank-deficient fit may be misleading
Any idea what I may be doing wrong (or right) here?
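For reference, here is a minimal sketch of the kind of fit I am after, under the assumption that keeping only the interaction term is a legitimate way to get one slope per method with no intercept. In the second model above, the DegreeDays column is the sum of the two DegreeDays:BaseTemp columns, which would explain the NA. The data frame `toy`, its values, and the "true" slopes are made up for illustration:

```r
set.seed(1)

# Hypothetical data: two methods, each with its own slope through the origin.
toy <- data.frame(
  DegreeDays = rep(c(50, 100, 200, 300), times = 2),
  BaseTemp   = factor(rep(c("Estimated", "SIA"), each = 4))
)
true_slope <- ifelse(toy$BaseTemp == "Estimated", 2.5, 3.5)
toy$Energy <- true_slope * toy$DegreeDays + rnorm(nrow(toy), sd = 10)

# No intercept and no DegreeDays main effect: one slope column per BaseTemp level.
fit <- lm(Energy ~ 0 + DegreeDays:BaseTemp, data = toy)

# Slope estimates with their standard errors.
print(coef(summary(fit))[, c("Estimate", "Std. Error")])

# Prediction for one degree-day under the "Estimated" method.
pred <- predict(fit, data.frame(DegreeDays = 1, BaseTemp = "Estimated"))
print(pred)
```

On this toy data the fit has two coefficients, neither of them NA, and predict runs without the rank-deficiency warning. Whether the pooled residual variance this model assumes is appropriate for the real data is a separate question.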