Logarithmic regression of form $y=a+b \log(x_1)+c\log(x_2)$ using R

Question

How can I fit a logarithmic regression equation of form $y=a+b (\log (x_1)) + c(\log(x_2))$ on a data set using R?

The main concern is that the data contain many zeros, so taking logs gives infinite values (`log(0)` is `-Inf` in R).

I have tried adding a small constant to the $x$ variables, for example `log(x1 + 0.00001)`, to avoid `Inf`, but is there any specific way to calculate this constant so that the results are not affected?

Link to the data file on which I want to perform the operation: https://dl.dropbox.com/u/53624395/11.csv. This is time series data. I have to fit a logarithmic regression of the form $y=a+b\log(x_1)+c\log(x_2)$, estimate $a$, $b$, and $c$, and then check whether any such relationship exists.

Of course results are affected by any change to the transformation. 'Not affected' compared to what? What are you seeking to achieve? Why are you taking logs of something that can be zero? — Glen_b, Jan 10 '13 at 06:38
Any statistical package should give infinity. Try remove zeros from your data before modelling. — mpiktas, Jan 10 '13 at 08:08
**Very** closely related: http://stats.stackexchange.com/questions/30728/how-small-a-quantity-should-be-added-to-x-to-avoid-taking-the-log-of-zero. — whuber, Jan 10 '13 at 13:55
I have added the data file on which I want to perform the operation.And if this question was totally related to the link which you have given I would not have asked this question.It has to be performed using R tool that's the reason I am asking this question :) @whuber — Komal, Jan 11 '13 at 05:32
Komal, because the new and distinctive part of your question is solely about programming in `R`, please post your question on SO rather than here. If you would like, flag this post for moderator attention and a moderator will migrate it for you. — whuber, Jan 11 '13 at 14:20
@whuber: Okies I have flagged this question for transferring it to stackoverflow. :) thank you. :) — Komal, Jan 11 '13 at 15:43
@Komal I tried, but the "question owner is blocked from asking questions" on SO. Sorry. — whuber, Jan 11 '13 at 15:45
Ironically, we keep getting variations of this question on SO from this same person and we keep referring him to you guys. I hope you'll agree that this is more of a modeling question (how to handle the zeroes) than a programming one. For as long as the guy is unwilling to make a model choice (your website has plenty of solutions.), we'll keep closing his question on SO and refer him to you... Sorry! — flodel, Jan 12 '13 at 13:02

score 2 · Answer 1 · edited Jan 10 '13 at 13:53


You can shift your data ($x_i\mapsto x_i+\mathrm{constant}_i$).
You can try $y= a + b \log\left(\frac{x_1}{x_2}\right)$ or similar, with a sort of Laplace smoothing; in this case $y= a + b \log\left(\frac{x_1+1}{x_2+1}\right)$.
You can weight your data so that the extreme values are effectively killed; for example, the weight $t\mapsto e^{-\frac{1}{t}}$ tends to 0 as $t \to 0^+$, so points near zero (whose logs are very large in magnitude) get almost no weight.
And so on.

It would be easier with a data example, I guess. But then it is up to you to play around and see what suits you best!
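The options listed above could be sketched in R roughly as follows. This is a minimal illustration, not a recommendation: the data are synthetic (the asker's `11.csv` is not used), the shift of 1 is an arbitrary choice, and combining the two weights by multiplication is an assumption of this sketch, since the answer does not say how to weight two predictors at once.

```r
# Synthetic, hypothetical data with zeros in both predictors
set.seed(1)
n  <- 200
x1 <- rpois(n, lambda = 3)
x2 <- rpois(n, lambda = 5)
y  <- 1 + 0.5 * log(x1 + 1) - 0.3 * log(x2 + 1) + rnorm(n, sd = 0.1)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Option 1: shift each predictor by a constant before taking logs
# (the constant 1 is an assumed value; the fit depends on it)
fit_shift <- lm(y ~ log(x1 + 1) + log(x2 + 1), data = dat)

# Sensitivity check: refit with several shift constants and compare coefficients
sens <- sapply(c(1e-5, 1e-2, 1), function(s) {
  coef(lm(y ~ log(x1 + s) + log(x2 + s), data = dat))
})
colnames(sens) <- c("s=1e-5", "s=0.01", "s=1")

# Option 2: a single log-ratio predictor with add-one (Laplace-style) smoothing
fit_ratio <- lm(y ~ log((x1 + 1) / (x2 + 1)), data = dat)

# Option 3: weights w(t) = exp(-1/t), which equal 0 at t = 0.
# Multiplying the two weights is an assumption of this sketch.
w  <- exp(-1 / dat$x1) * exp(-1 / dat$x2)
ok <- w > 0   # rows with a zero predictor get weight 0 and are dropped
fit_wt <- lm(y ~ log(x1) + log(x2), data = dat[ok, ], weights = w[ok])
```

Printing `sens` should show the estimates moving as the constant changes, which is the point made in the comments: there is no choice of constant that leaves the results unaffected.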

edited Jan 10 '13 at 13:53 by whuber · answered Jan 10 '13 at 09:51 by dfhgfh

Presumably the subscript $i$ on the constant is a typo. – Scortchi - Reinstate Monica Jan 10 '13 at 12:14
I meant $x_1 \mapsto x_1 + c_1$, where $c_1$ is a shift for every single component of $x_1$, and the same for $x_2$ – dfhgfh Jan 10 '13 at 12:35
Of course, I didn't read carefully enough – Scortchi - Reinstate Monica Jan 10 '13 at 12:59
[Data on which it is to be performed](https://dl.dropbox.com/u/53624395/11.csv). This is time series data, and I have to fit a logarithmic regression of the form $y=a+b\log(x_1)+c\log(x_2)$, find $a$, $b$, and $c$, and then check whether any such relationship exists. – Komal Jan 11 '13 at 05:25
