
How can I fit a logarithmic regression equation of form $y=a+b (\log (x_1)) + c(\log(x_2))$ on a data set using R?

Here the main concern is that the data contain many zeros, and since `log(0)` is `-Inf` in R, the output contains infinite values.

I have tried adding a small constant to the $x$ variables, such as `log(x1 + 0.00001)`, to avoid `-Inf`, but is there any specific way to calculate this constant so that the results are not affected?
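To make the concern concrete, here is a minimal sketch on simulated data. The variables `x1`, `x2`, `y`, the coefficients used to generate them, and the three constants tried are all made up for illustration; none of this comes from the linked file. It only shows that `log(0)` is `-Inf` and that the estimates move with the constant.

```r
set.seed(1)
n  <- 200
# Hypothetical non-negative predictors that contain exact zeros
x1 <- rpois(n, lambda = 2)
x2 <- rpois(n, lambda = 3)
y  <- 1 + 0.5 * log(x1 + 1) - 0.3 * log(x2 + 1) + rnorm(n, sd = 0.2)

log(0)  # -Inf, which cannot be used as a predictor value

# Refit y = a + b*log(x1 + k) + c*log(x2 + k) for several constants k
shifts <- c(1e-5, 1e-2, 1)
coefs  <- sapply(shifts, function(k) {
  coef(lm(y ~ log(x1 + k) + log(x2 + k)))
})
rownames(coefs) <- c("a", "b", "c")
colnames(coefs) <- paste0("k=", shifts)

# One column of estimates per constant; compare across columns
round(coefs, 3)
```

Comparing the columns of `coefs` gives a rough idea of how strongly the fit depends on the constant for a particular data set.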

https://dl.dropbox.com/u/53624395/11.csv : link to the data file on which I want to perform the operation. This is time series data, and I have to fit a logarithmic regression of the form $y=a+b\log(x_1)+c\log(x_2)$, find $a$, $b$ and $c$, and then check whether any such relationship exists.

Komal
  • Why are you transforming the predictors? – Macro Jan 10 '13 at 04:43
  • Of course results are affected by any change to the transformation. 'Not affected' compared to what? What are you seeking to achieve? Why are you taking logs of something that can be zero? – Glen_b Jan 10 '13 at 06:38
  • Any statistical package should give infinity. Try removing the zeros from your data before modelling. – mpiktas Jan 10 '13 at 08:08
  • **Very** closely related: http://stats.stackexchange.com/questions/30728/how-small-a-quantity-should-be-added-to-x-to-avoid-taking-the-log-of-zero. – whuber Jan 10 '13 at 13:55
  • I have added the data file on which I want to perform the operation. And if this question were fully covered by the link you have given, I would not have asked it. It has to be done using R, and that's the reason I am asking this question :) @whuber – Komal Jan 11 '13 at 05:32
  • Komal, because the new and distinctive part of your question is solely about programming in `R`, please post your question on SO rather than here. If you would like, flag this post for moderator attention and a moderator will migrate it for you. – whuber Jan 11 '13 at 14:20
  • @whuber: Okay, I have flagged this question for transfer to Stack Overflow. :) Thank you. :) – Komal Jan 11 '13 at 15:43
  • @Komal I tried, but the "question owner is blocked from asking questions" on SO. Sorry. – whuber Jan 11 '13 at 15:45
  • Ironically, we keep getting variations of this question on SO from this same person and we keep referring him to you guys. I hope you'll agree that this is more of a modeling question (how to handle the zeroes) than a programming one. For as long as the guy is unwilling to make a model choice (your website has plenty of solutions), we'll keep closing his question on SO and referring him to you... Sorry! – flodel Jan 12 '13 at 13:02

1 Answer

  1. You can shift your data ($x_i\mapsto x_i+\mathrm{constant}_i$).

  2. You can try $y = a + b \log\left(\frac{x_1}{x_2}\right)$ or similar. In this case you can use a sort of Laplace smoothing: $y = a + b \log\left(\frac{x_1+1}{x_2+1}\right)$.

  3. You can weight your data so that you more or less kill the observations whose logs blow up (those with $x$ near zero). For example, you can use $t\mapsto e^{-\frac{1}{t}}$ as a weight, which tends to $0$ as $t\to 0^+$.

  4. ...etc

It would be easier if you had a data example, I guess. But then it is up to you to play around and see what suits you best!
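As a rough illustration of options 1 to 3, here is a sketch in R on simulated data. Everything specific in it is an assumption made for the example rather than part of the answer: the data, the shift of 1, taking $t$ to be the predictor value itself, and multiplying the two per-predictor weights.

```r
# Hypothetical data with zeros in both predictors (not the asker's file)
set.seed(42)
n  <- 150
x1 <- rpois(n, lambda = 2)
x2 <- rpois(n, lambda = 4)
y  <- 2 + 0.8 * log((x1 + 1) / (x2 + 1)) + rnorm(n, sd = 0.3)
dat <- data.frame(y, x1, x2)

# Option 1: shift each predictor by a constant before taking logs
fit_shift <- lm(y ~ log(x1 + 1) + log(x2 + 1), data = dat)

# Option 2: a single log-ratio predictor with add-one smoothing
fit_ratio <- lm(y ~ log((x1 + 1) / (x2 + 1)), data = dat)

# Option 3: weight rows by exp(-1/t), with t taken as the predictor value
# (an assumption). exp(-1/0) is exactly 0 in R, so rows with a zero
# predictor get zero weight; they are dropped explicitly so that no
# -Inf enters the design matrix.
w    <- exp(-1 / dat$x1) * exp(-1 / dat$x2)
keep <- w > 0
fit_wt <- lm(y ~ log(x1) + log(x2), data = dat[keep, ], weights = w[keep])

# Compare the estimates from the three approaches
coef(fit_shift)
coef(fit_ratio)
coef(fit_wt)
```

Note that option 3 as sketched discards the zero rows entirely, so it answers a slightly different question than options 1 and 2, which keep every observation.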

whuber
dfhgfh