
I'm checking for reverse causality with a regression that includes leads. The reasoning is that the coefficient on the lead should not be significant if there is no endogeneity problem. The original model is: $$ \ln(y_{it})=a_i+b_t+\ln(c_{it})+u_{it}, $$ where $y$ is the dependent variable, $a_i$ and $b_t$ are the individual and time fixed effects, $c$ is the independent variable of interest, and $u$ is the error term. When creating the regression with leads, is it OK to model it in logarithmic form, or should I model it without the transformation? With logarithms it would look like: $$ \ln(y_{it})=a_i+b_t+\ln(c_{it})+\ln(c_{i,t+1})+u_{it}. $$ I'm asking because the lead in the logarithmic model is not significant, while the lead in $$ y_{it}=a_i+b_t+c_{it}+c_{i,t+1}+u_{it} $$ is significant.
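For illustration, here is a minimal sketch of the leads check in the log-log form, run on simulated data rather than on my actual data. Everything in it is an assumption made for the example: the function names `two_way_demean` and `lead_test`, the balanced panel stored as an (N, T) array, the true elasticity of 0.5, and the use of classical standard errors (in practice, standard errors clustered by individual would usually be preferred). The regressor is simulated as strictly exogenous, so the lead coefficient should come out close to zero.

```python
import numpy as np


def two_way_demean(x):
    """Within transformation for a balanced panel stored as an (N, T) array.

    Removes individual and time fixed effects by subtracting row and column
    means and adding back the grand mean.
    """
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


def lead_test(log_y, log_c):
    """Regress log_y on log_c and its one-period lead with two-way fixed effects.

    Returns (coefficients, t-statistics), ordered as [current, lead].
    """
    # Drop the last period, because no lead is observed for it.
    y = log_y[:, :-1]
    x_now = log_c[:, :-1]
    x_lead = log_c[:, 1:]
    n, t = y.shape

    yd = two_way_demean(y).ravel()
    X = np.column_stack([two_way_demean(x_now).ravel(),
                         two_way_demean(x_lead).ravel()])

    beta, *_ = np.linalg.lstsq(X, yd, rcond=None)
    resid = yd - X @ beta

    # Degrees of freedom: observations minus absorbed fixed effects minus slopes.
    dof = n * t - (n + t - 1) - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))  # classical SEs (assumption)
    return beta, beta / se


# Simulated panel (hypothetical): c is strictly exogenous, true elasticity is 0.5.
rng = np.random.default_rng(0)
N, T = 200, 12
a = rng.normal(size=(N, 1))            # individual effects
b = rng.normal(size=(1, T))            # time effects
log_c = a + rng.normal(size=(N, T))    # regressor correlated with the individual effect
log_y = a + b + 0.5 * log_c + rng.normal(scale=0.5, size=(N, T))

beta, tstat = lead_test(log_y, log_c)
print("coefficients [current, lead]:", beta)
print("t-statistics [current, lead]:", tstat)
```

The same function can be applied to the untransformed variables by passing `y` and `c` in place of their logarithms, which makes it easy to compare the two versions of the check side by side.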

Richard Hardy
Jam.Wil
  • I don't know much about what you are proposing, but your transformations (and modelling decisions in general) shouldn't be based on the resulting p-value. – mkt May 01 '21 at 15:00
  • Do you have a reference that explains how leads test for endogeneity? The logic is not obvious to me. – dimitriy May 01 '21 at 15:44
  • @DimitriyV.Masterov I got the suggestion from my professor in econometrics. He said that "The analogous version of the check for parallel trends assumption is the test for strict exogeneity. If the assumption of the model is correct X shouldn’t affect Y before it has happened. In other words, leads of X should not affect Y. We can test this by including leads of X". – Jam.Wil May 01 '21 at 16:28
  • @mkt-ReinstateMonica Can you elaborate on what you mean? – Jam.Wil May 01 '21 at 16:28
  • Is $c_{it}$ a continuous regressor? Maybe individuals receive different doses of some treatment? In general, yes, effects shouldn’t emerge in your response before some intervention is put in place. – Thomas Bilach May 01 '21 at 17:19
  • I mean: don't choose whether to do a log-transformation based on whether you get a significant p-value. – mkt May 01 '21 at 17:49
  • Relevant threads: https://stats.stackexchange.com/questions/18844/when-and-why-should-you-take-the-log-of-a-distribution-of-numbers and https://stats.stackexchange.com/questions/298/in-linear-regression-when-is-it-appropriate-to-use-the-log-of-an-independent-va/3530#3530 – mkt May 01 '21 at 17:50
  • @mkt-ReinstateMonica Thank you! My main question is not whether to use the log transform, but rather whether the robustness test of the main independent variable should be in logarithmic form or not. I'm using a log-log model to be able to interpret the coefficients as elasticities. The coefficients are significant both in the log-log model and in the untransformed model, but the question is whether the regression with leads should be in log-log form or not. My best guess is to run the leads regression on the untransformed variables, as the log transformation is only used as a means to express elasticities. – Jam.Wil May 04 '21 at 09:47
  • I think the answers in the linked threads do address the main question, but if not, it may be helpful to edit your question to make it clearer. Also, in general I would think your robustness test should be on the same scale as the primary model of interest. – mkt May 04 '21 at 10:20
