Setup
We are interested in estimating a model for the following setup:
$Y_t=\beta_0 + \beta_1^{'}X^{'}_t + \epsilon_t$
$\mathrm{Cov}(X^{'}_t,\, Y_{t-1,t-2,\ldots,1} \mid X_{t-1,t-2,\ldots,1}) = 0$
Where $\epsilon_t$ is iid normal. In other words: what would the effect of $X_t$ be if it were set “randomly” (independently of past $Y_t$s) at each timestep?
Unfortunately, what we actually have in front of us is:
$Y_t=\beta_0 + \beta_1X_t + \epsilon_t$
$X_{t}=\phi(Y_{t-1,t-2,...,1})$
Where $\phi$ is an unknown function. In other words, $X_{t}$ is correlated with past values of $Y$.
Just to make this more concrete, let's imagine a situation where this might arise: we are measuring customer behavior (spending) in response to some marketing action (coupons). We'd like to understand the impact of the coupon amount $X_t$ on customer spending $Y_t$. Unfortunately, the coupon amount given at $t$ is determined by taking a fraction of the average of past spending, i.e.:
$X_t = \phi(Y_{t-1},\ldots,Y_1) = \frac{m_t}{t-1} \sum_{i=1}^{t-1} Y_i$
Where $m_t$ is the fraction (let's say it's drawn from $U(.10,.20)$).
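To make the data-generating process explicit, here is a minimal R sketch for a single customer. The baseline, effect size, noise level, and first-period coupon are illustrative assumptions on my part, not values fixed by the setup above.

```r
# Minimal sketch of the data-generating process for one customer.
# beta0, beta1, sigma, and the first-period coupon are assumed values.
set.seed(1)

T_steps <- 50
beta0   <- 10    # baseline spending (assumed)
beta1   <- 2     # true effect of coupon amount (assumed)
sigma   <- 1     # sd of the iid normal noise (assumed)

y <- numeric(T_steps)
x <- numeric(T_steps)

# First period: no past spending yet, so start with an arbitrary coupon.
x[1] <- 1
y[1] <- beta0 + beta1 * x[1] + rnorm(1, 0, sigma)

for (t in 2:T_steps) {
  m_t  <- runif(1, 0.10, 0.20)        # fraction m_t drawn from U(.10, .20)
  x[t] <- m_t * mean(y[1:(t - 1)])    # coupon = fraction of average past spending
  y[t] <- beta0 + beta1 * x[t] + rnorm(1, 0, sigma)
}
```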
Goal/Questions
We'd like to recover $\beta_1^{'}$ (i.e., the effect of coupons on spending if the coupons had been given randomly).
Problem: If we simply fit an OLS regression model $Y_t=\beta_0 + \beta_1X_t + \epsilon_t$ for each customer, we will not get the correct value (see link to simulation below).
Questions:
Is there a name for a situation such as this? I have had no luck finding this situation while reviewing the literature on autocorrelation, exogenous/endogenous predictors, and propensity scoring/causal inference.
What are some approaches to modeling data in a situation like this?
UPDATE:
Here is some R code that shows a simulation of the process described above.
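A minimal sketch of such a simulation is below. The number of customers, the customer-level random intercepts, the specific parameter values, and the choice of `lme4::lmer` for the mixed-effects fit are all assumptions of mine rather than details fixed by the setup above.

```r
# Sketch: simulate many customers under the coupon process above, then compare
# per-customer OLS slopes with a random-intercept mixed model.
# Parameter values and the random-intercept structure are illustrative assumptions.
library(lme4)

set.seed(2)
n_customers <- 200
T_steps     <- 50
beta1       <- 2     # true coupon effect (assumed)
sigma       <- 1     # residual sd (assumed)

sim_customer <- function(id) {
  beta0 <- rnorm(1, mean = 10, sd = 2)   # customer-specific baseline (assumed)
  y <- numeric(T_steps)
  x <- numeric(T_steps)
  x[1] <- 1                              # arbitrary first-period coupon
  y[1] <- beta0 + beta1 * x[1] + rnorm(1, 0, sigma)
  for (t in 2:T_steps) {
    m_t  <- runif(1, 0.10, 0.20)         # m_t drawn from U(.10, .20)
    x[t] <- m_t * mean(y[1:(t - 1)])     # coupon based on average past spending
    y[t] <- beta0 + beta1 * x[t] + rnorm(1, 0, sigma)
  }
  data.frame(id = id, t = seq_len(T_steps), x = x, y = y)
}

dat <- do.call(rbind, lapply(seq_len(n_customers), sim_customer))

# Per-customer OLS slopes (reported above to be negatively biased)
ols_slopes <- sapply(split(dat, dat$id),
                     function(d) coef(lm(y ~ x, data = d))[["x"]])
mean(ols_slopes)

# Random-intercept mixed model pooling all customers
fit_lmm <- lmer(y ~ x + (1 | id), data = dat)
fixef(fit_lmm)["x"]   # compare with the true beta1
```

Comparing `mean(ols_slopes)` and `fixef(fit_lmm)["x"]` against the true `beta1` reproduces the comparison discussed below, at least under these assumed settings.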
In cleaning up the code for this post, I realized that (at least as I'm simulating it) a mixed-effects model can actually handle the situation described above -- i.e., it recovers the simulated coefficients without bias. (I had previously, and incorrectly, concluded that it could not; my reasons are described in the code.)
So I suppose I'd like to add a third question:
- Why is the mixed-effects approach doing so well here? OLS on each group produces a negative bias in $\beta_1$; it's not clear to me how or why the machinery of mixed-effects models avoids that bias. Does this success generalize (i.e., to other $\phi$s)? I had thought that including a fixed effect for $X_t$ that is so heavily correlated with the random intercepts would be problematic (e.g. Mundlak, 1978).
My concern is that I’ve inadvertently created a simulation where everything works out OK; but since I don’t actually understand why things worked out OK, I risk trying the same approach on a real-world dataset with a different $\phi$ and succumbing to bias.