
Suppose I have two regressors, task availability (Xa) and task participation (Xp), and a dependent variable Y. One can participate in a task only if it is available, but one can choose not to participate even when it is available. The baseline model, Y ~ Xa, should estimate the total effect of task availability. The model Y ~ Xa + Xa * Xp adds the effect of participation. Now, if the interaction model is the true model, would the Xa coefficient from Y ~ Xa be biased because of an omitted-variable problem? That is, can the total effect not be estimated using the baseline model?

user2869951
  • I assume you (somehow) get a $Y$-score whether or not a subject participates in the task. Then, just from your few sentences, it seems as if you may really have _one_ categorical variable with categories: {Task Unavailable, Available/Declined, Available/Accepted}. How the first two of the three categories might produce different scores is unclear. (Does 'Available/Declined' indicate a grumpy uncooperative spirit---or a logical decision to use personal time efficiently?) – BruceET Apr 30 '19 at 20:29
  • Yes, there is a Y with or without task participation. The first two might differ because participation is a choice and reflects differences between people, just as you said. My question, though, is: in light of such different participation choices, can I still estimate the baseline model, i.e., Y ~ Xa? – user2869951 May 01 '19 at 01:44
  • I'm thinking in terms of a one-way ANOVA with $Y$ as the response and U, A/D, A/A as the three levels of the factor. If the residuals of $Y$ are nearly normal, then a standard ANOVA would work. If not, then a non-parametric procedure such as Kruskal-Wallis. – BruceET May 01 '19 at 03:30
  • Might be relevant: https://stats.stackexchange.com/questions/372257/how-do-you-deal-with-nested-variables-in-a-regression-model – kjetil b halvorsen May 01 '19 at 12:53
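
The setup in the question can be explored with a small simulation. This is only a sketch under assumptions that are not in the original post: Xa and Xp are binary, availability is assigned at random, participation among those with access is also random, and the coefficient values and function names (`simulate`, `baseline_slope`) are made up for illustration. With a binary Xa, the OLS slope of Y ~ Xa equals the difference in group means, so under these assumptions it should land near the Xa coefficient plus the interaction coefficient times the participation rate among those with access. If participation were self-selected in a way that correlates with the error term, this sketch would not cover that case.

```python
import random

# Hypothetical coefficients, chosen only for illustration.
B0, B_AVAIL, B_PART = 1.0, 0.5, 2.0


def simulate(n=20000, p_avail=0.5, p_join=0.4, seed=0):
    """Draw (xa, xp, y) rows where xp can be 1 only when xa is 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xa = 1 if rng.random() < p_avail else 0
        # Participation is possible only when the task is available.
        xp = 1 if xa == 1 and rng.random() < p_join else 0
        # "True" model: Y depends on Xa and on the Xa * Xp product.
        y = B0 + B_AVAIL * xa + B_PART * xa * xp + rng.gauss(0.0, 1.0)
        rows.append((xa, xp, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def baseline_slope(rows):
    """OLS slope of y on binary xa, i.e. the difference in group means."""
    with_task = mean(y for xa, _, y in rows if xa == 1)
    without_task = mean(y for xa, _, y in rows if xa == 0)
    return with_task - without_task


rows = simulate()
join_rate = mean(xp for xa, xp, _ in rows if xa == 1)
slope = baseline_slope(rows)
implied = B_AVAIL + B_PART * join_rate

print(f"baseline slope of Y ~ Xa:      {slope:.3f}")
print(f"B_AVAIL + B_PART * join rate:  {implied:.3f}")
```

In this sketch the baseline slope does not recover `B_AVAIL` alone; it absorbs the participation effect averaged over those who had access, which is one reading of the "total effect" of availability.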

0 Answers