
I am implementing a Tobit model, since my dependent variable (educational expenditure shares) is left-censored at 0. Below you'll find a swarmplot of the dependent variable and the explanatory variable of interest (gender-specific bargaining power converted to dummies: 1 for male; 2 for female; 3 for mixed).

[Figure: swarmplot of educational expenditure shares by bargaining-power category]

As you can see, I have some large outliers I would like to drop. Naturally, I could just identify them in my dataset and delete the observations. However, since I am using the Tobit model, which also allows for right-censoring (the two-limit Tobit model), I was wondering whether I could just set the right-sided limit at, e.g., 0.35 and thereby drop the outliers. I thought about this:

\begin{align}
y_i = \begin{cases} y^*_i & \text{if } 0 < y^*_i < 0.35\\ 0 & \text{if } y^*_i \leq 0 \end{cases}
\end{align}

I have the feeling that this would have a different implication, as I understood that it is the truncated Tobit model that "drops" observations. Could I then truncate from above and censor from below? Or is this in general a bad idea?
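To make the censoring-versus-truncation distinction concrete, here is a minimal sketch of the per-observation log-likelihood under the two setups. It assumes a normal latent variable with mean `mu` and standard deviation `sigma`; the function names and the example numbers are hypothetical and are not taken from the question's data or from any particular package.

```python
import math

def norm_pdf(z):
    # Standard normal density
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # Standard normal distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_limit_tobit_loglik(y, mu, sigma, lower=0.0, upper=0.35):
    """Censored at both limits: every observation stays in the sample."""
    if y <= lower:
        # Contributes P(y* <= lower)
        return math.log(norm_cdf((lower - mu) / sigma))
    if y >= upper:
        # Contributes P(y* >= upper); the exact value above the limit is ignored
        return math.log(1.0 - norm_cdf((upper - mu) / sigma))
    return math.log(norm_pdf((y - mu) / sigma) / sigma)

def upper_truncated_loglik(y, mu, sigma, lower=0.0, upper=0.35):
    """Censored from below, truncated from above: y >= upper is not in the sample."""
    if y >= upper:
        raise ValueError("truncated observations are not part of the sample")
    # Everything is conditional on y* < upper
    denom = norm_cdf((upper - mu) / sigma)
    if y <= lower:
        return math.log(norm_cdf((lower - mu) / sigma) / denom)
    return math.log(norm_pdf((y - mu) / sigma) / sigma / denom)

# Hypothetical values: an "outlier" of 0.6 under an upper limit of 0.35
outlier_contribution = two_limit_tobit_loglik(0.6, mu=0.1, sigma=0.1)
at_limit_contribution = two_limit_tobit_loglik(0.35, mu=0.1, sigma=0.1)
```

In this sketch an upper censoring limit does not drop the high values: each one still enters the likelihood as "at least 0.35", and its contribution is the same as that of an observation exactly at the limit. Only the truncated version removes them, and it then rescales every remaining contribution by the probability of falling below the limit.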

Nick Cox
XsLiar
  • I wouldn't call your variable censored when it's just a question of natural bounds. Is number of children censored because there's a lower limit of 0? Is an indicator "owns car" censored because there are limits of 0 and 1? I'd use a logit link function here. For more discussion see e.g. https://ageconsearch.umn.edu/bitstream/122595/2/sjart_st0147.pdf (The outliers are just high values and unless you can show independently that they are incorrect you shouldn't want to omit them. There is likely to be an interesting and informative story there.) – Nick Cox May 15 '18 at 13:48
  • https://www.stata-journal.com/sjpdf.html?articlenum=st0147 is a better link. Same paper at the time of writing, but less likely to be fragile. – Nick Cox May 15 '18 at 14:02
  • Hi Nick! I thought that in this econometric problem regarding expenditures, zero observations represent corner solutions, such as a consumer being either unwilling or not able at all to make a consumption choice. Thus Tobit is a good choice. Here is a paper where the authors faced a similar problem and applied Tobit regression: https://ideas.repec.org/a/eee/wdevel/v38y2010i4p555-566.html Thank you for your advice though, I will make use of the logit link function and see if it produces sound results! – XsLiar Jun 06 '18 at 13:49
  • As I understand it, the question is whether in principle you can go lower than 0, and the answer to that appears to be No. I am not an economist or econometrician. – Nick Cox Jun 06 '18 at 14:05
  • Hello Nick, I noticed that the logit link function you proposed simply drops all zero observations in Stata. With around 30% zero observations, this obviously leads to a biased sub-sample. I thought it important to add this remark. – XsLiar Jun 19 '18 at 14:56
  • Absolutely not so. In Stata using a logit link will **not** drop (you mean ignore) zero values; otherwise classical logit for a binary response with values of (0, 1) would be impossible. So, something else is biting and/or you are misunderstanding your Stata results and/or you are using the wrong code. I'd suggest asking a separate question on Cross Validated or Statalist to take that forward. But you need to show a reproducible example. (Note that a logit **transformation** $\ln[p / (1 - p)]$ necessarily returns missing for $p = 0$ or $1$.) – Nick Cox Jun 19 '18 at 17:01
  • I was referring to the fact that the values are returned as missing after applying the logit link function. I am sorry for the confusion and lack of clarity. I followed your advice and created a new question: https://stats.stackexchange.com/questions/352836/fixed-effects-for-fractual-response-variable-with-many-zero-observations – XsLiar Jun 23 '18 at 12:27
  • Sorry, still lost, as I don't know what "fact" you're referring to. Again, I suspect you're confusing logit link (as in Stata commands such as `glm` and `xtgee`) and logit transformation. No commands based on link functions create or change the data. – Nick Cox Jun 23 '18 at 15:05
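
The difference between a logit transformation and a logit link discussed in the comments above can be illustrated with a small sketch. This is plain Python, not Stata, and it is not what `glm` does internally; the data and the function names are made up for illustration. It fits an intercept-only model with a logit link by Newton steps on the Bernoulli quasi-likelihood, which is one common way to handle a fractional response.

```python
import math

def inv_logit(eta):
    # Inverse of the logit link: maps the linear predictor to a mean in (0, 1)
    return 1.0 / (1.0 + math.exp(-eta))

def logit_transform(p):
    """Transform the response itself; undefined at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return None  # stands in for a missing value
    return math.log(p / (1.0 - p))

def fit_intercept_logit_link(y, iterations=25):
    """Intercept-only model with a logit link; the response is left untouched."""
    b = 0.0
    n = len(y)
    for _ in range(iterations):
        p = inv_logit(b)
        # Score of the Bernoulli quasi-likelihood; zeros in y contribute normally
        score = sum(yi - p for yi in y)
        b += score / (n * p * (1.0 - p))
    return b

# Hypothetical expenditure shares, 30% of them exactly zero
shares = [0.0, 0.0, 0.0, 0.05, 0.10, 0.12, 0.20, 0.08, 0.15, 0.30]

transformed = [logit_transform(s) for s in shares]
n_missing = sum(t is None for t in transformed)  # zeros lost by the transformation

b_hat = fit_intercept_logit_link(shares)
fitted_mean = inv_logit(b_hat)  # uses all ten observations, zeros included
```

Here the transformation loses the three zero shares, while the link-based fit uses every observation and recovers the overall sample mean, because the link is applied to the modelled mean rather than to the data.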
