
I use a Tobit model to predict censored data. I use the AER package in R.

A toy example looks as follows:

library(AER)
N = 10
f = rep(c("s1","s2","s3","s4","s5","s6","s7","s8"),N)
fcoeff = rep(c(-1,-2,-3,-4,-3,-5,-10,-5),N)
set.seed(100) 
x = rnorm(8*N)+1
beta = 5
epsilon = rnorm(8*N,sd = sqrt(1/5))
y.star = x*beta+fcoeff+epsilon ## latent response
y = y.star 
y[y<0]<-0 ## censored response

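my.data = data.frame(y,x,f) ## keep the response in the data frame as well
fit <- tobit(y~0+x+f,data=my.data)

pred = predict(fit) ## fitted mean of the latent response
my.range = range(y,y.star,pred)
plot(y,ylim = my.range)
lines(pmax(pred,0),col="red") ## fitted values cut off at 0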

The values returned by predict(fit) give me the expected value under the model. How can I derive, e.g., a 90% confidence interval around this expected value?

Richi W

1 Answer


As discussed previously in "Censored regression in R", the predict() method returns the expectation of the latent uncensored variable y.star, not the expectation of the censored variable y. So simply cutting the predictions off at 0, as you do, might not be the ideal solution.
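To make the distinction concrete, here is a minimal sketch of how the expectation of the censored variable could be computed from the fitted model. It assumes normal errors and left-censoring at 0, in which case E[y] = mu * pnorm(mu/sigma) + sigma * dnorm(mu/sigma), with mu the latent mean and sigma the scale. The names mu, sigma, z and ey are only illustrative; fit$scale is the scale estimate stored in the underlying survreg object.

```r
library(AER)

# Recreate the toy data from the question
set.seed(100)
N <- 10
f <- rep(c("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"), N)
fcoeff <- rep(c(-1, -2, -3, -4, -3, -5, -10, -5), N)
x <- rnorm(8 * N) + 1
y.star <- 5 * x + fcoeff + rnorm(8 * N, sd = sqrt(1/5))  # latent response
y <- pmax(y.star, 0)                                      # response censored at 0
my.data <- data.frame(y, x, f)
fit <- tobit(y ~ 0 + x + f, data = my.data)

mu <- predict(fit)    # estimated mean of the latent y.star
sigma <- fit$scale    # estimated standard deviation of the latent error
z <- mu / sigma

# Expectation of y = max(0, y.star) under the normal Tobit model
# (assumption: left-censoring at 0, Gaussian errors)
ey <- mu * pnorm(z) + sigma * dnorm(z)

head(cbind(latent = mu, cut.at.zero = pmax(mu, 0), censored.mean = ey))
```

The censored mean is never below max(mu, 0), and the two differ most for observations whose latent mean is close to the censoring point.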

Regarding the interval: you can call predict(fit, se.fit = TRUE), as you would with predict.glm(), to obtain the estimated standard error of the prediction.

This behavior is inherited from the survreg infrastructure that is called internally. See help("predict.survreg", package = "survival") for more details and further options.
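As a rough sketch of how the standard errors could be turned into a 90% interval: the code below builds a normal-approximation (Wald) interval around the predicted latent mean. Note the assumptions: the interval refers to the mean of y.star, not to the censored y, it relies on asymptotic normality of the estimates, and the names pr, crit, lower and upper are only illustrative.

```r
library(AER)

# Recreate the toy data from the question
set.seed(100)
N <- 10
f <- rep(c("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"), N)
fcoeff <- rep(c(-1, -2, -3, -4, -3, -5, -10, -5), N)
x <- rnorm(8 * N) + 1
y.star <- 5 * x + fcoeff + rnorm(8 * N, sd = sqrt(1/5))  # latent response
y <- pmax(y.star, 0)                                      # response censored at 0
my.data <- data.frame(y, x, f)
fit <- tobit(y ~ 0 + x + f, data = my.data)

# Predicted latent mean plus its pointwise standard error
pr <- predict(fit, se.fit = TRUE)

# Two-sided 90% interval: 5% in each tail
crit <- qnorm(0.95)
lower <- pr$fit - crit * pr$se.fit
upper <- pr$fit + crit * pr$se.fit

head(cbind(fit = pr$fit, lower = lower, upper = upper))
```

Because this is an interval for the fitted mean, it will be much narrower than an interval meant to cover new observations, which would also have to account for the error variance.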

Achim Zeileis
    Thank you (once again) for the explanation. I remember the discussion back then. I just had the impression that the usual predict fits the observations better. I have to go through my script and your explanation again ... thanks a lot! – Richi W Aug 28 '15 at 13:57