I have 320 data points, each with a redshift and a turnover frequency, and I want to fit a linear relationship between the two. However, 120 of the turnover-frequency values are upper limits. As the plot shows, the relationship is very weak:

[Figure: scatter plot of the data. The red points have upper limits for their turnover-frequency value.]

I would like to fit a straight line to these data that accounts for the upper limits. I think the tobit model in R can do this, but I'm not sure (I'm new to R). Before I start with that, are there other methods in R I should consider?

Matt Majic

1 Answer

If some of your observations sit at the upper or lower limit of the possible observation range, then a tobit (i.e., censored Gaussian) regression is one reasonable way to go.

In R, this and other censored regression models can be fitted with the survreg() function in the survival package. The manual page ?tobin contains a worked example based on the original left-censored Tobin data. For your data, you would have to flag the censored points with the Surv() function: points that assume the upper limit of the observable range are right-censored, whereas points for which only an upper bound on the true value is known (non-detections) are left-censored.
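A minimal sketch of the survreg() route, run on simulated data because the original data set is not shown. The variable names (z, nu_obs, detected, limit) and all simulated values are hypothetical. It assumes that each upper limit means the true turnover frequency lies at or below the reported value, which makes those points left-censored; for values capped at a ceiling, type = "right" would apply instead.

```r
library(survival)

set.seed(1)
n <- 320

# Hypothetical data: redshift z and a latent (e.g. log) turnover frequency
z       <- runif(n, 0, 4)
nu_true <- 1 + 0.3 * z + rnorm(n, sd = 0.5)

# Hypothetical per-point detection thresholds: below the threshold only
# an upper limit (the threshold itself) is recorded
limit    <- runif(n, 0.8, 1.8)
detected <- nu_true > limit
nu_obs   <- ifelse(detected, nu_true, limit)

dat <- data.frame(z = z, nu_obs = nu_obs, detected = detected)

# In Surv(), the second argument is TRUE for an exact (detected) value
# and FALSE for a censored one; type = "left" marks the censored values
# as upper bounds on the true value
fit <- survreg(Surv(nu_obs, detected, type = "left") ~ z,
               data = dat, dist = "gaussian")
print(summary(fit))
```

Here coef(fit) holds the intercept and slope of the fitted line for the latent (uncensored) response, and fit$scale is the estimated residual standard deviation.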

The tobit() function in the AER package is a convenience interface to survreg() that lets you set the lower and upper limits without calling Surv() by hand. By default, observations are assumed to be left-censored at zero and uncensored on the right, but both limits can be changed easily. Observation-specific censoring points can also be supplied, using -Inf or Inf for those points that are not left- or right-censored, respectively.
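The observation-specific variant could be sketched as follows, again on simulated data with hypothetical names (z, nu_obs, detected, left_limit). As an assumption, the upper limits are treated as left-censoring points: left_limit is -Inf for detected values and equals the reported limit for the others. AER is not part of the default R installation, so the call is guarded.

```r
library(survival)  # tobit() fits the model through survreg()

set.seed(1)
n <- 320

# Hypothetical data, simulated purely for illustration
z        <- runif(n, 0, 4)
nu_true  <- 1 + 0.3 * z + rnorm(n, sd = 0.5)
limit    <- runif(n, 0.8, 1.8)
detected <- nu_true > limit
dat <- data.frame(z = z,
                  nu_obs = ifelse(detected, nu_true, limit),
                  detected = detected)

# Observation-specific left-censoring points:
# -Inf where the value was measured, the reported limit otherwise
left_limit <- ifelse(dat$detected, -Inf, dat$nu_obs)

if (requireNamespace("AER", quietly = TRUE)) {
  library(AER)
  fit_tobit <- tobit(nu_obs ~ z, left = left_limit, right = Inf, data = dat)
  print(summary(fit_tobit))
}
```

Under these assumptions the coefficients should match those from a direct survreg() call with a left-censored Surv() response, since tobit() only builds that call for you.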

Achim Zeileis