Questions tagged [truncation]

Truncation is a process that results in the omission of data that are beyond a threshold.

Truncation is a process that omits from a dataset any data that fall beyond a threshold. It is related to, but distinct from, censoring and missing data. Censored observations are listed in the dataset, but their values are only partially known; e.g., we might know only that a value is $>t$. In a truncated dataset, by contrast, values $>t$ are not included at all. With missing values, we know that an observation exists but do not know its value for some variable, whereas a truncated observation is not recorded at all. A further distinction is between data that are absent from our dataset but assumed to exist (truncation) and data that cannot exist for some reason.

An example of truncated data is the set of scores for participants in a program that admits only people whose scores are beyond some cutoff.
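
The distinctions above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the threshold `t = 1.0`, the simulated standard normal values, the 10% missingness rate, and the variable names. It is not taken from any of the questions listed below.

```python
import random

rng = random.Random(0)          # fixed seed so the sketch is reproducible
t = 1.0                         # hypothetical threshold
values = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # full set of values, before any loss

# Truncation: observations above t never enter the dataset.
truncated = [v for v in values if v <= t]

# Censoring: every observation is listed, but a value above t is recorded
# only as "greater than t" (stored here as the pair (t, True)).
censored = [(t, True) if v > t else (v, False) for v in values]

# Missing data: every observation is listed, but some values are unknown.
# Missingness here is random and unrelated to t (an illustrative choice).
missing = [None if rng.random() < 0.1 else v for v in values]

print(len(values), len(truncated), len(censored), len(missing))
```

The truncated list is shorter than the original, while the censored and missing versions keep one entry per observation; that difference in what is recorded is what separates the three cases.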

147 questions
37
votes
2 answers

What is the difference between censoring and truncation?

In the book Statistical Models and Methods for Lifetime Data, it is written: Censoring: When an observation is incomplete due to some random cause. Truncation: When the incomplete nature of the observation is due to a systematic selection process…
ABC
  • 1,367
  • 3
  • 13
  • 31
35
votes
1 answer

Maximum likelihood estimators for a truncated distribution

Consider $N$ independent samples $S$ obtained from a random variable $X$ that is assumed to follow a truncated distribution (e.g. a truncated normal distribution) of known (finite) minimum and maximum values $a$ and $b$ but of unknown parameters…
33
votes
5 answers

What are the relative merits of Winsorizing vs. Trimming data?

Winsorizing data means to replace the extreme values of a data set with a certain percentile value from each end, while Trimming or Truncating involves removing those extreme values. I always see both methods discussed as a viable option to lessen…
Brian
  • 551
  • 1
  • 5
  • 8
18
votes
1 answer

Are truncated numbers from a random number generator still 'random'?

Here 'truncating' implies reducing precision of the random numbers and not truncating the series of random numbers. For example, if I have $n$ truly random numbers (drawn from any distribution, e.g., normal, uniform, etc.) with arbitrary precision…
steadyfish
  • 1,772
  • 2
  • 15
  • 30
17
votes
3 answers

Simulate constrained normal on lower or upper bound in R

I'd like to generate random data from a constrained normal distribution using R. For example I might want to simulate a variable from a normal distribution with mean=3, sd= 2 and any values larger than 5 are resampled from the same normal…
Jeromy Anglim
  • 42,044
  • 23
  • 146
  • 250
15
votes
3 answers

What does truncated distribution mean?

In a research article about sensitivity analysis of an ordinary differential equation model of a dynamic system, the author provided the distribution of a model parameter as Normal distribution (mean=1e-4, std=3e-5) truncated to the range [0.5e-4…
Kavka
  • 453
  • 3
  • 10
14
votes
4 answers

R/Stata package for zero-truncated negative binomial GEE?

This is my first post. I'm truly grateful for this community. I am trying to analyze longitudinal count data that is zero-truncated (probability that response variable = 0 is 0), and the mean != variance, so a negative binomial distribution was…
Iris Tsui
  • 681
  • 4
  • 14
12
votes
2 answers

Censoring/Truncation in JAGS

I have a question on how to fit a censoring problem in JAGS. I observe a bivariate mixture normal where the X values have measurement error. I would like to model the true underlying 'means' of the observed censored values. \begin{align*} \lceil…
Glen
  • 6,320
  • 4
  • 37
  • 59
12
votes
2 answers

Efficiently sampling a thresholded Beta distribution

How should I efficiently sample from the following distribution? $$ x \sim B(\alpha, \beta),\space x > k $$ If $k$ is not too big then rejection sampling may be the best approach, but I am not sure how to proceed when $k$ is large. Perhaps there is…
user1502040
  • 291
  • 2
  • 7
11
votes
2 answers

How should I model a continuous dependent variable in the $[0, \infty]$ range?

I have a dependent variable that can range from 0 to infinity, with 0s actually being correct observations. I understand censoring and Tobit models only apply when the actual value of $Y$ is partially unknown or missing, in which case data is said…
Robert Kubrick
  • 4,078
  • 8
  • 38
  • 55
10
votes
1 answer

What is the Fisher information for the truncated Poisson distribution?

The zero-truncated Poisson distribution has probability mass function: $$P(X=k) = \frac{e^{-\lambda}\lambda^k}{(1-e^{-\lambda})k!}, \qquad k=1,2,\ldots$$ And the expectation of the truncated Poisson distribution via MLE is given as…
10
votes
1 answer

How to derive the mean and variance of a $k$-truncated Poisson?

How can I derive the mean and variance of a $k$-truncated Poisson? Here, $k$ is the cutoff value such that only values strictly larger than $k$ are allowed, i.e. the probability mass function is $$p_j = q_k^{-1} \lambda^j e^{-\lambda}/j!, \qquad…
cath
  • 101
  • 1
  • 3
9
votes
1 answer

Properties of bivariate standard normal and implied conditional probability in the Roy model

Sorry for the long title, but my problem is quite specific and hard to explain in one title. I am currently learning about the Roy Model (treatment effect analysis). There is one derivation step in my slides that I do not understand. We calculate…
Ivanov
  • 240
  • 2
  • 7
9
votes
2 answers

How to calculate the truncated or trimmed mean?

How can I calculate the truncated or trimmed mean? Let's say truncated by 10%? I can imagine how to do it if you have 10 entries or so, but how can I do it for a lot of entries?
Queops
  • 461
  • 1
  • 4
  • 9
9
votes
2 answers

Is sampling from a folded normal distribution equivalent to sampling from a normal distribution truncated at 0?

I wish to simulate from a normal density (say mean=1, sd=1) but only want positive values. One way is to simulate from a normal and take the absolute value. I think of this as a folded normal. I see in R there are functions for truncated random…
Glen
  • 6,320
  • 4
  • 37
  • 59