
Let $X$ be a normally-distributed random variable. How can I transform this $X \in \mathbb{R}$ into another r.v. $Y \in [0; 1]$ whilst maintaining its normal-like shape?

One common way of performing a one-to-one transformation from $\mathbb{R}$ into $[0; 1]$ is to calculate the CDF of the original variable. The "problem" with that approach is that the result will be uniformly-distributed. This is a well-known property, but here is an example to illustrate:

    set.seed(2345)
    x <- rnorm(1000)
    hist(x)
    y <- pnorm(x)
    hist(y)

The histograms of $x$ and $y$ will look like this: [Figure: histograms of x and y]

However, as I mentioned, I would like to transform $X \in \mathbb{R}$ into $Y \in [0; 1]$ such that $Y$ is still bell-shaped. How can I do that?

Waldir Leoncio
  • I'm probably misunderstanding something, but what are you trying to get? If $x$ is already normally distributed, what do you mean by "one-to-one transform into a normally-distributed $Y$"? You could put a `qnorm` on the runif, but that will only get you the original variable... – juod Feb 27 '17 at 10:08
  • @juod, I've slightly edited my question, I hope it will be clearer now. I want $Y$ to be bell-shaped and $\in (0, 1)$. – Waldir Leoncio Feb 27 '17 at 10:10
  • Your question (see title and first paragraph) still calls Y normal. It can't be normal if it's bounded. "Bell shaped" could be almost anything unimodal and more or less symmetric. What properties should this thing on (0,1) have? Is Beta(3,3) close enough, for example? If you don't actually need normality, what is it supposed to achieve? – Glen_b Feb 27 '17 at 10:26
  • It's not clear what you are trying to achieve. If you just want bell-shaped data in $(0,1)$, why not simply do `(x - min(x))/(max(x) - min(x))`? – Francis Feb 27 '17 at 10:41
  • @Glen_b, thanks for the tip, I've updated the title. I only care about $Y$ having a bell shape and being constrained between 0 and 1. This is important to me because I want to feed them as the probability parameters of different binomials. After generating one observation from each of those binomials, I should get a vector of results that resemble something normally-distributed. – Waldir Leoncio Feb 27 '17 at 10:50
  • @Francis, thank you for your input, your answer achieves what I want! I just need a minute to understand why. – Waldir Leoncio Feb 27 '17 at 10:52
  • @WaldirLeoncio it's in fact [simple re-scaling](https://en.wikipedia.org/wiki/Feature_scaling). – Francis Feb 27 '17 at 10:55
  • @Francis, thanks again. I could really use a simple solution. I had tried other scaling methods such as standardization, but re-scaling itself didn't cross my mind. I'd be glad to upvote your solution if you post it as an answer. – Waldir Leoncio Feb 27 '17 at 11:01
  • Although "bell-shaped" is merely qualitative, most people who use this term in statistics mean "normal." As such this question is unfathomable, because it appears to be asking how to transform a Normal variable into a Normal variable. To be answerable, it needs clarification and more precision. – whuber Feb 27 '17 at 14:18
  • @Waldir for a distribution over the probability parameter of a binomial (that info should be in your question), the beta distribution is a very common choice (in part because it's conjugate, which is convenient); I'll write an answer (which will also suggest other possibilities) if you clarify the question enough for it to re-open (its status won't change if you don't edit it). [Note that you can't linearly rescale a normal distribution to be in (0,1); you can normalize a set of data values into that range, of course.] – Glen_b Feb 27 '17 at 21:32
  • @whuber, thank you for your input. I've edited my question and I hope I've made it clear enough to warrant its reopening. The idea is that the transformation would be from a normally-distributed variable (spread over all the real line) into another variable which takes real values between 0 and 1 and has a similar shape to the first one. – Waldir Leoncio Feb 28 '17 at 07:53
  • @Glen_b, I hope my most recent edit has made the question clear enough to warrant a reopening. I've already solved the issue through rescaling (as proposed by Francis above), but I believe this page could be useful for someone else in the future. – Waldir Leoncio Feb 28 '17 at 07:55
  • How does rescaling take a variable on $(-\infty,\infty)$ to $(0,1)$? – Glen_b Feb 28 '17 at 09:16
  • While there are still some issues with the question, I think it's just about clear enough to give some reasonable answers, and so I have reopened it. Perhaps it can be further clarified. – Glen_b Feb 28 '17 at 09:19
  • @Glen_b, simple rescaling would apply `{x - min(x)} / {max(x) - min(x)}` to transform $x \in (-\infty, \infty)$ into $(0, 1)$. In my application, I've actually used `min - IQR` and `max + IQR` instead of `min` and `max` so I could have values a bit more concentrated around 0.5. – Waldir Leoncio Feb 28 '17 at 10:33
  • @Waldir If you're transforming a normal *variable*, $X$ (as you clearly state in your question), then there is no finite maximum value of $X$ (it is on the whole real line). If you're transforming a *sample* based on its sample characteristics (like the sample maximum and minimum), then the shape of the resulting distribution is not what you expect. For example in R try `res=replicate(10000,{x=rnorm(6);(x-min(x))/(max(x)-min(x))});hist(res,n=50)` to see what it does to samples of size 6. – Glen_b Feb 28 '17 at 11:14
  • Indeed, even if you just standardize by sample mean and standard deviation the result may not be distributed reasonably close to what you expect unless the sample sizes are moderately large; in small samples it has surprising effects. e.g. see `hist(replicate(10000,scale(rnorm(4))),n=50)` – Glen_b Feb 28 '17 at 11:20
  • @Glen_b, those are interesting results indeed. Thank you once again for your contribution! – Waldir Leoncio Feb 28 '17 at 11:32

3 Answers


As you correctly point out, $U=F_X(X)$ will be uniform. Suppose now that we have (somehow) identified some sufficiently "normal-like" continuous, strictly increasing distribution function on (0,1), which we'll call $G$.

Then $Y=G^{-1}(U)$ has distribution $G$, so $Y=G^{-1}(F_X(X))\sim G$.

So if we can find some suitable $G$, we're done.
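As a minimal sketch of this composition in R, assuming $X$ is standard normal and taking $G$ to be Beta(3, 3) (the shape parameters are an illustrative assumption, not something the question prescribes):

```r
set.seed(2345)
x <- rnorm(1000)          # X ~ N(0, 1), as in the question

u <- pnorm(x)             # U = F_X(X) is uniform on (0, 1)
y <- qbeta(u, 3, 3)       # Y = G^{-1}(U), with G assumed to be Beta(3, 3)

hist(y, breaks = 30)      # unimodal, roughly symmetric about 0.5, inside (0, 1)
```

Because both steps are strictly increasing, the ordering of the original values is preserved in `y`.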


Possible distributions with the kind of properties you want:

  1. The beta distribution. This is very often used as a distribution for the binomial parameter in Bayesian statistics. If you choose the two parameters to be approximately equal and larger than 2, then you have something that looks roughly normal. The inverse cdf is widely available.

  2. A suitably truncated normal. Rescale a normal so that it has mean near the middle of (0,1) and most of its probability in (0,1). For example, $N(\frac12,(\frac16)^2)$ truncated to the unit interval. The inverse cdf is fairly convenient and (if you don't need the tails of the original distribution) this can avoid the need to go via the uniform.

  3. Logit-normal, for some parameter values; to get those, rescale your normal to have a suitable parameter combination such that the logit-normal looks like you want. Typically you'll want $\mu$ not too far from $0$ and $\sigma<1$ (a value like $\frac12$ or $\frac13$ -- or less -- will often be reasonable). Incidentally I think your own answer is a particular case of this with $\sigma=1$. The inverse cdf is quite convenient and this also avoids the need to convert to uniform first.

  4. A standard Bates distribution. Take the mean of $k$ independent standard uniforms. The inverse cdf isn't particularly convenient if you're doing this from scratch, but if you can find an existing function to do this (or the related Irwin-Hall, followed by a rescaling), this may be convenient.

  5. Raised cosine. Specifically, the one with $f_Y(y) = [1+\cos(\pi(2y-1))]\,\mathbb{I}_{(0,1)}(y)$.
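A rough R sketch of items 2, 3 and 5 of the list, assuming $X$ is standard normal. The values of `m`, `s`, `mu` and `sigma` are illustrative choices, and `raised_cos_cdf` and `raised_cos_quantile` are hypothetical helper names; the raised-cosine CDF used here, $F(y) = y + \sin(\pi(2y-1))/(2\pi)$, has no closed-form inverse, so it is inverted numerically with `uniroot`:

```r
set.seed(2345)
x <- rnorm(1000)                      # assumed: X ~ N(0, 1)
u <- pnorm(x)                         # uniform on (0, 1)

# 2. Normal with mean 1/2 and sd 1/6, truncated to (0, 1), via its inverse CDF
m <- 0.5; s <- 1 / 6
p_lo <- pnorm(0, m, s)
p_hi <- pnorm(1, m, s)
y_tnorm <- qnorm(p_lo + u * (p_hi - p_lo), m, s)

# 3. Logit-normal: logistic function of a rescaled normal (no uniform step needed)
mu <- 0; sigma <- 0.5                 # illustrative values with sigma < 1
y_logitnorm <- plogis(mu + sigma * x)

# 5. Raised cosine on (0, 1): invert the CDF numerically
raised_cos_cdf <- function(y) y + sin(pi * (2 * y - 1)) / (2 * pi)
raised_cos_quantile <- function(p) {
  uniroot(function(y) raised_cos_cdf(y) - p,
          interval = c(0, 1), tol = 1e-10)$root
}
y_rcos <- vapply(u, raised_cos_quantile, numeric(1))

# Compare the three shapes side by side
op <- par(mfrow = c(1, 3))
hist(y_tnorm); hist(y_logitnorm); hist(y_rcos)
par(op)
```

The numerical inversion assumes `u` lies strictly between 0 and 1, which holds for samples like this one but could fail for extremely large values of `|x|`.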

There are many other fairly obvious choices. For example, a suitably truncated $t$ would allow you to push up the kurtosis a little: I think all of the previous examples have lower kurtosis than the normal, and in some circumstances you may want to get a bit closer to it. You could also consider scale mixtures.
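A truncated $t$ can be handled with the same inverse-cdf recipe. The sketch below assumes $X$ is standard normal; the degrees of freedom, location and scale are illustrative values only:

```r
set.seed(2345)
x <- rnorm(1000)                 # assumed: X ~ N(0, 1)
u <- pnorm(x)                    # uniform on (0, 1)

df <- 5                          # illustrative degrees of freedom
m <- 0.5; s <- 1 / 8             # illustrative location and scale

t_lo <- (0 - m) / s              # truncation points on the standard t scale
t_hi <- (1 - m) / s
p_lo <- pt(t_lo, df)
p_hi <- pt(t_hi, df)

# Inverse CDF of the location-scale t truncated to (0, 1)
y_t <- m + s * qt(p_lo + u * (p_hi - p_lo), df)
hist(y_t, breaks = 30)
```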

Glen_b

Thanks to Francis, I ended up doing a simple rescaling of the data in order to achieve what I wanted.

Scaling is achieved by performing the following transformation:

$$ y = \frac{x - \min(x)}{\max(x) - \min(x)} $$

This is a great solution for me because it preserves shape: the map is linear, so no matter what the histogram of my observed values looks like, the histogram of $y$ is the same one shifted and squeezed horizontally from the original range onto the unit interval. This works for me because I had bell-shaped values of $X$ that could fall anywhere on the real line and wanted a bell-shaped $Y$ which was $\in (0, 1)$.
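In R, this rescaling of a sample might be sketched as follows (`rescale01` is a hypothetical helper name, and the simulated `x` stands in for the real data):

```r
set.seed(2345)
x <- rnorm(1000)                   # example sample, as in the question

rescale01 <- function(x) {         # hypothetical helper name
  (x - min(x)) / (max(x) - min(x))
}

y <- rescale01(x)
range(y)                           # sample minimum maps to 0, sample maximum to 1
hist(y)                            # same shape as hist(x), on a new horizontal scale
```

Note that this uses the minimum and maximum of the observed sample, so the result depends on the sample at hand rather than on the distribution of $X$ itself.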


One drawback of the transformation above for my case was that $\min(x)$ becomes 0 and $\max(x)$ becomes 1 by definition. I wanted $Y$ to be a little more concentrated around 0.5, so the transformation I've actually used was this:

$$ y = \frac{x - [\min(x) - \text{IQR}(x)]}{[\max(x) + \text{IQR}(x)] - [\min(x) - \text{IQR}(x)]} = \frac{x - \min(x) + \text{IQR}(x)}{\max(x) - \min(x) + 2\text{IQR}(x)}, $$ where $\text{IQR}(x) = Q_3(x) - Q_1(x)$ is the interquartile range of $X$.

What happens here is that I'm giving the minimum and maximum values extra padding, thus widening the "range of the observed values of $X$" and, consequently, making the transformed values lie farther from the extremes 0 and 1.
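A short R sketch of the padded version, where `rescale_padded` and its `pad` argument are hypothetical names and `IQR()` uses R's default quantile definition:

```r
set.seed(2345)
x <- rnorm(1000)                                # example sample

rescale_padded <- function(x, pad = IQR(x)) {   # hypothetical helper name
  lo <- min(x) - pad                            # padded lower bound
  hi <- max(x) + pad                            # padded upper bound
  (x - lo) / (hi - lo)
}

y <- rescale_padded(x)
range(y)                                        # strictly inside (0, 1)
hist(y)
```

Setting `pad = 0` recovers the plain min-max rescaling, and a larger `pad` concentrates the values more tightly around the middle of the interval.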

Waldir Leoncio

One solution would be to use the CDF of a logistic distribution. In the example:

    set.seed(2345)
    x <- rnorm(1000)
    y <- 1 / (1 + exp(-x))  # standard logistic CDF (location 0, scale 1); same as plogis(x)

This should give the following histogram for $y$: [Figure: histogram of y]

At first this seems to answer my question, but I welcome other solutions very much, since this transformation may be too similar to another method I am using and comparing.

Waldir Leoncio
  • Why not just sample from the logistic distribution directly? – Neil G Feb 28 '17 at 11:04
  • @NeilG, because I need the original data for something else. $X$ in my case is a matrix of test answers (correct/incorrect per individual and question); I am exploring several pre-smoothing techniques for this matrix. – Waldir Leoncio Feb 28 '17 at 11:30
  • Then $X$ isn't normally distributed since it's bounded? – Neil G Feb 28 '17 at 12:02
  • @NeilG, sorry for misleading you, $X$ is actually the skills of the test takers, which is modeled as a normally-distributed latent trait. From that I am trying to derive the probability of acing a test ($Y$). – Waldir Leoncio Feb 28 '17 at 12:20