I have some data where the response variable is a proportion, and I am experimenting with transformation using Tukey's family of folded powers, $f(p) = p^\lambda - (1 - p)^\lambda$, with values of $\lambda$ from 0 to 1.

Folded roots are nicely described by @Nick Cox here: https://stats.stackexchange.com/a/195305/212689

and @whuber here: https://stats.stackexchange.com/a/10979/212689.

Trying to get my head round this, I have a couple of questions:

  1. @whuber states that 'When $\lambda = 1/2$ we get the folded root, or "froot," $f(p) = \sqrt{1/2}\left(\sqrt{p} - \sqrt{1-p}\right)$.' Assuming that this is the same use of $\lambda$ as in the equation I give above, I'm struggling to see how to rearrange to get $\sqrt{1/2}$ at the beginning of the equation. Can anyone explain?
  2. How do I back-transform from $f(p)$ to get back to $p$? Again, I'm struggling with the maths (especially as $(1 - p)^\lambda$ is a never-ending binomial expansion)!
Izy
  • 2
    For the answer to (1), please use the formula I gave in my answer rather than the (not quite correct) formula you quote at the outset. For (2), use the fact that $\lambda$ typically is a small integer or a reciprocal of a small integer. – whuber Mar 28 '19 at 15:55
  • @whuber, thanks for the reply. (1) folded root of p = root of p - root of (1 - p) is the definition given at https://www.stata.com/users/njc/topichlp/transint.hlp . I also read Nick Cox's version in the linked question above to be the formula as I gave it (he also quotes Tukey). I haven't been able to find a copy of Tukey's book 'Exploratory Data Analysis' online to check there yet. Is stata.com wrong? – Izy Mar 28 '19 at 16:51
  • (2) I'm afraid I still can't see where to start on the maths for this. I am looking for an equation for back transformation (I haven't been able to find one whereas they are widely explained for common transformations like logit). I would definitely appreciate a step by step run through of the maths too. – Izy Mar 28 '19 at 16:55
  • It makes little sense to ask why my formula isn't the same as somebody else's! In my answer I explain that this is *Tukey's* formula and I provide the reason why the coefficient is what it is. To understand the mathematics, start with a simple case such as $\lambda=1$ or $\lambda=1/2$ or even $\lambda=0$ (the logarithm). – whuber Mar 28 '19 at 18:05
  • @whuber, I'm still new here and this is my first time asking a question so apologies if I'm not following the etiquette correctly. I think both you and Nick Cox refer to the same Tukey reference in your respective answers, and both of you call it the folded root, but from my understanding you give different equations for it. So I was asking if my understanding of Nick Cox's equation and the stata.com definition was wrong, or if you and Nick Cox are calling something different the same thing (folded root), or if they are in fact equivalent? – Izy Mar 28 '19 at 23:13
  • 2
    There is a subtle difference between @whuber's presentation and mine -- which corresponds to different versions in Tukey's work. The basic idea of folded transformations for proportions $p$ between $0$ and $1$ is to use a transformation $g()$ in folded manner, as $g(p) - g(1 - p)$. For example, $g(p)$ might be $\ln p$, $\sqrt p$, etc. In all cases, the result is $0$ at $p = 0.5$, positive for $p > 0.5$ and negative for $p < 0.5$. Tukey sometimes refines this to ensure not only identical values at $p = 0.5$, but also identical slopes at that point. The refinement is to divide by $2g'(1/2)$. – Nick Cox Mar 28 '19 at 23:49
  • 3
    As far as I can recall, Tukey just uses examples of the general rule for such "matching" but @whuber neatly gave the general recipe. I omit this divisor for simplicity, as did Tukey much of the time and have many other expositors, but there is no fundamental objection to it, and it certainly allows emphasis on the transformations as being a close family. – Nick Cox Mar 28 '19 at 23:53
  • There is a more up-to-date version of transint.hlp at http://fmwww.bc.edu/repec/bocode/t/transint.html but this particular nuance is not discussed there or in the previous version you cite. – Nick Cox Mar 28 '19 at 23:59
  • Thanks a lot @Nick Cox, that helps to clear that one up for me (so I'm not going mad after all!). I think I have rearranged the back transformation to get p(1-p)=((1-2(f(p)^2))/2)^2, if that's right then I'll see if I can make the final step in the morning. – Izy Mar 29 '19 at 00:16
  • "In all cases" in my first comment wouldn't satisfy a rigorist as it's easy to imagine functions for which the folded transformation behaves differently from what is stated. "In all cases" can fairly mean as discussed in this context. – Nick Cox Mar 29 '19 at 00:20
  • Note that any copies of _Exploratory Data Analysis_ online would be unauthorised. It seems that Pearson, who swallowed the original publishers Addison-Wesley, are bringing out a reprint with publication date 2020, which perhaps means very soon given publishers' habits. – Nick Cox Mar 29 '19 at 00:30

2 Answers


Okay, so with help from @whuber and @Nick Cox (thank you!), I think I can now answer this for a folded root with $\lambda=(1/2)$.

  1. a. @Nick Cox gives the formula $f(p) = p^\lambda - (1 - p)^\lambda$ for the folded root.

    b. @whuber gives the formula $f(p) = \sqrt{1/2}\left(\sqrt{p} - \sqrt{1-p}\right)$ for the folded root where $\lambda=(1/2)$.

These are different, but both are versions used by Tukey. I think that both versions are antisymmetric about $p=0.5$ (that is, $f(1-p) = -f(p)$), and $f(0.5)=0$ for both. I plotted a quick graph of $f(p)$ against $p$ for each version:

[Figure: $f(p)$ against $p$ for the two versions of the folded root formula]
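A comparison plot like this could be reproduced roughly as follows. This is only a sketch, not the code used for the original figure: the names `f_nc` and `f_whuber` are labels chosen here for the unscaled and scaled versions, and the slope check at $p = 0.5$ is an added illustration of the "matching slopes" point made in the comments.

```r
# Two versions of the folded root (lambda = 1/2); the names are illustrative.
f_nc     <- function(p) sqrt(p) - sqrt(1 - p)                   # unscaled
f_whuber <- function(p) sqrt(1 / 2) * (sqrt(p) - sqrt(1 - p))   # scaled by sqrt(1/2)

p <- seq(0, 1, by = 0.01)
plot(p, f_nc(p), type = "l", xlab = "p", ylab = "f(p)")
lines(p, f_whuber(p), lty = 2)
abline(h = 0, v = 0.5, col = "grey")
legend("topleft", legend = c("unscaled", "scaled by sqrt(1/2)"), lty = 1:2)

# Central-difference estimate of the slope at p = 0.5 for each version.
h <- 1e-6
slope_nc     <- (f_nc(0.5 + h) - f_nc(0.5 - h)) / (2 * h)
slope_whuber <- (f_whuber(0.5 + h) - f_whuber(0.5 - h)) / (2 * h)
print(c(slope_nc, slope_whuber))
```

Both curves pass through $0$ at $p = 0.5$; the scaling only changes the slope there, which is $\sqrt{2}$ for the unscaled version and $1$ for the scaled one.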

  2. I had to go about this one by writing out the algebra, squaring both sides of the equation and rearranging. Eventually I get down to a quadratic equation in $p$, $p - p^2 - c = 0$, so $a=-1$ and $b=1$ in the quadratic formula.

a. For @Nick Cox's formula, $c=((1-(f(p))^2)/2)^2$.

b. For @whuber's formula, $c=((1-2(f(p))^2)/2)^2$.

The quadratic formula gives us two possible solutions (using $-c$ as the constant term, so that $b^2-4ac$ evaluates to $1-4c$). If $f(p)$ is negative, then $p<0.5$, and we want the solution $(-b+\sqrt{b^2-4ac})/(2a)$. If $f(p)$ is positive, then $p>0.5$, and we want the solution $(-b-\sqrt{b^2-4ac})/(2a)$ (I'm not sure if there's a theoretical/mathematical reason for why it works out that way round?).
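One possible way to see why the roots pair up this way, worked through for the unscaled version (a) and writing $f$ for $f(p)$. This is a sketch of the algebra described above, not a quotation from any of the linked answers:

$$
\begin{aligned}
f &= \sqrt{p} - \sqrt{1-p} \\
f^2 &= 1 - 2\sqrt{p(1-p)} \\
p(1-p) &= \left(\frac{1-f^2}{2}\right)^2 = c \\
p &= \frac{1 \pm \sqrt{1-4c}}{2} = \frac{1 \pm |f|\sqrt{2-f^2}}{2}
\end{aligned}
$$

The last step uses $1 - 4c = 1 - (1-f^2)^2 = f^2(2-f^2)$. Squaring throws away the sign of $f$, so the quadratic cannot distinguish $p$ from $1-p$ and returns both. Putting the sign back gives $p = \tfrac{1}{2}\left(1 + f\sqrt{2-f^2}\right)$, which is below $0.5$ exactly when $f<0$. Because $a=-1$, dividing by $2a$ flips the sign, which is why the "$+$" branch of the quadratic formula yields the smaller root. For the scaled version (b), replacing $f$ by $\sqrt{2}\,f$ gives $p = \tfrac{1}{2} + f\sqrt{1-f^2}$.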

There may be a simpler/more elegant way to do this; if so, I would love to hear what it is. In particular, I am still struggling to generalise this to less simple cases, e.g. $\lambda=2/3$, where I'm stuck at multiplying out the cubed bracket and trying to simplify from there.
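For $\lambda = 1/2$ specifically, substituting $c$ into the quadratic and simplifying appears to collapse the back-transformation to one line per version. The sketch below checks this numerically with a round trip; the function names `inv_nc` and `inv_whuber` are made up for this illustration.

```r
# Closed-form back-transformations for lambda = 1/2 (illustrative names).
# From p - p^2 - c = 0: p = (1 +/- sqrt(1 - 4c)) / 2, and the sign of f
# selects the root (f < 0 gives p < 0.5).
inv_nc     <- function(f) (1 + f * sqrt(2 - f^2)) / 2   # f = sqrt(p) - sqrt(1 - p)
inv_whuber <- function(f) 1 / 2 + f * sqrt(1 - f^2)     # f scaled by sqrt(1/2)

# Round trip: transform a grid of proportions, then back-transform.
p <- seq(0, 1, by = 0.05)
err_nc     <- max(abs(inv_nc(sqrt(p) - sqrt(1 - p)) - p))
err_whuber <- max(abs(inv_whuber(sqrt(1 / 2) * (sqrt(p) - sqrt(1 - p))) - p))
print(c(err_nc, err_whuber))
```

Both functions are vectorised, so they can be applied directly to a vector of estimates or confidence limits on the transformed scale.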

Izy
  • 1
    (+1) Generally, there is no neat formula. One would use numerical methods to perform the inverse transformation. – whuber Mar 29 '19 at 13:51
  • Okay, thanks @whuber. So in that case it's sensible to choose values of lambda, like 1/2, for which back-transformation is (relatively) numerically straightforward, at least in my case where I would like to back-transform confidence intervals. – Izy Mar 29 '19 at 14:17

Adding to the other excellent answer, here I show a solution for general $\lambda$, which has to be numerical. First, I define the folded power as $$ f(p) = \frac{p^\lambda - (1-p)^\lambda}{\lambda} $$ in analogy with the definition of the Box-Cox transformation as $\frac{y^\lambda - 1}{\lambda}$. This gives a useful limit when $\lambda \to 0$, namely the logit function $\text{logit}(p)=\log(p)-\log(1-p)$. One can check that the folded power so defined is always monotone increasing.

The practical way of computing the inverse is numerical. Here is some simple R code:

 f <- function(x, lambda) (x^lambda - (1 - x)^lambda) / lambda
 # Inverse by root-finding on [0, 1]; extra arguments (e.g. tol) are passed to uniroot()
 f_inv <- function(q, lambda, ...) uniroot(function(x) f(x, lambda) - q, interval = c(0, 1), extendInt = "no", ...)$root
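A short usage sketch of the two functions above, under a few assumptions: $\lambda = 2/3$ is just an example value, the definitions are repeated so the block runs on its own, and a tighter `tol` than `uniroot()`'s default is passed through `...`.

```r
# Definitions repeated from the answer so this block is self-contained.
f <- function(x, lambda) (x^lambda - (1 - x)^lambda) / lambda
f_inv <- function(q, lambda, ...) {
  uniroot(function(x) f(x, lambda) - q, interval = c(0, 1),
          extendInt = "no", ...)$root
}

lambda <- 2 / 3
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- f(p, lambda)

# uniroot() solves for one value at a time, so apply f_inv element-wise.
p_back <- vapply(q, f_inv, numeric(1), lambda = lambda, tol = 1e-10)
max_err <- max(abs(p_back - p))
print(max_err)
```

With the default `tol` the recovered proportions are only accurate to roughly four decimal places, which may or may not matter when back-transforming confidence limits.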
kjetil b halvorsen
  • 1
    Great! Your definition of the folded power gives a more extreme transformation at λ=(1/2) (more similar to the logit but straighter for moderately small/large values of p) than the definitions used by whuber and Nick Cox. – Izy Apr 01 '19 at 13:22
  • 1
    A modification of the R code works for @Nick Cox's version of the folded root: # f_nc – Izy Apr 01 '19 at 13:24
  • 1
    But it doesn't seem to work for @whuber's version of the folded root, which I've coded into R as # f_whuber – Izy Apr 01 '19 at 13:31
  • Correction - a 'gda' crept in on the end of the 'root' in my second comment above, which is the only reason that it wasn't working. So the back-transformation does work for whuber's version of the folded root as well, although it breaks for p=0 and p=1 if the value of the folded root has been rounded down/up. Thanks @kjetil b halvorsen, really nice! – Izy Apr 01 '19 at 16:17
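The code from the comments above did not survive, so what follows is not a reconstruction of it. It is a hypothetical sketch of how the same root-finding idea might be applied to the two $\lambda = 1/2$ versions, with one possible workaround for the endpoint problem: clamping the transformed value to the attainable range before calling `uniroot()`. The names `f_nc`, `f_whuber` and `inv_folded` are invented for this example.

```r
# Two folded-root variants discussed in the thread (lambda = 1/2).
f_nc     <- function(p) sqrt(p) - sqrt(1 - p)
f_whuber <- function(p) sqrt(1 / 2) * (sqrt(p) - sqrt(1 - p))

# Hypothetical helper: numerical inverse of an increasing transform on [0, 1].
# q is clamped to [fun(0), fun(1)] so that a value rounded just past an
# endpoint maps to p = 0 or p = 1 instead of stopping with an error.
inv_folded <- function(q, fun, tol = 1e-10) {
  q <- min(max(q, fun(0)), fun(1))
  uniroot(function(p) fun(p) - q, interval = c(0, 1), tol = tol)$root
}

p_nc   <- inv_folded(f_nc(0.3), f_nc)
p_wh   <- inv_folded(f_whuber(0.3), f_whuber)
p_edge <- inv_folded(-1.0001, f_nc)   # slightly below f_nc(0) = -1
print(c(p_nc, p_wh, p_edge))
```

Silent clamping is a design choice that can hide genuinely out-of-range inputs, so a warning when the clamp is triggered might be preferable in real use.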