
Is it possible to invert the softmax function in order to recover the original values $x_i$?

$$S_i=\frac{e^{x_i}}{\sum_j e^{x_j}}$$

In the case of 3 input variables, this problem boils down to finding $a$, $b$, and $c$ given $x$, $y$, and $z$:

$$\begin{cases} \frac{a}{a+b+c} &= x \\ \frac{b}{a+b+c} &= y \\ \frac{c}{a+b+c} &= z \end{cases}$$

Is this problem solvable?

jojek

2 Answers


Note that in your three equations you must have $x+y+z=1$. The general solution to your three equations is $a=kx$, $b=ky$, and $c=kz$, where $k$ is any scalar.

So if you want to recover $x_i$ from $S_i$, you would note $\sum_i S_i = 1$ which gives the solution $x_i = \log (S_i) + c$ for all $i$, for some constant $c$.
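
To make this concrete, here is a minimal numeric sketch (in Python with NumPy, which is my choice and not part of the question or answer) showing that $\log S_i$ recovers the inputs up to a common shift:

```python
import numpy as np

def softmax(x):
    # Shift by the max for numerical stability; the output is unchanged.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
S = softmax(x)

# log(S_i) = x_i - log(sum_j exp(x_j)), so x_i - log(S_i) is the same
# constant for every i (here log(e^1 + e^2 + e^3) ≈ 3.4076).
print(x - np.log(S))
```

Every entry of the printed difference equals $\log \sum_j e^{x_j}$, which plays the role of the constant $c$ above.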

angryavian
  • So it's solvable up to a constant. Thank you! – jojek May 18 '18 at 17:39
  • Which constant $c$ should I use? Is there any way of calculating it? – Joel Carneiro Feb 07 '19 at 17:16
  • @JoelCarneiro Any $c$ will work; the solution is not unique. – angryavian Feb 07 '19 at 17:58
  • Any $c$ will work; one choice: if you augment the $x_i$ vector like $(0, x_1,...,x_n)$, this will induce a particular $c$. Note that the corresponding log-sum-exp -- the gradient of which is the softmax -- would also be convex (https://en.wikipedia.org/wiki/LogSumExp). – Josh Albert Jul 29 '19 at 10:59
  • In case anybody like me spends too much time figuring out $c$: if you know your 3 input variables have to sum to 1, then $c = (1 - \log(x \cdot y \cdot z))/3$. – Rasmus Ø. Pedersen Sep 14 '20 at 13:04
  • Usually they normalize the $x_i$ to sum to zero, in which case the inverse softmax would be $\log(x) - \operatorname{mean}(\log(x))$. Or they code the first $x_i$ to be zero, in which case the inverse softmax would be $\log(x) - \log(x_1)$. – Tom Wenseleers Aug 24 '22 at 20:12
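
The comments above suggest three different conventions for fixing the constant. A small Python sketch (NumPy and the variable names are my own, for illustration only) of how each one can be computed from a softmax output:

```python
import numpy as np

S = np.array([0.2, 0.3, 0.5])   # a softmax output, so it sums to 1
log_S = np.log(S)

# Convention 1: require the recovered inputs to sum to 1.
# For n inputs, c = (1 - sum_i log(S_i)) / n, which for n = 3 is the
# (1 - log(x*y*z)) / 3 from the comment above.
c = (1 - log_S.sum()) / len(S)
x_sum_one = log_S + c
print(x_sum_one.sum())      # 1.0

# Convention 2: require the recovered inputs to sum to zero (zero mean).
x_zero_mean = log_S - log_S.mean()
print(x_zero_mean.mean())   # 0.0 (up to floating-point error)

# Convention 3: fix the first recovered input to zero.
x_first_zero = log_S - log_S[0]
print(x_first_zero[0])      # 0.0
```

All three recovered vectors differ only by a constant shift, so each maps back to the same $S$ under softmax.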

The softmax function is defined as:

$$S_i = \frac{\exp(x_i)}{\sum_{j} \exp(x_j)}$$

Taking the natural logarithm of both sides:

$$\ln(S_i) = x_i - \ln\left(\sum_{j} \exp(x_j)\right)$$

Rearranging the equation:

$$x_i = \ln(S_i) + \ln\left(\sum_{j} \exp(x_j)\right)$$

The second term on the right-hand side is the same for every $i$ and can be written as a constant $C$. Therefore, we can write:

$$x_i = \ln(S_i) + C$$
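
As a quick check on why $C$ cannot be pinned down from $S$ alone, note that adding the same constant to every input leaves the softmax unchanged, since the factor $e^{C}$ cancels:

$$\frac{\exp(x_i + C)}{\sum_j \exp(x_j + C)} = \frac{e^{C}\,\exp(x_i)}{e^{C}\sum_j \exp(x_j)} = \frac{\exp(x_i)}{\sum_j \exp(x_j)} = S_i$$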

This answer is adapted from this post on Reddit.