
I am trying to understand the equivalence between the logistic function and the softmax function, when k = 2. My understanding is that the following code should output the same values. Where is my mistake?

import numpy as np

def sigmoid(x):
    # Logistic function: 1 / (1 + e^(-x)), applied elementwise.
    x = np.asarray(x, dtype=float)  # np.float was removed in NumPy 1.24; use float
    x = 1 / (1 + np.exp(-x))
    return x

def softmax(x):
    # Exponentiate and normalize so the outputs sum to 1.
    x = np.asarray(x, dtype=float)
    x = np.exp(x)/np.sum(np.exp(x))
    return x

x = .5
b1 = 1
b2 = -1
sigmoid(x*b1), sigmoid(x*b2), softmax([x*(b1-b2), x*(b2-b2)])
(0.6224593312018546, 0.3775406687981454, array([0.73105858, 0.26894142]))

Is it because I am assuming that the softmax optimization would find the same betas?

RV1994
  • Also: https://stats.stackexchange.com/questions/233658/softmax-vs-sigmoid-function-in-logistic-classifier/254071#254071 – Sycorax Sep 04 '19 at 12:42
  • I see no reason why these two functions should return the same values, because they appear to use different denominators. Could you explain why you think they are equivalent? – whuber Sep 04 '19 at 13:13
  • @whuber The two links above show why the two are theoretically linked, and the answer below shows what is the relationship between the coefficients that I was missing. – RV1994 Sep 04 '19 at 13:58
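As a quick numerical check of the identity the linked posts describe: a two-class softmax evaluated at logits (z, 0) returns exactly (sigmoid(z), 1 - sigmoid(z)), so the question's softmax output matches sigmoid evaluated at x*(b1 - b2), not at x*b1. A minimal, self-contained sketch:

    import numpy as np

    z = 0.5 * (1 - (-1))           # x * (b1 - b2) from the question
    logits = np.array([z, 0.0])    # the second logit is x * (b2 - b2) = 0
    p_softmax = np.exp(logits) / np.exp(logits).sum()
    p_sigmoid = 1 / (1 + np.exp(-z))
    print(p_softmax)               # [0.73105858 0.26894142]
    print(p_sigmoid)               # 0.7310585786300049, equal to p_softmax[0]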

1 Answer


For two-class logistic regression and the softmax to give the same values, the coefficients must satisfy

beta (logistic regression) = -(beta1 - beta2)
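To see why, divide the two-class softmax through by $e^{x\beta_1}$; the probability of class 1 collapses to a logistic function of the coefficient difference:

$$P(y = 1 \mid x) = \frac{e^{x\beta_1}}{e^{x\beta_1} + e^{x\beta_2}} = \frac{1}{1 + e^{-x(\beta_1 - \beta_2)}} = \operatorname{sigmoid}\big(x(\beta_1 - \beta_2)\big)$$

Only the difference $\beta_1 - \beta_2$ is identified; the sign in the relation above just depends on which class the logistic model treats as the positive one.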

The following code gives matching numbers:

    import numpy as np

    def sigmoid1(x):
        # Logistic function: 1 / (1 + e^(-x)).
        x = np.asarray(x, dtype=float)  # np.float was removed in NumPy 1.24; use float
        x = 1 / (1 + np.exp(-x))
        return x

    def sigmoid0(x):
        # Complement of the logistic function: sigmoid0(x) = 1 - sigmoid1(x).
        x = np.asarray(x, dtype=float)
        x = np.exp(-x) / (1 + np.exp(-x))
        return x

    def softmax(x):
        # Exponentiate and normalize so the outputs sum to 1.
        x = np.asarray(x, dtype=float)
        x = np.exp(x)/np.sum(np.exp(x))
        return x

    x = .5
    b = -2              # b = -(b1 - b2)
    b1 = 1
    b2 = -1
    sigmoid0(x*b), sigmoid1(x*b), softmax([x*b1, x*b2])

(0.7310585786300049, 0.2689414213699951, array([0.73105858, 0.26894142]))
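On the follow-up in the question: a fitted softmax need not recover the logistic beta itself, because the softmax coefficients are only identified up to a common additive shift; only the difference b1 - b2 is pinned down. A minimal sketch of that shift invariance, using the same softmax as above with hypothetical coefficient values:

    import numpy as np

    def softmax(z):
        z = np.asarray(z, dtype=float)
        return np.exp(z) / np.exp(z).sum()

    x = .5
    print(softmax([x * 1, x * -1]))   # coefficients (1, -1)
    print(softmax([x * 3, x * 1]))    # both coefficients shifted by +2: identical output
    # both lines print [0.73105858 0.26894142]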
Anant Gupta