
I have a multi-task learning model with two binary classification tasks. One part of the model creates a shared feature representation that is fed into two subnets in parallel. At the moment the loss function for each subnet is NLL, with a Softmax layer at the end of each.
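For concreteness, the model is shaped roughly like this (a minimal sketch, assuming PyTorch; the layer sizes and names are placeholders, not my actual code):

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim=128, hidden=64):
        super().__init__()
        # Shared trunk producing the common feature representation
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # Two parallel heads, one per binary task
        # (LogSoftmax here because NLLLoss expects log-probabilities)
        self.head_a = nn.Sequential(nn.Linear(hidden, 2), nn.LogSoftmax(dim=1))
        self.head_b = nn.Sequential(nn.Linear(hidden, 2), nn.LogSoftmax(dim=1))

    def forward(self, x):
        z = self.shared(x)
        return self.head_a(z), self.head_b(z)
```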

I want to maximise the entropy of the predictions for one of the tasks, so that the model doesn't/can't learn anything about that task; the resulting accuracy for that task should then be 50%, i.e. no better than chance.
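To spell out my reasoning (using the standard definition of binary entropy for a predicted probability p):

$$H(p) = -p \log p - (1 - p)\log(1 - p), \qquad \frac{dH}{dp} = \log\frac{1 - p}{p} = 0 \;\Rightarrow\; p = \tfrac{1}{2},$$

so entropy is maximal when the head predicts 0.5 for every sample, at which point the prediction carries no information about the label and accuracy should sit at chance.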

Since I'm not an expert, I've been advised (rightly or wrongly, I haven't decided yet) that one way to do this is to use the cost function from a GAN discriminator to maximise the entropy for this particular task.

I've found the discriminator loss function in Goodfellow's GAN tutorial, where it looks like the y labels have been replaced with 0.5. I'm using Torch and I can't see how this could be implemented, since I had presumed the y label for each sample could only be 0 or 1 when using the binary cross-entropy loss function.
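To make the question concrete, here is what I imagine the "0.5 labels" version would look like (a sketch assuming PyTorch, whose binary_cross_entropy does accept soft targets in [0, 1]; I haven't convinced myself this is theoretically the right thing to do, which is exactly my question):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, requires_grad=True)  # stand-in for one subnet's raw outputs
probs = torch.sigmoid(logits)                # predicted P(y = 1) for each sample
targets = torch.full_like(probs, 0.5)        # every hard 0/1 label replaced by 0.5
loss = F.binary_cross_entropy(probs, targets)
loss.backward()                              # gradients push probs toward 0.5
# Cross-entropy against the uniform target is minimal (= log 2) at probs == 0.5,
# i.e. minimising it drives this head toward maximum predictive entropy.
```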

So far, scouring the web hasn't been much help. Can anyone shed some light on this? I'm asking not so much about the Torch implementation as about the theory.

Thanks!

JM1982
