Binary Hinge Loss. Choosing best model

Asked May 30 '17 at 14:47


I was reviewing the answers to the question "Gradient of Hinge Loss" and noticed that one of them proposed returning the averaged weights from the function:

def grad_descent():
    ...
    return np.sum(ws, 1) / np.size(ws, 1)

I'm curious whether there is any special reason for averaging the weights. Why not pick the best weights by evaluating on a validation set after each epoch?
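To make the comparison concrete, here is a minimal sketch of both strategies side by side. It is not the code from the linked answer: the helper names (`hinge_subgradient`, `accuracy`, `train`), the L2 penalty `lam`, the learning rate, and the layout of `ws` (one column of weights per epoch, so that `np.sum(ws, 1) / np.size(ws, 1)` averages over epochs) are all my assumptions. Labels are assumed to be in {-1, +1}.

```python
import numpy as np

def hinge_subgradient(w, X, y, lam=0.01):
    # Subgradient of lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w))).
    margins = y * (X @ w)
    active = margins < 1  # samples that violate the margin
    return lam * w - (X[active].T @ y[active]) / len(y)

def accuracy(w, X, y):
    # Fraction of samples whose predicted sign matches the label.
    return np.mean(np.sign(X @ w) == y)

def train(X_tr, y_tr, X_val, y_val, epochs=50, lr=0.1):
    w = np.zeros(X_tr.shape[1])
    ws = np.zeros((X_tr.shape[1], epochs))  # assumed layout: one column per epoch
    for t in range(epochs):
        w = w - lr * hinge_subgradient(w, X_tr, y_tr)
        ws[:, t] = w

    # Strategy 1: average the weights over all epochs (as in the quoted return).
    w_avg = np.sum(ws, 1) / np.size(ws, 1)  # equivalent to ws.mean(axis=1)

    # Strategy 2: keep the epoch whose weights score best on the validation set.
    val_acc = [accuracy(ws[:, t], X_val, y_val) for t in range(epochs)]
    w_best = ws[:, int(np.argmax(val_acc))]
    return w_avg, w_best
```

One possible motivation for the averaged version, as far as I understand it, is that subgradient iterates on a non-smooth loss such as the hinge can oscillate, and averaging them smooths that out without needing held-out data. Selecting by validation score needs a separate validation split, and the chosen epoch is then tuned to that split.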


Alex
