I was reviewing the answers to the question about the "Gradient of Hinge Loss" and noticed that one of them proposed returning the averaged weights from the function:

def grad_descent():
    ...
    return np.sum(ws, 1) / np.size(ws, 1)

I'm curious whether there is any special reason for averaging the weights. Why not pick the best weights by evaluating them on a validation set after each epoch?
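To make the two options concrete, here is a minimal sketch of what I mean. It assumes `ws` stores one weight vector per column (one column per epoch), which is what `np.sum(ws,1)/np.size(ws,1)` suggests, and that labels are in {-1, +1}. The names `average_weights`, `best_on_validation`, `X_val` and `y_val` are my own, not from the linked answer.

```python
import numpy as np

def average_weights(ws):
    # ws: shape (n_features, n_epochs), one weight vector per column (assumed layout).
    # Equivalent to ws.mean(axis=1).
    return np.sum(ws, 1) / np.size(ws, 1)

def best_on_validation(ws, X_val, y_val):
    # Hypothetical alternative: keep the epoch whose weights give the
    # lowest mean hinge loss on a held-out validation set.
    margins = y_val[:, None] * (X_val @ ws)               # (n_val, n_epochs)
    losses = np.maximum(0.0, 1.0 - margins).mean(axis=0)  # one loss per epoch
    return ws[:, np.argmin(losses)]
```

The first function blends all epochs into one vector; the second discards everything except a single epoch's weights.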

Alex
