Hessian-Free instead of LSTM for Recurrent Net Machine Translation

Asked Feb 11 '15 at 22:32

Active Dec 17 '16 at 18:10

Viewed 207 times

Last year, Ilya Sutskever and collaborators published a paper about a recurrent LSTM net that learns sequence-to-sequence mappings for machine translation. It's somewhat surprising that the authors used LSTM instead of Hessian-Free to train this net, since the first author was one of the innovators behind the development of Hessian-Free methods for recurrent nets (citation).

I was wondering if anyone has tried Hessian-Free for learning sequence-to-sequence mappings for machine translation. If so, does it work? Is its performance inferior to LSTM's in some way?

edited Dec 17 '16 at 18:10 by Franck Dernoncourt

asked Feb 11 '15 at 22:32 by sudo-nim

(I am not really familiar with either of these terms ... but why let that stop me?) I thought LSTM would be a network *architecture*, whereas "Hessian Free" sounds like an *optimization* (training) method? – GeoMatt22 Dec 17 '16 at 18:14
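To make the distinction raised in the comment concrete, here is a rough sketch of what a "Hessian-free" step looks like as a training method. It is only a toy illustration under stated assumptions: `loss_grad` is a hypothetical stand-in for the gradient of a network's training loss (any architecture, LSTM or otherwise, would enter only through that function), and the Hessian-vector product is approximated by finite differences of the gradient. Practical Hessian-free implementations typically add damping and use a Gauss-Newton curvature approximation, neither of which is shown here.

```python
# Toy sketch of one Hessian-free (truncated-Newton) step.
# All names below are hypothetical and chosen for illustration only.

def loss_grad(w):
    # Gradient of the toy loss f(w) = (w0 - 1)^2 + 2*(w1 + 0.5)^2 + w0*w1.
    # In a real setting this would be backprop through the network.
    return [2.0 * (w[0] - 1.0) + w[1], 4.0 * (w[1] + 0.5) + w[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    # H v is approximated by a central difference of the gradient,
    # so the Hessian matrix itself is never formed or stored.
    g_plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    g_minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]

def conjugate_gradient(hvp, b, iters=10, tol=1e-10):
    # Solve H x = b using only products H v (assumes H is positive definite).
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        Hp = hvp(p)
        alpha = rs / dot(p, Hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def hessian_free_step(grad_fn, w):
    # Newton-like update: solve H d = -g with CG, then move to w + d.
    g = grad_fn(w)
    d = conjugate_gradient(
        lambda v: hessian_vector_product(grad_fn, w, v),
        [-gi for gi in g],
    )
    return [wi + di for wi, di in zip(w, d)]

w_new = hessian_free_step(loss_grad, [0.0, 0.0])
```

Because the toy loss is quadratic with a positive definite Hessian, a single step lands on its minimizer. Nothing in the step depends on how the model is structured, which is why the optimizer and the choice of recurrent cell can be picked independently.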


0 Answers