Last year, Ilya Sutskever and collaborators published a paper on a recurrent LSTM network that learns sequence-to-sequence mappings for machine translation. It is somewhat surprising that the authors used LSTM instead of Hessian-Free to train this network, since the first author was one of the innovators behind Hessian-Free methods for recurrent nets (citation).

Has anyone tried Hessian-Free for learning sequence-to-sequence mappings for machine translation? If so, does it work, and is its performance inferior to LSTM's in some way?

Franck Dernoncourt
sudo-nim
    (I am not really familiar with either of these terms ... but why let that stop me?) I thought LSTM would be a network *architecture*, whereas "Hessian Free" sounds like an *optimization* (training) method? – GeoMatt22 Dec 17 '16 at 18:14

0 Answers