Calculating RNN loss (for a SINGLE example) as a sum of individual time step losses vs. an average of individual time step losses

Asked Jan 25 '22 at 13:36

Active Jan 25 '22 at 13:54

Viewed 13 times

In Andrew Ng's course, I see RNN loss being calculated as a sum of the losses from each time step as seen here:

In Stanford's CS224N, I see loss calculated as an average of individual losses as seen here:

Why are there two different approaches? Which one is preferred?
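To make the comparison concrete: for a single sequence of length T with per-step losses L_1, ..., L_T, the summed loss is L_1 + ... + L_T and the averaged loss is that same sum divided by T, so the two differ only by the constant factor 1/T. Below is a minimal NumPy sketch of that relationship. It assumes a per-step cross-entropy loss, and the sizes `T` and `V`, the random inputs, and the helper name `per_step_cross_entropy` are all made up for illustration; none of it is taken from either course.

```python
import numpy as np

def per_step_cross_entropy(probs, targets):
    """Cross-entropy loss at each time step of ONE sequence (hypothetical helper).

    probs:   (T, V) array, predicted distribution over V classes at each step
    targets: (T,) array of integer class ids
    """
    T = probs.shape[0]
    # Pick the predicted probability of the correct class at every step.
    return -np.log(probs[np.arange(T), targets])

# Illustrative values only: T time steps, V output classes.
rng = np.random.default_rng(0)
T, V = 5, 4
logits = rng.normal(size=(T, V))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax
targets = rng.integers(0, V, size=T)

step_losses = per_step_cross_entropy(probs, targets)

loss_sum = step_losses.sum()    # "sum of time step losses" convention
loss_mean = step_losses.mean()  # "average of time step losses" convention

# The two conventions differ only by the factor T for this sequence.
print(loss_sum, T * loss_mean)
```

Because the averaged loss is just the summed loss scaled by 1/T, its gradient with respect to any parameter is scaled by the same factor, which makes the effective step size depend on sequence length under one convention but not the other.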

Hank

0 Answers