
I am training GoogleNet on the Stanford Cars dataset: roughly 8,000 training images of cars, each with a fine-grained label (e.g., 2004 Toyota Camry).

  • I made minimal changes to the network: I only changed the number of outputs of the loss/classifier layers to 196, since I have 196 types of vehicles (see the sanity-check sketch after this list).
  • I used the pretrained Caffe GoogleNet weights as initial weights.
  • The image dimensions are scaled to 224x224.
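
To double-check that head modification, here is a rough sanity-check sketch in pycaffe. It assumes the standard BVLC GoogleNet layer names (loss1/classifier, loss2/classifier, loss3/classifier) and a hypothetical train_val.prototxt path, so adjust as needed:

# Sketch: confirm all three GoogleNet classifier heads now have 196 outputs.
# Layer names assume the stock BVLC GoogleNet prototxt; the file path is hypothetical.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

net = caffe_pb2.NetParameter()
with open('train_val.prototxt') as f:
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.name in ('loss1/classifier', 'loss2/classifier', 'loss3/classifier'):
        print(layer.name, layer.inner_product_param.num_output)  # expect 196 for each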

I consistently see the following behavior: my loss hovers between 3 and 5, then hits some iteration where it suddenly skyrockets all the way up to 87. You can see it happen at iteration 15,400; I'm at iteration 39,320 right now and it hasn't budged from 87. The learning rate is only bouncing around a little bit.

What causes this kind of behavior? Should I just cut my losses and use the weights from around iteration 15,400 for inference?

I0921 19:00:23.580992    36 solver.cpp:218] Iteration 15360 (2.7357 iter/s, 14.6215s/40 iters), loss = 5.07562
I0921 19:00:23.581143    36 solver.cpp:237]     Train net output #0: loss1/loss1 = 4.07955 (* 0.3 = 1.22386 loss)
I0921 19:00:23.581161    36 solver.cpp:237]     Train net output #1: loss2/loss2 = 3.34162 (* 0.3 = 1.00248 loss)
I0921 19:00:23.581168    36 solver.cpp:237]     Train net output #2: loss3/loss3 = 3.1286 (* 1 = 3.1286 loss)
I0921 19:00:23.581182    36 sgd_solver.cpp:105] Iteration 15360, lr = 0.00996795
I0921 19:00:38.185421    36 solver.cpp:218] Iteration 15400 (2.73888 iter/s, 14.6045s/40 iters), loss = 13.0996
I0921 19:00:38.185487    36 solver.cpp:237]     Train net output #0: loss1/loss1 = 87.3365 (* 0.3 = 26.201 loss)
I0921 19:00:38.185503    36 solver.cpp:237]     Train net output #1: loss2/loss2 = 87.3365 (* 0.3 = 26.201 loss)
I0921 19:00:38.185511    36 solver.cpp:237]     Train net output #2: loss3/loss3 = 87.3365 (* 1 = 87.3365 loss)
I0921 19:00:38.185523    36 sgd_solver.cpp:105] Iteration 15400, lr = 0.00996786
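
As a side note on reading the log: 87.3365 is consistent with -log(FLT_MIN), which is what Caffe's SoftmaxWithLoss layer reports once the probability it assigns to the true class underflows to zero. A quick arithmetic check (nothing Caffe-specific required):

# The stuck loss value matches -log(FLT_MIN) ~= 87.3365, i.e. the softmax
# probability of the correct class has underflowed to zero.
import numpy as np

flt_min = np.finfo(np.float32).tiny   # 1.17549435e-38, C's FLT_MIN
print(-np.log(flt_min))               # ~87.3365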

1 Answer

I found the answer to my problem. My dataset was roughly 6,000 images spread over nearly 200 classes, which leaves too few sample images per class; as a consequence, the network could never converge.
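
If anyone wants to check their own data, here is a minimal sketch for counting images per class; the file name and the Caffe-style "path label" list format are assumptions for illustration:

# Count samples per class from a Caffe-style image list ("path label" per line).
# The file name 'train.txt' is a placeholder.
from collections import Counter

with open('train.txt') as f:
    labels = [line.split()[-1] for line in f if line.strip()]

counts = Counter(labels)
print('classes:', len(counts))
print('fewest / most images per class:', min(counts.values()), max(counts.values()))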
