
Is there a solid reference on pre-training methods for deep neural networks that never see the actual inputs? Is anything like this known in the literature?

I guess a more correct term is "initialization using gradient methods" instead of "pre-training".

I see it like this: generating layerwise i.i.d. weights is the simplest approach. If we have a dataset, we can do better with unsupervised pre-training. But what can be done between these two extremes?
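For concreteness, here is a minimal sketch of the data-free "layerwise i.i.d." baseline I mean, written in PyTorch. The architecture and layer sizes are arbitrary placeholders, and He initialization stands in for any fan-in-scaled i.i.d. scheme:

```python
import torch.nn as nn

# A plain MLP; the layer sizes are arbitrary placeholders.
net = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# Layerwise i.i.d. initialization: every weight is drawn independently
# from a distribution whose scale depends only on the layer's shape
# (fan-in), never on any input data. He init is used here as one example.
for module in net.modules():
    if isinstance(module, nn.Linear):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        nn.init.zeros_(module.bias)
```

The question is whether anything between this (which uses no data at all) and unsupervised pre-training (which needs a dataset) exists in the literature.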

  • Is pretraining using another dataset allowed? – shimao Nov 09 '20 at 15:36
  • Sorry, no. I am interested in weight initialization approaches that operate with as few assumptions about the dataset as possible. I suppose some inductive biases are unavoidable, but having e.g. MNIST in pre-training and Fashion-MNIST in training would be too much. – Daniel Paleka Nov 09 '20 at 16:12

0 Answers