Training a neural network on its final task (e.g. classification) right from the beginning is not always the best way to go. I'd like to put together a short list of established methods for encouraging a NN to learn useful representations of the data. In my opinion this is closely related to preventing shortcut learning ("person A is the one with the ear piercing").
- Siamese and triplet networks (a minimal triplet-loss sketch follows this list)
- autoencoders
- self-supervised learning with a synthetic (pretext) target and loss, e.g.:
  - matching small high-resolution patches to the low-resolution version of the whole picture
  - solving jigsaw puzzles (everything so far is from the Keras blog)
- domain confusion, as in domain-adversarial training (gradient-reversal sketch below)
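To make the first item concrete, here is a minimal sketch of the triplet setup in TensorFlow/Keras. The network name, shapes, margin and the hand-rolled training step are my own illustrative choices, not a reference implementation: one shared embedding network is applied to an anchor, a positive and a negative example, and the loss pushes the anchor closer to the positive than to the negative.

```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss: the anchor embedding should be closer to
    the positive embedding than to the negative one by at least `margin`."""
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))

# One shared embedding network applied to all three inputs -- the weight
# sharing is what makes this a Siamese/triplet setup. Shapes are illustrative.
embedding_net = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(32),
])

# One training step on a dummy batch of 16 "images".
optimizer = tf.keras.optimizers.Adam(1e-3)
anchor_x, positive_x, negative_x = (tf.random.normal((16, 28, 28, 1)) for _ in range(3))
with tf.GradientTape() as tape:
    loss = triplet_loss(embedding_net(anchor_x),
                        embedding_net(positive_x),
                        embedding_net(negative_x))
grads = tape.gradient(loss, embedding_net.trainable_variables)
optimizer.apply_gradients(zip(grads, embedding_net.trainable_variables))
```

The point is that the supervision is only "same/different", so the network has to embed whatever makes two samples the same, rather than latching onto one easy class cue.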
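And a sketch of the domain-confusion idea via a gradient-reversal layer, loosely in the spirit of domain-adversarial training; the toy architecture and dummy batch are placeholders. The domain classifier learns to tell the two domains apart, while the reversed gradient trains the feature extractor to make them indistinguishable.

```python
import tensorflow as tf

@tf.custom_gradient
def reverse_gradient(x):
    """Identity in the forward pass, sign-flipped gradient in the backward
    pass, so the feature extractor is trained to *confuse* the domain classifier."""
    def grad(dy):
        return -dy
    return tf.identity(x), grad

feature_extractor = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
])
domain_classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),  # logit: source vs. target domain
])

optimizer = tf.keras.optimizers.Adam(1e-3)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Dummy batch: 8 source images (label 0) and 8 target images (label 1).
images = tf.random.normal((16, 28, 28, 1))
domain_labels = tf.concat([tf.zeros((8, 1)), tf.ones((8, 1))], axis=0)

with tf.GradientTape() as tape:
    features = feature_extractor(images)
    # Only the gradient flowing back into the feature extractor is reversed;
    # the domain classifier itself is still trained to separate the domains.
    domain_logits = domain_classifier(reverse_gradient(features))
    domain_loss = bce(domain_labels, domain_logits)
variables = feature_extractor.trainable_variables + domain_classifier.trainable_variables
grads = tape.gradient(domain_loss, variables)
optimizer.apply_gradients(zip(grads, variables))
```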
What else?