I am aware of this question and this one, as well as this issue on GitHub. Unless I am missing something, though, all of these fail to explain how the example in the keras docs makes sense:
from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))  # vocab of 1000 ids, 64-dim embeddings
input_array = np.random.randint(1000, size=(32, 10))  # batch of 32 sequences, 10 token ids each
model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
assert output_array.shape == (32, 10, 64)  # one 64-dim vector per token
Specifically, what is the target against which the mse is computed? In supervised variants, the targets are the class labels of the inputs, and the error is backpropagated from the classification layer down into the embedding weights. In unsupervised variants, the target is the lexical context through which the model captures distributional information (e.g. the surrounding words, or the "middle" word, as in the word2vec embeddings).
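To be concrete about the supervised case I mean, here is a minimal sketch of my own (not from the docs; the head and loss are arbitrary choices for illustration), where the Embedding layer feeds a classification head and the gradients from the label error update the embedding weights:

from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense
import numpy as np

clf = Sequential()
clf.add(Embedding(1000, 64, input_length=10))  # embeddings start random
clf.add(Flatten())                             # (10, 64) -> (640,)
clf.add(Dense(1, activation='sigmoid'))        # binary classification head
clf.compile('rmsprop', 'binary_crossentropy')

X = np.random.randint(1000, size=(32, 10))
y = np.random.randint(2, size=(32, 1))         # class labels are the targets
clf.fit(X, y, epochs=1)                        # error backpropagates into the Embedding

Here the loss clearly has something to be computed against. The docs example has no such head, which is exactly what confuses me.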
What is the respective pipeline in keras' Embedding layer? Is this information simply omitted from the above example?
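To make the confusion concrete: as far as I can tell, actually training the docs model as compiled would demand explicit targets matching the (32, 10, 64) output shape, which the example never provides. The dummy targets below are mine, purely to illustrate what the mse would be computed against:

# continuing from the docs snippet above
dummy_targets = np.random.random((32, 10, 64))  # same shape as the layer's output
model.fit(input_array, dummy_targets, epochs=1)  # mse is taken against whatever y is passed

Since the example only ever calls predict, which needs no target at all, the compiled loss seems to play no role in it.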