Using this question as background: https://stackoverflow.com/questions/71023822/lstm-multi-variate-multi-feature-in-pytorch
I was wondering how one processes the output of a PyTorch LSTM.
I was using this as a reference: https://pytorch.org/tutorials/beginner/introyt/trainingyt.html
Looking at the loss function, I realized perhaps I've missed something. I thought the process was:
train set: (input, label) pairs
test set: input only; the response should be the label
Where the LSTM performs matrix multiplications to get as close to 1 as possible for the label I've presented it with, and as close to 0 for all other possible outputs, adjusting its internal weights as needed to make this true, and continuing to adjust them as new training inputs are presented.
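To make that mental model concrete, here is a rough sketch of the kind of setup I have in mind. The class name, the feature/hidden/class sizes, and the idea of putting a Linear "head" on top of the LSTM are all placeholders I made up for illustration, not my actual code:

```python
import torch
import torch.nn as nn

class SequenceClassifier(nn.Module):
    """Hypothetical LSTM classifier: a sequence goes in, one label comes out."""
    def __init__(self, n_features=4, hidden_size=32, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size,
                            batch_first=True)
        # Linear "head" that maps a hidden state to one raw score per class
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        output, (h_n, c_n) = self.lstm(x)
        # Use the hidden state of the last time step as a summary of the sequence
        last_hidden = output[:, -1, :]   # (batch, hidden_size)
        return self.head(last_hidden)    # (batch, n_classes) raw scores (logits)
```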
I then thought that when presented with a test input, the LSTM would return the predicted label for that observation; however, I have been told this is incorrect, and that what I'll get back is a vector of the same size and shape as what came in.
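If I'm reading the nn.LSTM docs right, what actually comes back from the bare LSTM is one hidden vector per time step, plus the final hidden/cell states. The dimensions below are made up just to illustrate:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=4, hidden_size=32, batch_first=True)
x = torch.randn(8, 10, 4)    # (batch=8, seq_len=10, n_features=4)
output, (h_n, c_n) = lstm(x)
print(output.shape)          # torch.Size([8, 10, 32]) -- one hidden vector per time step
print(h_n.shape)             # torch.Size([1, 8, 32])  -- final hidden state: (num_layers, batch, hidden)
```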
I also thought that the loss function was a measurement, by some distance metric, of how far we are from modeling the training set accurately.
Question 1: Given the dataset I have, how would one take the hidden layer's output and match it with a label during training?
(I've been advised this involves the loss function, so I suppose this question would involve the one used in that tutorial, which seems to be torch.nn.CrossEntropyLoss().)
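From the docs, my understanding is that torch.nn.CrossEntropyLoss() compares a row of raw class scores against the integer index of the true label, roughly like this (the numbers are made up):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, -1.0, 0.5]])  # (batch=1, n_classes=3) raw scores from the model
target = torch.tensor([0])                 # (batch=1,) integer index of the correct label
loss = criterion(logits, target)           # applies log-softmax + negative log-likelihood internally
print(loss.item())
```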
Question 2: How do I get back a label from the trained LSTM when I present it with a new test input?
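For what it's worth, my current guess is something along these lines, continuing from the hypothetical SequenceClassifier sketch above (the label names and shapes are again placeholders), but I'm not sure this is the right way to turn the output back into a label:

```python
import torch

# Continuing from the hypothetical SequenceClassifier sketch above
model = SequenceClassifier(n_features=4, hidden_size=32, n_classes=3)
class_names = ["cat", "dog", "bird"]                # placeholder label names
test_input = torch.randn(10, 4)                     # one sequence: (seq_len=10, n_features=4)

model.eval()
with torch.no_grad():
    logits = model(test_input.unsqueeze(0))         # add a batch dimension -> (1, n_classes)
    predicted_index = logits.argmax(dim=1).item()   # index of the highest-scoring class
    predicted_label = class_names[predicted_index]  # map the index back to the original label
```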
Thank you