I am confused about how an LSTM learns from word embeddings. I know that an LSTM expects 3D input of shape (samples, timesteps, features). So when we use an embedding layer (e.g. word2vec) that maps each word to a 300-dimensional vector, what exactly does the LSTM learn from? I understand that for a sentence the words are the timesteps, but when each word is itself a vector, should the timesteps be the vector dimensions instead?
If that is the case, how can the LSTM learn the order of the words and predict anything about the sentence?
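For concreteness, here is a minimal sketch of the setup I mean, assuming a Keras-style model; the vocabulary size, sentence length, and layer sizes are made-up placeholders:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 5000  # placeholder vocabulary size
max_len = 10       # words per (padded) sentence -- the timesteps?
embed_dim = 300    # 300-d word vectors -- the features? or the timesteps?

model = Sequential([
    # maps (batch, 10) word indices -> (batch, 10, 300) vectors
    Embedding(vocab_size, embed_dim),
    # reads the 3D tensor; what is it treating as the sequence here?
    LSTM(64),
    # e.g. a binary prediction about the whole sentence
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam")

# Dummy batch of 2 sentences, each 10 word indices long.
x = np.random.randint(0, vocab_size, size=(2, max_len))
print(model.predict(x).shape)  # (2, 1)
```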