Below is a simple diagram showing the matrix dimensions of word2vec. My goal is to extend this diagram to include document vectors for doc2vec. However, I'm having trouble understanding the original paper, specifically how the documents' weight parameters are incorporated. Let D be the number of training documents and M be the hidden layer size for the documents. What would the doc2vec architecture look like in terms of matrix dimensions?
Yoo Inhyeok
How is the matrix D initialized? Is it a one-hot encoding for every document, as in the case of words? At inference time the document is unseen, so its ID does not exist in the matrix D. How is the Dx1 vector generated in that case? – BetterEnglish Feb 04 '22 at 17:04
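For what it's worth, here is a minimal NumPy sketch of the shapes involved, assuming the PV-DM variant from the paper with concatenation and a plain full softmax (a simplification on my part). The symbols V, N and k, the concrete sizes, and every function name are my own placeholders, not something taken from the paper or from a library. It illustrates two points: looking up row i of the D x M document matrix is the same as multiplying a 1 x D one-hot vector by it, and for an unseen document a fresh M-dimensional vector is randomly initialized and fitted by gradient descent while the word and output weights stay frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
V, N = 50, 8   # vocabulary size, word-vector size
D, M = 10, 6   # number of training documents, document-vector size
k = 4          # context words per window

W_words = rng.normal(scale=0.1, size=(V, N))         # word matrix, V x N
W_docs = rng.normal(scale=0.1, size=(D, M))          # document matrix, D x M
W_out = rng.normal(scale=0.1, size=(M + k * N, V))   # output weights, (M + k*N) x V


def hidden(doc_vec, context_ids):
    """Concatenate one document vector (M,) with k word vectors (k*N,)."""
    return np.concatenate([doc_vec, W_words[context_ids].ravel()])


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def infer_doc_vector(windows, steps=50, lr=0.1):
    """Fit a vector for an unseen document; W_words and W_out are not updated.

    windows: list of (context_ids, target_id) pairs taken from the new document.
    """
    d = rng.normal(scale=0.1, size=M)   # fresh random vector, not a row of W_docs
    for _ in range(steps):
        for context_ids, target in windows:
            p = softmax(hidden(d, context_ids) @ W_out)   # predicted distribution, (V,)
            p[target] -= 1.0                              # cross-entropy gradient w.r.t. logits
            grad_h = W_out @ p                            # gradient w.r.t. hidden layer, (M + k*N,)
            d -= lr * grad_h[:M]                          # update only the document part
    return d


# A training document is selected by row lookup, equivalent to one-hot times matrix.
one_hot = np.zeros(D)
one_hot[3] = 1.0
assert np.allclose(one_hot @ W_docs, W_docs[3])

# An unseen document: one window of k context word ids and a target word id.
new_doc_windows = [(np.array([1, 5, 7, 9]), 3)]
new_vec = infer_doc_vector(new_doc_windows)
print(new_vec.shape)  # (6,)
```

If the word and document vectors are averaged instead of concatenated, M has to equal N and the output matrix becomes N x V. With concatenation, as sketched here, M and N can differ.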