In word2vec's CBOW and skip-gram models, how does choosing word vectors from $W$ (input word matrix) vs. choosing word vectors from $W'$ (output word matrix) impact the quality of the resulting word vectors?

CBOW:

[Figure: CBOW model]

Skip-gram:

[Figure: skip-gram model]

Franck Dernoncourt

1 Answer

Garten et al. {1} compared word vectors obtained by adding the input and output word vectors against word vectors obtained by concatenating them. In their experiments, concatenation yielded significantly better results:

[Figure: results reported by Garten et al. {1}]

The video lecture {2} recommends averaging the input and output word vectors, but does not compare this against concatenating them.
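To make the three combination strategies concrete, here is a minimal sketch in plain Python. The matrices `W` and `W_out` (standing in for $W$ and $W'$), the toy vocabulary, and the helper names are all assumptions for illustration; the values are made up and do not come from a trained model or from either reference.

```python
# Toy stand-ins for trained word2vec matrices (made-up values, not real embeddings).
# Row i of W is the input vector of word i; row i of W_out is its output vector (W').
vocab = ["king", "queen", "apple"]
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W_out = [[0.5, 0.0], [1.0, -2.0], [-1.0, 2.0]]


def combine_add(w_in, w_out):
    """Element-wise sum of input and output vectors (dimension unchanged)."""
    return [[a + b for a, b in zip(r_in, r_out)]
            for r_in, r_out in zip(w_in, w_out)]


def combine_average(w_in, w_out):
    """Element-wise mean of input and output vectors (dimension unchanged)."""
    return [[(a + b) / 2.0 for a, b in zip(r_in, r_out)]
            for r_in, r_out in zip(w_in, w_out)]


def combine_concat(w_in, w_out):
    """Input vector followed by output vector (dimension doubles)."""
    return [r_in + r_out for r_in, r_out in zip(w_in, w_out)]


added = combine_add(W, W_out)
averaged = combine_average(W, W_out)
concatenated = combine_concat(W, W_out)

for word, vec in zip(vocab, concatenated):
    print(word, vec)
```

Note that adding and averaging differ only by a factor of 2, so they give identical cosine similarities between words; concatenation is the only one of the three that keeps the input and output components separate, at the cost of doubling the vector dimension.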


References:

Franck Dernoncourt