I want to create embeddings in $R^D$ for sets. So I want a function (probably a neural network) that takes in a set $ S = \{ s_1, \dots, s_n \} $ (ideally of any size, so the number of elements may vary, but anything is good) and produces a vector. Ideally, the set embedding function is order-invariant (the way sets are), so a plain LSTM isn't quite what I want (since that's for sequences), unless it is modified, ideally in a way described in some published paper.
$$ f_{\theta}(S) = e_S \in R^D$$
What are the state-of-the-art (SOTA) methods for this task?
The silliest method I know is to embed each element separately and then take the sum, so:
$$ f_{\theta}(S) = \sum_i g(s_i) $$
or perhaps better with some sort of attention:
$$ f_{\theta}(S) = \sum_i \alpha_i(S) g(s_i) $$
but ideally, if something is already in a paper, then it has been tested better than my random idea...
BTW, the only thing I am aware of is this paper: https://arxiv.org/abs/1606.04080 but it seems rather old (2016), and as of writing this question it is 2020.
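To make the two ideas above concrete, here is a minimal NumPy sketch of sum pooling and attention-weighted pooling. It is only an illustration under my own assumptions: the encoder `g` is a single `tanh` layer, the attention weights are a softmax over per-element scores against a query vector `q`, and `W`, `q`, `d_in`, and `D` are hypothetical stand-ins with random values rather than trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, D = 3, 8                  # hypothetical element and embedding sizes
W = rng.normal(size=(d_in, D))  # stand-in for the learned weights of g
q = rng.normal(size=D)          # stand-in for a learned attention query


def g(elements):
    """Per-element encoder applied row-wise: (n, d_in) -> (n, D)."""
    return np.tanh(elements @ W)


def embed_sum(elements):
    """f(S) = sum_i g(s_i)."""
    return g(elements).sum(axis=0)


def embed_attention(elements):
    """f(S) = sum_i alpha_i g(s_i), with alpha a softmax over the elements."""
    h = g(elements)
    scores = h @ q
    alpha = np.exp(scores - scores.max())  # subtract max for numerical stability
    alpha /= alpha.sum()
    return alpha @ h


# A set with 5 elements, and the same elements in a different order.
S = rng.normal(size=(5, d_in))
S_shuffled = S[rng.permutation(len(S))]

print(np.allclose(embed_sum(S), embed_sum(S_shuffled)))
print(np.allclose(embed_attention(S), embed_attention(S_shuffled)))
```

Both functions accept any number of rows, and reordering the rows should not change the result (up to floating-point rounding), because the only interaction between elements is a sum or a softmax-normalized sum.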
Related: