Questions tagged [embeddings]

76 questions
8 votes, 1 answer

What is the intuition behind the positional cosine encoding in the transformer network?

I don't understand how adding the cosine encodings/functions to each dimension of the word vector embedding enables the network to "understand" where each word is situated in the sentence. What is the intuition behind it? It seems a bit…
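For reference, the encoding in question is the sinusoidal scheme from "Attention Is All You Need" (Vaswani et al., 2017), which interleaves sines and cosines at geometrically spaced frequencies; a minimal NumPy sketch (assuming an even `d_model`):

```python
import numpy as np

def sinusoidal_encoding(max_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(...)."""
    pos = np.arange(max_len)[:, None]              # (max_len, 1)
    i = np.arange(0, d_model, 2)[None, :]          # (1, d_model / 2)
    angles = pos / np.power(10000.0, i / d_model)  # one frequency per dim pair
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dims: sine
    pe[:, 1::2] = np.cos(angles)                   # odd dims: cosine
    return pe
```

Each position gets a unique fingerprint across frequencies, and a fixed offset $k$ acts as a rotation on each sine/cosine pair, so $PE_{pos+k}$ is a linear function of $PE_{pos}$; that linearity is the usual intuition for why attention can recover relative positions.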
7 votes, 2 answers

How to embed in Euclidean space

I have what I think might be a standard machine learning problem, but I can't find a clear solution. I have lots of vectors of different dimensions. For each pair of vectors I can compute their similarity. I would like to embed these vectors into…
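One standard answer (assuming the similarities can be turned into dissimilarities, e.g. $d_{ij} = 1 - s_{ij}$) is metric MDS on the precomputed matrix; a sketch with scikit-learn, where `D` is a stand-in for the asker's pairwise dissimilarities:

```python
import numpy as np
from sklearn.manifold import MDS

# Stand-in data: any symmetric dissimilarity matrix with a zero diagonal works.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)  # (50, 2) Euclidean points minimizing stress
```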
6 votes, 2 answers

Why does BERT use learned positional embeddings?

Compared with the sinusoidal positional encoding used in the Transformer, BERT's learned-lookup-table solution has two drawbacks in my mind: (1) it has a fixed length; (2) it cannot reflect relative distance. Could anyone please tell me the considerations behind such a design?
eric2323223
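For contrast with the sinusoidal scheme, BERT-style positions are just a trainable lookup table added to the token embeddings; a minimal PyTorch sketch (sizes are BERT-base defaults, used here for illustration):

```python
import torch
import torch.nn as nn

class LearnedPositionalEmbedding(nn.Module):
    """A trainable table with one row per position. The fixed max_len and the
    absence of any built-in notion of distance are exactly the two drawbacks
    the question raises."""
    def __init__(self, max_len=512, d_model=768):
        super().__init__()
        self.pos = nn.Embedding(max_len, d_model)

    def forward(self, tok):  # tok: (batch, seq_len, d_model)
        positions = torch.arange(tok.size(1), device=tok.device)
        return tok + self.pos(positions)
```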
6 votes, 2 answers

How to use "IDs" as an input variable to an ML model?

I am trying to include a variable like "account number", which is an "ID", as a predictive variable for a logistic regression model. In fact, there are several columns in my dataset that are "IDs" but are important in predicting the outcome. For…
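A common pattern is to treat each ID column as a categorical feature and learn an entity embedding for it, rather than feeding the raw integer into the model; a hedged Keras sketch (cardinality and embedding size are illustrative):

```python
import tensorflow as tf

n_accounts, emb_dim = 10_000, 16   # illustrative: #distinct IDs, embedding size

id_in = tf.keras.Input(shape=(1,), dtype="int32", name="account_id")
x = tf.keras.layers.Embedding(n_accounts, emb_dim)(id_in)  # trainable lookup
x = tf.keras.layers.Flatten()(x)
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)    # logistic head

model = tf.keras.Model(id_in, out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```

The embedding rows are fit by the task loss, so IDs associated with similar outcomes end up with nearby vectors, which is what makes an otherwise arbitrary identifier usable as a predictor.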
5 votes, 1 answer

If the curse of dimensionality exists, how does embedding search work?

The curse of dimensionality tells us that if the dimension is high, the distance metric stops working, i.e., everyone is close to everyone. However, many machine learning retrieval systems rely on calculating embeddings and retrieving similar data…
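Part of the usual resolution is that learned embeddings concentrate near a low-dimensional manifold, and retrieval only needs the relative ranking of neighbours, not well-separated absolute distances; a minimal cosine-retrieval sketch in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(100_000, 384))             # stand-in embedding index
db /= np.linalg.norm(db, axis=1, keepdims=True)  # unit-normalize once

def top_k(query, k=5):
    q = query / np.linalg.norm(query)
    scores = db @ q                          # cosine similarity to every item
    return np.argpartition(-scores, k)[:k]   # indices of the k best matches

print(top_k(rng.normal(size=384)))
```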
4 votes, 1 answer

ArcFace - How to compute $\cos(t+m)$ when $t+m > \pi$

I am trying to understand the ArcFace implementation and I am stuck at one condition: if $\cos(t) \le \cos(\pi - m)$, then $t + m \ge \pi$. In this case the way we compute $\cos(t+m)$ is changed to $\cos(t+m) = \cos(t) - m\sin(m)$. Could…
pawols
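For reference, the widely used implementation (e.g. InsightFace) expands $\cos(t+m)$ with the angle-addition formula and, because cosine stops being monotonically decreasing once $t + m$ passes $\pi$, falls back to the linear surrogate $\cos(t) - m\sin(m)$ there; a PyTorch sketch of just that branch:

```python
import math
import torch

def arcface_logit(cos_t, m=0.5):
    """cos(t + m), with the common fallback once t + m would exceed pi."""
    sin_t = torch.sqrt((1.0 - cos_t ** 2).clamp(min=0.0))
    cos_tm = cos_t * math.cos(m) - sin_t * math.sin(m)  # cos(t)cos(m) - sin(t)sin(m)
    # cos(t) <= cos(pi - m) means t + m >= pi, where cos(t + m) is no longer
    # monotonic in t; substitute the decreasing linear term cos(t) - m*sin(m).
    threshold = math.cos(math.pi - m)
    return torch.where(cos_t > threshold, cos_tm, cos_t - m * math.sin(m))
```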
4 votes, 1 answer

What is an embedding? (in the context of dimensionality reduction)

In the context of dimensionality reduction one often uses the word "embedding", which seems to me a rather technical mathematical term that stands out compared to the rest of the discussion, which in the case of PCA, MDS and similar methods is just…
4 votes, 0 answers

Why do researchers use conv1d for embeddings instead of dense layers?

In some papers (like "Reinforcement learning for Vehicle Routing Problem"), researchers use conv1d to embed the problem input into a hyperspace; for example, in solving the TSP, they use conv1d on the (x, y) coordinates of each node, but I don't understand why…
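One observation that may resolve this: a `Conv1d` with `kernel_size=1` is exactly a dense layer applied independently at every node with shared weights, which is convenient when the number of nodes varies; a PyTorch sketch demonstrating the equivalence:

```python
import torch
import torch.nn as nn

coords = torch.randn(8, 2, 20)   # (batch, xy-channels, n_nodes)

conv = nn.Conv1d(in_channels=2, out_channels=128, kernel_size=1)
dense = nn.Linear(2, 128)
dense.weight.data = conv.weight.data.squeeze(-1)  # reuse the same weights
dense.bias.data = conv.bias.data

out_conv = conv(coords)                                    # (8, 128, 20)
out_dense = dense(coords.transpose(1, 2)).transpose(1, 2)  # same values
print(torch.allclose(out_conv, out_dense, atol=1e-6))      # True
```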
4 votes, 1 answer

What is the difference between the latent space of a variational autoencoder and that of a regular autoencoder?

Should VAEs even be used for non-generative tasks? If I were to use both models for embedding images, how would the embedding spaces differ on a structural level?
Daniel
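Structurally, the difference sits at the encoder head: a plain autoencoder emits a point, while a VAE emits the parameters of a distribution whose KL penalty pushes codes toward a smooth, roughly Gaussian region; a sketch of just the two heads (hidden/latent sizes are illustrative):

```python
import torch
import torch.nn as nn

class EncoderHeads(nn.Module):
    def __init__(self, h=256, z=32):
        super().__init__()
        self.ae = nn.Linear(h, z)       # AE: one deterministic code per input
        self.mu = nn.Linear(h, z)       # VAE: mean of q(z|x)
        self.logvar = nn.Linear(h, z)   # VAE: log-variance of q(z|x)

    def forward(self, feats):
        z_ae = self.ae(feats)
        mu, logvar = self.mu(feats), self.logvar(feats)
        z_vae = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparam.
        return z_ae, z_vae
```

For non-generative embedding, people typically take `mu` as the code; the KL term is what tends to make the VAE space more continuous, at some cost in reconstruction sharpness.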
3 votes, 2 answers

Embedding data into a higher-dimensional space

Embeddings, or latent spaces, are vector spaces that we embed our initial data into for further processing. The benefit of doing so, as far as I am aware, is to reduce the dimension. Often data has many discrete features that don't make sense to…
3 votes, 1 answer

What are state of the art methods for creating embeddings for sets?

I want to create embeddings in $\mathbb{R}^D$ for sets. So I want a function (probably a neural network) that takes in a set $S = \{ s_1, \dots, s_n \}$ (ideally of any size, so the number of elements may vary) and produces…
Charlie Parker
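The canonical baseline is Deep Sets (Zaheer et al., 2017): embed each element with a shared network $\phi$, pool with a permutation-invariant sum, then map with $\rho$; a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

class DeepSetEncoder(nn.Module):
    """f(S) = rho(sum_i phi(s_i)): order-invariant and size-agnostic."""
    def __init__(self, in_dim=3, hidden=128, out_dim=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, S):                        # S: (n_elements, in_dim)
        return self.rho(self.phi(S).sum(dim=0))  # sum-pooling erases ordering

enc = DeepSetEncoder()
print(enc(torch.randn(5, 3)).shape)  # torch.Size([64]); any n_elements works
```

Attention-based pooling (Set Transformer, Lee et al., 2019) is the usual step up when simple sum/mean pooling loses too much information.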
3 votes, 1 answer

Is the Keras Embedding layer dependent on the target label?

I learned how to 'use' the Keras Embedding layer, but I am not able to find any more specific information about the actual behavior and training process of this layer. For now, I understand that the Keras Embedding layer maps distinct categorical…
Jan Musil
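The short answer is yes: the Embedding layer is an ordinary trainable weight matrix updated by backpropagation from the task loss, so the learned vectors do depend on the target labels; a minimal sketch with toy data:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=8),  # 1000 x 8 table
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

X = np.random.randint(0, 1000, size=(32, 10))  # toy categorical sequences
y = np.random.randint(0, 2, size=(32, 1))      # toy binary labels
model.fit(X, y, epochs=1, verbose=0)
# Gradients flow from the loss into the table, so different labels for the
# same inputs would produce different embeddings.
```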
3 votes, 0 answers

Can you use VAEs to produce deep word embeddings?

There are many articles about applications of VAEs, such as image reconstruction, denoising, and data compression/augmentation. However, I have not seen an example of embeddings for high-dimensional data such as words. Are there some papers about the…
3 votes, 1 answer

Facebook's InferSent intuition

When reviewing InferSent's architecture here, I noticed that, after encoding the premise and hypothesis to obtain two vectors u and v, they feed the set of fully connected layers with: (u, v), the concatenation of u and v; u * v, the…
ryuzakinho
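For reference, the InferSent paper (Conneau et al., 2017) feeds the classifier the combination $(u, v, |u - v|, u * v)$; a sketch of the feature construction:

```python
import torch

def infersent_features(u, v):
    """Concatenation, absolute difference, and elementwise product.
    |u - v| and u * v give the classifier symmetric, alignment-sensitive
    signals that the raw concatenation (u, v) alone does not expose."""
    return torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)

u = torch.randn(4, 4096)  # premise encodings (4096: the paper's BiLSTM-max size)
v = torch.randn(4, 4096)  # hypothesis encodings
print(infersent_features(u, v).shape)  # torch.Size([4, 16384])
```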
2 votes, 0 answers

Is there an MDS/embedding algorithm that is more suitable for the goal of clustering a graph?

I am testing ideas for clustering a particular graph. After testing a set of graph clustering/community detection algorithms, I thought about mapping the graph to a vector space and using vector-space clustering algorithms, say GMM, in…
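One graph-aware option is spectral embedding of the (precomputed) affinity matrix, which tends to place nodes of the same community close together before a GMM is fit; a hedged scikit-learn sketch with a stand-in adjacency matrix:

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.mixture import GaussianMixture

# Stand-in graph: A is a symmetric 0/1 adjacency matrix.
rng = np.random.default_rng(0)
A = np.triu((rng.random((60, 60)) < 0.1).astype(float), k=1)
A = A + A.T

coords = SpectralEmbedding(n_components=8, affinity="precomputed").fit_transform(A)
labels = GaussianMixture(n_components=4, random_state=0).fit_predict(coords)
```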