
I am working on a triplet-loss-based model for text embedding.
Short description:
I have a database for an online shop, and I need to find the most suitable products when a user enters text in the search bar. I want a model that works better than string matching and can understand the user's intent. I defined a triplet network as follows: my input is (query text [anchor], the next product the user viewed after searching [positive], a random product [negative]). I built an encoder model based on a bi-LSTM and trained it so that the distance between the anchor and the positive is minimized while the distance between the anchor and the negative is maximized, using the triplet loss. I tried both hard and semi-hard triplet mining, but neither worked.
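For concreteness, here is a minimal PyTorch sketch of the setup I describe above. The layer sizes, mean pooling, margin value, and the random tensors standing in for tokenized batches are illustrative placeholders, not my exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiLSTMEncoder(nn.Module):
    """Encodes a padded sequence of token IDs into a fixed-size embedding."""
    def __init__(self, vocab_size=8572, embed_dim=128, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)           # (B, T, E)
        outputs, _ = self.bilstm(embedded)             # (B, T, 2H)
        pooled = outputs.mean(dim=1)                   # mean-pool over time
        return F.normalize(pooled, dim=1)              # unit-length embeddings

encoder = BiLSTMEncoder()
triplet_loss = nn.TripletMarginLoss(margin=0.2)

# One training step on a batch of (anchor, positive, negative) token IDs.
# The random integers below are stand-ins for real tokenized text.
anchor = torch.randint(1, 8572, (32, 20))    # query text
positive = torch.randint(1, 8572, (32, 20))  # product the user viewed next
negative = torch.randint(1, 8572, (32, 20))  # random product

loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```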
As a result, the training loss decreased very quickly to a very small value, but the validation loss showed no meaningful trend; it moved up and down as if at random.
I trained the model with a vocabulary of 8,572 words and 81,822 training samples. Is the dataset too small?
Can you help me figure out what the issue in my solution is?
