I have a product category tree with names like apparel, jeans, beauty & wellness, etc.
I need to 'match' external categories with this category tree.
For example, I'd need to match "lenses color" to "lens colour".
I currently use a fuzzy string matching approach (using Python and FuzzyWuzzy). When there is no match, the match is made manually and saved as a synonym. I was wondering if a machine learning classification method would make sense here since:
- We're talking about single words (or 2 to 3 words, as in "beauty & wellness"). It seems like classification works better on sentences, or on words inside sentences.
- There are very few examples for each 'class'. Even after gathering a lot of synonyms, there are rarely more than 4 or 5 ways to name a category, so I'm not sure training would be reliable.
I'm more familiar with neural nets than with other machine learning techniques, but maybe someone familiar with this problem can suggest other techniques.
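For reference, here is a minimal sketch of the workflow described above: fuzzy matching first, with a manually maintained synonym table as the fallback. To keep it self-contained it uses the standard library's `difflib.SequenceMatcher` as a stand-in for FuzzyWuzzy. The names `CATEGORIES`, `SYNONYMS`, `match_category`, `add_synonym` and the threshold of 80 are illustrative assumptions, not the actual code.

```python
from difflib import SequenceMatcher

# Hypothetical excerpt of the internal category tree (flattened to names).
CATEGORIES = ["apparel", "jeans", "beauty & wellness", "lens colour"]

# Manually confirmed matches: external name -> internal category.
SYNONYMS = {}


def similarity(a, b):
    """Case-insensitive similarity score between 0 and 100 (difflib stand-in)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_category(external_name, threshold=80):
    """Return the matched internal category, or None if a manual match is needed.

    The threshold is an assumed cut-off and would need tuning on real data.
    """
    # 1. Previously saved manual matches take priority.
    if external_name in SYNONYMS:
        return SYNONYMS[external_name]
    # 2. Otherwise pick the most similar category name.
    best = max(CATEGORIES, key=lambda c: similarity(external_name, c))
    if similarity(external_name, best) >= threshold:
        return best
    # 3. No confident match: the caller falls back to manual matching.
    return None


def add_synonym(external_name, category):
    """Save a manual match so the same external name resolves next time."""
    SYNONYMS[external_name] = category
```

With this sketch, `match_category("lenses color")` resolves to "lens colour" by string similarity alone, while a name such as "denim trousers" returns `None` until `add_synonym("denim trousers", "jeans")` records the manual decision.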