I am trying to implement Anomaly Detection over a multivariate dataset having categorical and numerical predictors.
If we consider the sample records below, product_type, company_type and currency are categorical variables (nominal, to be precise), whereas price is a numerical variable.
My model is able to identify the anomaly in the price for product_id=10, because the price for different products normally falls between 10 and 500 EUR for a given combination of (product-type, company-type, price, currency).
However, it is not able to identify the anomalies for product_id=5 or product_id=8, which have an unusual currency or product type.
I have tried different approaches, such as Multiple Correspondence Analysis (MCA) for categorical encoding and dimensionality reduction, combined with One-Class SVM and Isolation Forest. I have also tried deep learning approaches using autoencoders. But none of these models is able to identify anomalies in the categorical predictor variables.
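To make the setup concrete, here is a minimal sketch of this kind of pipeline. It assumes pandas and scikit-learn, and it uses one-hot encoding in place of MCA to keep the example short. The records, category values, and parameters are hypothetical placeholders and not the actual data or the exact code used.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical records (NOT the real dataset): mostly similar rows, plus one
# row with an unusual price and one row with an unusual currency.
records = pd.DataFrame({
    "product_type": ["new"] * 8,
    "company_type": ["retail"] * 8,
    "currency": ["EUR"] * 7 + ["USD"],
    "price": [20.0, 35.0, 50.0, 80.0, 120.0, 300.0, 9000.0, 60.0],
})

categorical = ["product_type", "company_type", "currency"]
numerical = ["price"]

# One-hot encode the nominal columns and standardise the numerical one.
# sparse_threshold=0 forces a dense output matrix.
preprocess = ColumnTransformer(
    [
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("num", StandardScaler(), numerical),
    ],
    sparse_threshold=0,
)

# Assumed detector settings; tune these for real data.
model = Pipeline([
    ("preprocess", preprocess),
    ("detector", IsolationForest(n_estimators=200, contamination="auto", random_state=0)),
])

model.fit(records)

# Lower scores mean "more anomalous"; predict() returns -1 for anomalies.
records["anomaly_score"] = model.decision_function(records[categorical + numerical])
records["is_anomaly"] = model.predict(records[categorical + numerical]) == -1
print(records.sort_values("anomaly_score"))
```

Sorting by `anomaly_score` shows how the rows are ranked, which makes it easy to check whether the row with the unusual currency is scored as anomalous at all, or only the row with the extreme price.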
I have also referred to other answers, such as:
Anomaly Detection with Dummy Features (and other Discrete/Categorical Features)
and
Outlier detection with data (which has categorical and numeric variables) with R
but could not resolve my problem.
I have only recently started my data science journey and would really appreciate any help.