I'm building a machine learning model to predict a customer's propensity to buy (the likelihood that the customer buys a product). The purpose is to rank customers by predicted probability for targeting. Performance on the binary outcome itself is not a priority.
I'm seeking expert opinion on which metric to use in this context (AUROC, log loss, F1, etc.). I have seen conflicting opinions online.
Which metric should I use if my dataset is highly imbalanced (buy vs. not buy: 1:99)?
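To make the candidates concrete, here is a minimal sketch of how I understand three of them. The labels and scores are made-up toy data, and the helper names (`auroc`, `log_loss`, `lift_at_k`) are my own illustrations, not a library API. As far as I can tell, AUROC and lift at the top k depend only on the ordering of the scores, while log loss also depends on how well the probabilities are calibrated.

```python
import math

def auroc(y_true, scores):
    """Chance that a random buyer is scored above a random non-buyer (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, scores, eps=1e-15):
    """Mean negative log-likelihood; penalizes badly calibrated probabilities."""
    total = 0.0
    for y, s in zip(y_true, scores):
        s = min(max(s, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(s) + (1 - y) * math.log(1 - s)
    return total / len(y_true)

def lift_at_k(y_true, scores, k):
    """Buy rate among the k highest-scored customers divided by the overall buy rate."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate

# Toy data (hypothetical): 2 buyers among 10 customers.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.05, 0.1, 0.2, 0.15]

# Same ranking, different probability scale (a monotone transform).
scaled = [s / 10 for s in scores]

print("AUROC    :", auroc(y_true, scores), auroc(y_true, scaled))
print("lift@2   :", lift_at_k(y_true, scores, 2), lift_at_k(y_true, scaled, 2))
print("log loss :", log_loss(y_true, scores), log_loss(y_true, scaled))
```

In this sketch, rescaling the scores leaves AUROC and lift unchanged but changes log loss, which is part of why I am unsure which one fits a pure ranking use case.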
Detailed explanations are highly appreciated!