
I am currently doing a fraud detection analysis. I have two groups of data points with several features that I created myself. One group consists of fraudulent people and the other of non-fraudulent people.

I found that the features are so similar between the two groups that it is hard to tell the groups apart based on them. If I plot the data points in 2D space, the points overlap. Still, there are very slight differences between the groups in some features.

My main objective is to find the feature that best distinguishes the two groups. Is there a statistical test I can use to support my hypothesis? Or is there another statistical approach I can use for this problem?
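To make "best distinguishes" concrete, here is a minimal sketch of one possible per-feature comparison, not a recommended method: for each feature, estimate the probability that a randomly chosen fraud value exceeds a randomly chosen non-fraud value (the AUC, which is the Mann-Whitney U statistic divided by the number of pairs) and rank features by how far that is from 0.5. The data is synthetic, and the names `feature_a`, `feature_b`, `auc` and `rank_features` are hypothetical, chosen only for illustration.

```python
import random


def auc(pos, neg):
    """Estimate P(random value from pos > random value from neg); ties count 1/2."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(fraud, non_fraud):
    """Rank features by |AUC - 0.5|, largest (most separating) first.

    fraud / non_fraud: dict mapping feature name -> list of values.
    """
    scores = {name: auc(fraud[name], non_fraud[name]) for name in fraud}
    return sorted(scores.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)


# Synthetic example (assumption): feature_a is slightly shifted in the fraud
# group, feature_b has the same distribution in both groups.
rng = random.Random(0)
fraud = {
    "feature_a": [rng.gauss(0.5, 1.0) for _ in range(300)],
    "feature_b": [rng.gauss(0.0, 1.0) for _ in range(300)],
}
non_fraud = {
    "feature_a": [rng.gauss(0.0, 1.0) for _ in range(300)],
    "feature_b": [rng.gauss(0.0, 1.0) for _ in range(300)],
}

for name, score in rank_features(fraud, non_fraud):
    print(f"{name}: AUC = {score:.3f}")
```

An AUC near 0.5 means the feature barely separates the groups, while values near 0 or 1 mean strong separation. If SciPy is available, `scipy.stats.mannwhitneyu` gives a p-value for the corresponding two-sample test. Note that this looks at one feature at a time and would miss features that only separate the groups in combination.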

Sorry for the very general question. I am just starting out in the field of statistics. I have tried to make it more specific based on the questions I got from you.

kjetil b halvorsen
addicted
  • SVM could solve this with an appropriate kernel (not easy). – user2974951 Sep 26 '18 at 11:11
  • Let me see if my understanding is correct. SVM is a machine learning algorithm used for prediction and classification. How is SVM useful for finding the most distinguishing feature? – addicted Sep 26 '18 at 11:17
  • Lots of machine learning algorithms estimate feature importances. For SVM, look at https://stats.stackexchange.com/questions/2179/variable-importance-from-svm – user2974951 Sep 26 '18 at 12:08
