Sybil detection metric

Asked Jun 23 '14 at 10:40

Active Feb 18 '18 at 14:50

I have a set of user data and I want to build some kind of metric that estimates the probability that a given user is a sybil (a "fake" account).

However, only a very small set of users are known to be sybils with 100% certainty.

How do I use machine learning here?

Also, for now I've built a heuristic metric based on that data, and I need to evaluate it somehow.
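To make the evaluation part concrete, here is a rough sketch of the kind of check I have in mind, assuming the heuristic gives every user a number where higher means "more sybil-like". The function names and the toy scores are made up for illustration. It measures how often a known sybil outranks an unlabeled user, and how many known sybils land among the top-scored users. Since the unlabeled users probably contain hidden sybils, I would expect the first number to understate the real separation.

```python
def labeled_vs_unlabeled_auc(sybil_scores, unlabeled_scores):
    """Fraction of (known sybil, unlabeled user) pairs in which the sybil
    gets the higher score; ties count as half a win."""
    wins = 0.0
    for s in sybil_scores:
        for u in unlabeled_scores:
            if s > u:
                wins += 1.0
            elif s == u:
                wins += 0.5
    return wins / (len(sybil_scores) * len(unlabeled_scores))


def recall_at_top_k(sybil_scores, unlabeled_scores, k):
    """Share of known sybils whose score reaches the k-th highest score
    among all users (ties at the threshold are counted as inside)."""
    all_scores = sorted(sybil_scores + unlabeled_scores, reverse=True)
    threshold = all_scores[k - 1]
    return sum(s >= threshold for s in sybil_scores) / len(sybil_scores)


# Toy, made-up heuristic scores (higher = more sybil-like).
known_sybils = [0.9, 0.8, 0.4]
unlabeled = [0.1, 0.2, 0.3, 0.5, 0.85, 0.15, 0.05]

auc = labeled_vs_unlabeled_auc(known_sybils, unlabeled)
recall = recall_at_top_k(known_sybils, unlabeled, k=3)
print(f"labeled-vs-unlabeled AUC: {auc:.3f}")
print(f"recall of known sybils in top 3: {recall:.3f}")
```

The pairwise loop is quadratic and only there for clarity; for big datasets the same quantity could presumably be computed from a single sort of all scores.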

To sum up: only a small fraction of my data is labeled, and all of the labeled examples belong to a single class (the negative class, i.e. sybils). I need to build a metric to evaluate users and, on top of that, I need to evaluate the "goodness" of that metric.

How do I approach this problem?

P.S. It would be good if this process could scale to big datasets.

edited Feb 18 '18 at 14:50

kjetil b halvorsen

asked Jun 23 '14 at 10:40

esengie

You need some examples of each class. Maybe start with logistic regression, and try to extend it to the case where not all individuals have a known class label. Some ideas can be found here: https://stats.stackexchange.com/questions/174856/semi-supervised-classification-with-unseen-classes – kjetil b halvorsen Feb 18 '18 at 14:49
You can also find a lot about sybil detection by googling: https://www.usenix.org/conference/usenixsecurity13/technical-sessions/presentation/wang – kjetil b halvorsen Feb 18 '18 at 14:52
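A minimal sketch of the logistic regression starting point suggested in the comment above, under one common simplification that is an assumption here and not something stated in the thread: treat every unlabeled user as if they were a regular user, fit "known sybil vs. unlabeled", and use the output only as a ranking score. Because the unlabeled group may contain hidden sybils, the output should not be read as a calibrated probability. The two features and all data values are hypothetical toy numbers, and the plain gradient-descent fit is written with the standard library only to keep the example self-contained.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    """Score in (0, 1); higher means more similar to the known sybils."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the mean logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical two-feature toy data.
known_sybils = [[2.0, 1.8], [2.2, 2.1], [1.9, 2.3]]
hidden_sybil = [2.1, 2.0]   # unlabeled, but looks like the known sybils
ordinary_user = [0.1, 0.2]  # unlabeled, looks like a regular user
unlabeled = [ordinary_user, [0.0, -0.1], [0.3, 0.1], [-0.2, 0.2],
             hidden_sybil, [0.2, -0.3], [0.1, 0.0], [-0.1, 0.1]]

# Label 1 = known sybil, label 0 = unlabeled (NOT "confirmed genuine").
X = known_sybils + unlabeled
y = [1] * len(known_sybils) + [0] * len(unlabeled)

w, b = fit_logistic(X, y)
score_hidden = predict(w, b, hidden_sybil)
score_ordinary = predict(w, b, ordinary_user)
print(f"hidden sybil score:  {score_hidden:.3f}")
print(f"ordinary user score: {score_ordinary:.3f}")
```

Even though the hidden sybil was trained with label 0, it sits next to the known sybils in feature space and so should still be ranked above the ordinary users, which is the property a sybil score needs. For large datasets an existing library implementation would be the more practical choice than this hand-written loop.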
