In a simple binary classification problem there are two classes: class-0 and class-1. In my data I only have samples for class-1 and none for class-0. I am thinking of building a model of the class-1 data alone. When new data arrive, the model would be applied to them and return a probability of how well each new sample fits the model. By comparing that probability with a threshold, I could filter out samples that do not fit.
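To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes an independent Gaussian per feature as the class-1 model, which is only a placeholder for whatever model would actually be appropriate. The function names, the synthetic data, and the 5% rejection fraction are all made up for illustration. Note that the score here is a log-density rather than a calibrated probability.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Fit one independent Gaussian per feature column (an assumed, simplistic model)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero variance so the density stays finite.
    stds = [statistics.pstdev(c) or 1e-9 for c in columns]
    return means, stds


def log_likelihood(model, x):
    """Log-density of sample x under the fitted per-feature Gaussians."""
    means, stds = model
    return sum(
        -0.5 * math.log(2 * math.pi * s * s) - (v - m) ** 2 / (2 * s * s)
        for v, m, s in zip(x, means, stds)
    )


def pick_threshold(model, rows, reject_fraction=0.05):
    """Choose the score below which roughly reject_fraction of the class-1 data falls."""
    scores = sorted(log_likelihood(model, r) for r in rows)
    return scores[int(reject_fraction * len(scores))]


# Synthetic stand-in for the class-1 samples (hypothetical data).
random.seed(0)
class1 = [(random.gauss(0.0, 1.0), random.gauss(5.0, 2.0)) for _ in range(500)]

model = fit_gaussian(class1)
threshold = pick_threshold(model, class1)


def fits_class1(x):
    """Accept a new sample only if its score reaches the threshold."""
    return log_likelihood(model, x) >= threshold


typical_point = (0.1, 5.3)   # close to the bulk of the class-1 data
far_point = (8.0, -10.0)     # far from anything seen in class-1
print(fits_class1(typical_point), fits_class1(far_point))
```

The threshold is taken from the class-1 scores themselves, so by construction about 5% of genuine class-1 samples would be rejected; without any class-0 data I do not see how to tune it against real negatives.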
My questions are:
- Is this a good way to approach such problems?
- Can a RandomForest classifier be used in this case? Would I need to add artificial data for class-0, which I hope the classifier would treat as noise?
- Are there any other ideas that might help with this problem?