Suppose I have a group of users of a paid app, and each month I want to predict which users are not going to renew their subscription. The proportion of users who do not renew is called the churn rate. To do that, I build a binary classification model that, based on individual features of each user (usage time, device used, etc.), calculates the probability of renewing or not renewing.
Would it be a valid approach to apply this model to each user of my app and calculate the churn rate from those individual predictions? Or would it be better to build a model that predicts the monthly churn rate (%) directly, similar to a time series prediction problem, instead of aggregating the individual predictions as in the first approach?
My understanding is that aggregating the individual predictions will not necessarily give the right churn rate. First, I have to choose a probability threshold to classify each instance as positive or negative. But the criterion used to choose that threshold does not have to be aligned with predicting the right proportion of positive instances.
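
To make the concern concrete, here is a minimal sketch on purely synthetic data (not from a real model). It assumes the classifier's probabilities are perfectly calibrated, which a real model may not be. The names `probs`, `classify_and_count`, and `mean_probability` are only illustrative. It compares the rate obtained by thresholding and counting with the rate obtained by averaging the predicted probabilities.

```python
import random

random.seed(0)

# ASSUMPTION: synthetic, perfectly calibrated churn probabilities for
# 10,000 hypothetical users, drawn from Beta(1, 4), whose mean is 0.2.
n_users = 10_000
probs = [random.betavariate(1, 4) for _ in range(n_users)]

# Simulate the actual outcome: each user churns with their own probability.
churned = [random.random() < p for p in probs]
actual_rate = sum(churned) / n_users

# Approach A: classify each user with a 0.5 threshold, then count
# the fraction of users predicted to churn.
threshold = 0.5
classify_and_count = sum(p >= threshold for p in probs) / n_users

# Approach B: average the predicted probabilities; no threshold is involved.
mean_probability = sum(probs) / n_users

print(f"actual churn rate:          {actual_rate:.3f}")
print(f"threshold-and-count (0.5):  {classify_and_count:.3f}")
print(f"mean predicted probability: {mean_probability:.3f}")
```

Under these assumptions, the thresholded count should land well below the actual rate, because most users who end up churning had an individual probability under 0.5, while the mean probability should land close to it. That only holds to the extent the probabilities are calibrated, so it does not by itself settle whether a direct rate model would do better.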