I am using the R party package to fit a prediction model (a classifier) with
"Converted.clicks" as the response variable.
The remaining variables are used as explanatory variables in the model.
Here is the relevant part of my code:
table(DF$Converted.clicks)
#     0     1     2
# 31456    39     6
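library(party)
# Each "+" must end its line; a line that starts with "+" is parsed as a new expression
Formula <- Converted.clicks ~ Day.of.week +
  Device +
  Keyword +
  Quality.score +
  Network..with.search.partners. +
  Ad.group +
  Match.type
ct <- ctree(Formula, data = DF)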
#######################################
Issue:
The Converted.clicks variable is highly imbalanced. The vast majority of the observations
have class "0". As a result, after ctree is applied, all the predictions are "0"; classes
"1" and "2" are never predicted.
My questions are:
Is a decision tree classifier an appropriate model for predicting
as.factor(DF$Converted.clicks)?
If so, how can I balance the response variable, i.e. give the two remaining classes
"1" and "2" a chance of being predicted? If I need to use weights, I would appreciate an
example.
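To make the weights idea concrete, here is a minimal sketch of inverse-frequency case weights passed to ctree. It is an illustration under assumptions, not a recommendation: the data frame `toy` and its simulated response are hypothetical stand-ins for DF, and the weights are rounded to integers because party's ctree documents its `weights` argument as accepting only non-negative integer values.

```r
# Hypothetical toy data standing in for DF (the real data is not shown here)
set.seed(1)
n <- 2000
toy <- data.frame(
  Device = factor(sample(c("desktop", "mobile", "tablet"), n, replace = TRUE)),
  Quality.score = sample(1:10, n, replace = TRUE)
)

# Simulate a rare positive class that depends on the predictors
p1 <- ifelse(toy$Device == "mobile" & toy$Quality.score >= 8, 0.3, 0.005)
toy$Converted.clicks <- factor(rbinom(n, 1, p1), levels = c(0, 1))

# Inverse-frequency class weights, rounded to integers:
# the majority class gets weight 1, rarer classes get proportionally more
counts <- table(toy$Converted.clicks)
class_w <- setNames(as.integer(round(max(counts) / counts)), names(counts))

# One weight per observation, looked up by its class label
w <- unname(class_w[as.character(toy$Converted.clicks)])

# Fit the weighted tree only if party is installed
if (requireNamespace("party", quietly = TRUE)) {
  library(party)
  ct_w <- ctree(Converted.clicks ~ Device + Quality.score,
                data = toy, weights = w)
  # Confusion table of in-sample predictions against the observed classes
  print(table(predicted = predict(ct_w), actual = toy$Converted.clicks))
}
```

With the real data the same lookup would give the three classes weights of roughly 1, 807 and 5243. Weights that large on only 6 observations may well overfit, so the in-sample table above should be read with caution.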
Is there another appropriate model for predicting the number of Converted.clicks? I understand
that a regression tree is meant for a continuous response variable, but in my case
the response variable is an integer count. Please advise.
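For reference, one standard way to treat an integer count as the response is Poisson regression via base R's glm. The sketch below only shows the mechanics. The data frame `toy` and the simulated rate are hypothetical, and whether a Poisson model suits data with this many zeros is an open question here.

```r
# Hypothetical toy data standing in for DF (the real data is not shown here)
set.seed(2)
n <- 1000
toy <- data.frame(
  Device = factor(sample(c("desktop", "mobile", "tablet"), n, replace = TRUE)),
  Quality.score = sample(1:10, n, replace = TRUE)
)

# Assumed rate: conversions become likelier with a higher quality score
rate <- exp(-5 + 0.3 * toy$Quality.score)
toy$Converted.clicks <- rpois(n, rate)

# Poisson GLM with a log link; the response stays an integer count
fit <- glm(Converted.clicks ~ Device + Quality.score,
           family = poisson(), data = toy)

# Expected number of converted clicks per observation (non-negative, not rounded)
expected <- predict(fit, type = "response")
print(summary(expected))
```

The predictions are expected counts rather than class labels, so they would be compared with the observed counts instead of being read as "0", "1" or "2" directly.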