I am trying to build a model to predict the click-through rate (CTR) of advertisements. I have the data in the following format:
ad_id | ad_feature_1 | ad_feature_2 | label (1/0) | count
ad_xs. | 0.2 | 0.5 | 1 | 100
ad_xs. | 0.2 | 0.5 | 0 | 10000
ad_xz. | .. | .. | 1 | 10
and so on, where label = 1 indicates a click and label = 0 indicates that the ad was shown but not clicked. Count is the number of times the row occurs in the data.
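To make the format concrete, here is a minimal sketch of how I read the aggregated rows: summing `count` per ad gives impressions, and the rows with label 1 give clicks. The function name `empirical_ctr`, the tuple layout, and writing the ad id as `ad_xs` are my own assumptions for illustration, and only the two fully specified example rows are used.

```python
from collections import defaultdict

# Aggregated rows: (ad_id, ad_feature_1, ad_feature_2, label, count).
# Only the two fully specified example rows are included here.
rows = [
    ("ad_xs", 0.2, 0.5, 1, 100),    # shown and clicked 100 times
    ("ad_xs", 0.2, 0.5, 0, 10000),  # shown but not clicked 10000 times
]

def empirical_ctr(rows):
    """Return {ad_id: clicks / impressions} from aggregated (label, count) rows."""
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for ad_id, _f1, _f2, label, count in rows:
        impressions[ad_id] += count      # every row is an impression
        if label == 1:
            clicks[ad_id] += count       # only label-1 rows are clicks
    return {ad_id: clicks[ad_id] / impressions[ad_id] for ad_id in impressions}

ctr = empirical_ctr(rows)
print(ctr)  # ad_xs: 100 / 10100, i.e. roughly 1% of impressions are clicks
```

So for a fixed feature vector the two labels together describe a click probability rather than contradicting each other.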
This is an imbalanced data set in which rows with identical feature values carry different labels. How can I build a classifier for this kind of data? Also, I currently do not have user or query data to enrich the features.