In the context of a binary classification problem, if all the predictors (independent variables) are strictly binary {0,1}, is there a specific way to preprocess the data? Assume there are no missing or noisy data points.
More info: I am currently working on a project with gene expression data. The predictors are completely binary, and I am trying to classify genes into two categories, "cancerous" or "non-cancerous". I have tried logistic regression, SVM (linear and RBF kernels), and random forests, as suggested in this thread; however, the AUC for my application is about 0.65 for all of the classifiers. I am trying to get the AUC up. What approach should I be using? I hope there is a specific way to handle this kind of data, and that this is not simply a situation where the inter-class separation is low.
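
For concreteness, AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as half. The sketch below is a minimal, standard-library-only illustration of that definition, applied to each binary predictor on its own as a rough check of how much separation a single feature provides. The toy data and the names `auc` and `per_feature_auc` are hypothetical and are not taken from the project described above.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as P(pos score > neg score) + 0.5 * P(pos score == neg score)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties are very common with {0,1} predictors
    return wins / (len(pos) * len(neg))


def per_feature_auc(X: Sequence[Sequence[int]], y: Sequence[int]) -> List[float]:
    """Use each column of X directly as a score and compute its AUC."""
    n_features = len(X[0])
    return [auc([row[j] for row in X], y) for j in range(n_features)]


# Made-up toy data: 6 samples, 3 binary predictors (illustration only).
X = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 1],
]
y = [1, 1, 1, 0, 0, 0]

feature_aucs = per_feature_auc(X, y)
```

A per-feature AUC near 0.5 suggests that predictor carries little signal on its own, while a value well below 0.5 suggests it is associated with the negative class. This is only a univariate view and says nothing about interactions between predictors.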