I've been trying to predict a minority class, and so far I've built an SVM / boosted tree / random forest / logistic regression / kNN combo. After building and tuning them all and doing feature engineering:
I now have a single combination of them all. Each model's vote is weighted the way simulated annealing thought it should be, and simulated annealing also picked the vote cutoff. (Each model originally cast a 0 or 1 vote and the cutoff was 5; with the weights, the cutoff is no longer a simple vote count. The weight and cutoff search was done on the test set.)
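For concreteness, the weighted vote described above can be sketched roughly as follows. This is only a minimal illustration: the votes, weights, and cutoff are made-up numbers standing in for whatever simulated annealing actually produced, and `weighted_vote` is a hypothetical helper name.

```python
# Hypothetical 0/1 votes for three rows, one column per model
# (order: svm, boosted tree, random forest, logistic regression, knn).
votes = [
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
]

# Made-up weights and cutoff; stand-ins for the values simulated annealing found.
weights = [0.9, 1.4, 1.1, 0.6, 0.3]
cutoff = 2.5


def weighted_vote(row, weights, cutoff):
    """Return 1 if the weighted sum of the 0/1 votes reaches the cutoff, else 0."""
    score = sum(w * v for w, v in zip(weights, row))
    return 1 if score >= cutoff else 0


# Final 0/1 outcome of the combination for each row.
predictions = [weighted_vote(row, weights, cutoff) for row in votes]
print(predictions)
```

With unequal weights the cutoff is a threshold on a weighted score rather than a count of agreeing models, which is why it stops looking like a round number.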
Anyway, I'm now adding an autoencoder neural network as a final step, because my results are not where I'd like them to be (my company's revenue is fairly closely tied to how well this model performs).
Is it a good idea, or theoretically ill-advised, to include the recommendation from my previous mega combo as a new input variable for the neural net? Would it help or just get in the way? And should I include the 0 or 1 votes from all the models, or just the final outcome?
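To make the two options concrete, here is a minimal sketch of what the neural net's input rows would look like in each case. All data and names (`features`, `votes`, `ensemble_pred`, `with_final_outcome`, `with_all_votes`) are hypothetical and only illustrate the shape of the inputs, not an actual pipeline.

```python
# Made-up feature rows (two original features per row); purely illustrative.
features = [
    [0.2, 1.5],
    [0.7, 0.3],
    [0.1, 2.2],
]

# Hypothetical 0/1 votes from each base model, one list per row.
votes = [
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
]

# Hypothetical final 0/1 outcome of the weighted combo for each row.
ensemble_pred = [1, 0, 1]


def with_final_outcome(features, ensemble_pred):
    """Option A: append only the combo's final 0/1 outcome as one extra input."""
    return [row + [pred] for row, pred in zip(features, ensemble_pred)]


def with_all_votes(features, votes):
    """Option B: append every model's 0/1 vote as separate extra inputs."""
    return [row + row_votes for row, row_votes in zip(features, votes)]


inputs_a = with_final_outcome(features, ensemble_pred)
inputs_b = with_all_votes(features, votes)

print(len(inputs_a[0]))  # original features plus one outcome column
print(len(inputs_b[0]))  # original features plus one column per model
```

Option A hands the net a single extra column, while option B keeps each model's vote separate so the net can weight them itself.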