I have a dataset with more than 1,000 features and more than 1 million rows. The target variable is binary (yes or no), and all features are numeric, with values ranging from 0 to over 100,000.
My goal is to understand which features contributed the most to the outcome for each instance. Since my main interest is in which features drive the binary target, interpretability matters more to me than accuracy.
My question is: are decision trees in scikit-learn the best-suited model for interpreting non-linear relationships in a classification problem?
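
To make the question concrete, here is a minimal sketch of the kind of inspection I have in mind. It is not my real pipeline: the data is a small synthetic stand-in from `make_classification`, the depth limit of 4 is an arbitrary choice to keep the tree readable, and `features_on_path` is a hypothetical helper name. It shows the global `feature_importances_` alongside the features tested on one instance's decision path.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Assumption: small synthetic data standing in for the real 1M x 1000 dataset.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, random_state=0
)

# A shallow tree stays readable; max_depth=4 is an arbitrary illustrative choice.
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Global view: impurity-based importance, one value per feature.
top_global = np.argsort(clf.feature_importances_)[::-1][:5]


def features_on_path(model, x):
    """Return the feature indices tested on the path taken by one sample.

    Hypothetical helper, not part of scikit-learn.
    """
    node_ids = model.decision_path(x.reshape(1, -1)).indices
    feats = model.tree_.feature[node_ids]
    # Leaf nodes carry a negative placeholder instead of a feature index.
    return [int(f) for f in feats if f >= 0]


# Per-instance view: which features were actually consulted for the first row.
path_features = features_on_path(clf, X[0])

print("Top global features:", top_global.tolist())
print("Features on the path of sample 0:", path_features)
```

The per-instance list only says which features were consulted for that row, not how much each one moved the prediction, which is part of why I am unsure whether a single tree is the right tool here.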