I'm looking for a method to identify a shortlist of potentially good 2-way interaction terms, rather than trying all possible interactions. A similar question has been asked here before, but in a more general sense and not for a large data set.
The answer given there ("think" about the problem) is not practical for me because I have around 800 features and more than 50K observations. I'd like a way to derive the candidate interactions from the data itself.
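To give a sense of scale, here is a rough count of the candidate 2-way terms. It is only a sketch: it assumes all 800 features may be paired with each other, and the names `n_features` and `n_pairs` are mine, chosen for illustration.

```python
from math import comb

# Assumption: all of the roughly 800 features are eligible for pairing.
n_features = 800

# Number of unordered feature pairs, i.e. candidate 2-way interaction terms.
n_pairs = comb(n_features, 2)

print(n_pairs)  # 319600
```

Even before fitting anything, that is far too many terms to screen by hand, which is why I'm after a shortlist.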
Note: I also tried the random forest (RF) method given as an answer in the link above, but I'm not sure I applied it correctly. The problems I see with RF are: 1) it overfits the training data, so interactions found on the training set don't hold up on the holdout set; 2) $importance doesn't really measure the strength of an interaction, only the strength of each predictor on its own.