I am working with a dataset that is almost entirely categorical. It has 20-30 categorical columns, and some of those columns have as many as 1000 distinct values. If I convert all of them to dummy variables, I will end up with so many features that I don't see how I could ever interpret the results.
What is the best practice for this kind of problem? I care more about the interpretability of my model than about its predictive power. Which modeling techniques work best for categorical data? And do the false numerical relationships that a label encoder introduces outweigh its benefits of reduced dimensionality and easier interpretation?
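To make the trade-off concrete, here is a minimal sketch of the two encodings on a toy column. The column values are made up for illustration and are not from my dataset, and the encodings are written out in plain Python rather than through any particular library.

```python
# Toy column; the values are hypothetical, not from the real dataset.
colors = ["red", "green", "blue", "green", "red"]

# Sorted unique categories give a stable column/code order.
categories = sorted(set(colors))

# Dummy (one-hot) encoding: one 0/1 feature per category.
one_hot = [[int(value == cat) for cat in categories] for value in colors]

# Label encoding: a single integer feature, codes assigned alphabetically.
code_of = {cat: i for i, cat in enumerate(categories)}
labels = [code_of[value] for value in colors]

print(len(one_hot[0]))  # number of dummy features this column produces
print(labels)           # the single label-encoded feature
```

With three categories the dummy encoding yields three features, so a column with 1000 categories would yield on the order of 1000. The label encoding stays at one feature, but its codes imply blue < green < red, an ordering that comes only from alphabetical sorting and means nothing in the data.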