Most of the neural net algorithms I'm aware of require multilevel, ANOVA-type categorical features to be preprocessed into a set of dummy (0,1) variables. So, if one has a single categorical feature such as the teams in the NFL (Bears, Packers, Cowboys, etc.), then each level (team) is transformed into its own 0,1 dummy variable indicating whether a unit (observation, record, entity) belongs to that level. This approach makes NNs computationally feasible but has many drawbacks, including:
- For data with many categorical features, each possessing many labels, the addition of a large number of mostly irrelevant 0,1 dummy variables
- The impossibility of summarizing the structure of explained variance
  - For instance, US residential 5-digit zip codes have about 36,000 possible labels. No one cares about the 'impact' of a single zip code's 0,1 dummy variable on a target, but explaining the overall impact of 'zip codes' as a whole is highly likely to be relevant to understanding variance structure
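
To make the dummy-coding step concrete, here is a minimal sketch in plain Python. The helper name `one_hot` and the four-row sample are my own illustration, not taken from any particular library; real toolkits do the equivalent with their own encoders.

```python
def one_hot(values):
    """Dummy-code a list of category labels.

    Hypothetical helper for illustration only.
    Returns (levels, rows): the sorted distinct labels, and one 0/1
    indicator list per input value with a single 1 at its level's position.
    """
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for v in values:
        row = [0] * len(levels)   # one column per distinct level
        row[index[v]] = 1         # mark membership in exactly one level
        rows.append(row)
    return levels, rows


teams = ["Bears", "Packers", "Cowboys", "Bears"]
levels, encoded = one_hot(teams)

print(levels)
for team, row in zip(teams, encoded):
    print(team, row)
```

The number of columns equals the number of distinct levels, so a feature like zip code would expand into one column per zip code, which is the blow-up described in the list above.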
Do neural net algorithms exist which are able to handle multilevel categorical features without conversion into dummy variables?