I have raw data with about 20 columns (20 features): 10 are continuous and 10 are categorical. Some of the categorical columns can take about 50 distinct values (e.g., U.S. states). After pre-processing, the 10 continuous columns become 10 prepared columns, and the 10 categorical columns become roughly 200 one-hot encoded variables. I am concerned that if I feed all 200 + 10 = 210 features into the neural net, the 200 one-hot features (from the 10 categorical columns) will completely dominate the 10 continuous features.
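To make the feature counts concrete, here is a minimal pure-Python sketch of the encoding described above. The per-column cardinalities and the placeholder category names are hypothetical, chosen only so that they sum to 200 with one 50-value column; the helper names `one_hot` and `encode_row` are likewise made up for illustration and are not from any library.

```python
# Hypothetical cardinalities for the 10 categorical columns (illustrative only;
# just the "about 50 values" and "about 200 total" figures come from the question).
CATEGORICAL_CARDINALITIES = [50, 40, 30, 20, 20, 10, 10, 10, 5, 5]
N_CONTINUOUS = 10


def one_hot(value, categories):
    """Return a one-hot list for `value` over the ordered `categories`."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode_row(continuous, categorical, vocabularies):
    """Concatenate continuous values with one one-hot block per categorical column."""
    encoded = list(continuous)
    for value, categories in zip(categorical, vocabularies):
        encoded.extend(one_hot(value, categories))
    return encoded


# Placeholder vocabularies such as "c0_v0", "c0_v1", ... for each column.
vocabularies = [
    [f"c{col}_v{i}" for i in range(size)]
    for col, size in enumerate(CATEGORICAL_CARDINALITIES)
]

# Encode one dummy row: all continuous values 0.0, first category in every column.
row = encode_row(
    continuous=[0.0] * N_CONTINUOUS,
    categorical=[vocab[0] for vocab in vocabularies],
    vocabularies=vocabularies,
)

n_one_hot = sum(CATEGORICAL_CARDINALITIES)
n_active = sum(1 for x in row[N_CONTINUOUS:] if x == 1.0)
print(len(row), n_one_hot, n_active)  # prints: 210 200 10
```

Under these assumptions, each row has 210 inputs, but only 10 of the 200 one-hot entries are nonzero at a time (one per categorical column).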
Perhaps one approach would be to somehow "group" the one-hot columns that come from the same original categorical column. Is this a valid concern, and is there a standard way of dealing with it?
(I am using Keras, although I don't think it matters much.)