I have a dataset in which new data comes in every day. The inputs contain categorical variables, so I use a one-hot encoder to create dummy variables. If a new category comes in, the number of features increases by one, because it takes one more dimension to assign 0 or 1 for that category. However, I feel that this approach leads to a decrease in prediction performance. In other words, the test error increases even though new data is coming in. Here is a source stating that increasing the number of features reduces performance, and another source describing a similar drop in performance.
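Here is a minimal sketch of what I mean by the feature count growing (a toy example using pandas' `get_dummies`; my actual pipeline uses a one-hot encoder, but the effect is the same):

```python
import pandas as pd

# Day 1: two known categories -> the encoded matrix has two dummy columns
day1 = pd.DataFrame({"color": ["red", "blue", "red"]})
print(pd.get_dummies(day1["color"]).columns.tolist())   # ['blue', 'red']

# Day 2: a new category appears -> the encoded matrix gains a third column
day2 = pd.DataFrame({"color": ["red", "blue", "green"]})
print(pd.get_dummies(day2["color"]).columns.tolist())   # ['blue', 'green', 'red']
```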
My question is: if new data introduces a new category, does that decrease the prediction performance? Would it perhaps be better to quarantine that data as an outlier?