Suppose I want to predict a response with 3 classes. I have two features, $X_1$ and $X_2$, where $X_1$ is continuous and $X_2$ is categorical with 5 categories. How many parameters would the model have if we use the softmax parametrization?
My thinking is that we would have a bias term, a coefficient for $X_1$, and $X_2$ split into 5 separate indicator variables. Would this be possible? Or could we just keep $X_2$ as a single categorical predictor? I'm a bit confused about how to handle the categorical predictor in this case.
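
To make the counting concrete, here is a minimal NumPy sketch of the encodings in question. The helper names (`one_hot`, `design_matrix`, `n_softmax_params`) and the random data are made up for illustration. It assumes the count is the number of design-matrix columns times the number of weight vectors, where the softmax has either one weight vector per class or one per non-reference class, and where $X_2$ gets either 5 indicator columns or 4 (dropping one category as the baseline).

```python
import numpy as np

def one_hot(codes, n_levels, drop_first=False):
    """Indicator columns for integer category codes 0..n_levels-1."""
    columns = np.eye(n_levels)[codes]
    # Dropping the first column treats category 0 as the baseline.
    return columns[:, 1:] if drop_first else columns

def design_matrix(x1, x2_codes, n_levels, drop_first):
    """Stack intercept, continuous X1, and the indicator columns for X2."""
    intercept = np.ones((len(x1), 1))
    return np.hstack([intercept, x1[:, None],
                      one_hot(x2_codes, n_levels, drop_first)])

def n_softmax_params(n_columns, n_classes, reference_class):
    """Columns times weight vectors (K, or K - 1 with a reference class)."""
    n_weight_vectors = n_classes - 1 if reference_class else n_classes
    return n_columns * n_weight_vectors

# Hypothetical data: 10 observations, X2 coded as integers 0..4.
rng = np.random.default_rng(0)
x1 = rng.normal(size=10)
x2 = rng.integers(0, 5, size=10)

full = design_matrix(x1, x2, 5, drop_first=False)   # 1 + 1 + 5 = 7 columns
dummy = design_matrix(x1, x2, 5, drop_first=True)   # 1 + 1 + 4 = 6 columns

print(n_softmax_params(full.shape[1], 3, reference_class=False))   # 7 * 3
print(n_softmax_params(dummy.shape[1], 3, reference_class=False))  # 6 * 3
print(n_softmax_params(dummy.shape[1], 3, reference_class=True))   # 6 * 2
```

Under these assumptions the three counts are 21, 18, and 12. Which of them is the one being asked for depends on whether a baseline category for $X_2$ and a reference class for the response are used.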