I am trying to construct a model for single-label, multiclass classification using Keras in a Jupyter notebook. Here's my model (or see the full Jupyter notebook):
```python
from keras import models, layers

model = models.Sequential()
model.add(layers.Dense(8, activation='relu', input_dim=9))
model.add(layers.Dense(8, activation='relu'))
model.add(layers.Dense(4, activation='softmax'))  # one probability per class

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

hist = model.fit(X_train, y_train, epochs=20, batch_size=256,
                 validation_split=0.2, shuffle=True)
```
I'm getting a training accuracy of 0.8538, a validation accuracy of 0.8533, and a test accuracy of 0.8524. However, these values don't change under any of the following:
- adding/removing hidden layers
- increasing or decreasing the width of each layer (even to 1)
- changing the optimizer to adam
- changing the output activation from softmax to sigmoid (!)
- changing the batch_size
- using a validation_split of 0.1 instead of 0.2
Also, these values are reached within one epoch and don't budge from there. The losses behave similarly, flat-lining at ~0.57. After many epochs I'd expect to see some level of overfitting, but instead the training and validation losses just flat-line at slightly different values.
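For context, here's a quick way to check whether 0.85 is simply the frequency of the most common class (a minimal sketch, assuming y_train is one-hot encoded, as categorical_crossentropy expects):

```python
import numpy as np

# per-class fraction of the training samples; if the largest value
# is ~0.85, constant majority-class predictions would explain the
# flat accuracy
class_freq = np.asarray(y_train).sum(axis=0) / len(y_train)
print(class_freq)
```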
I've tried the following (sketched below):
- normalising the data
- filling missing values
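Roughly what I mean by those two steps (a sketch, assuming X_train is a plain NumPy float array; the exact code is in the notebook):

```python
import numpy as np

# fill missing values with the column means
col_means = np.nanmean(X_train, axis=0)
nan_rows, nan_cols = np.where(np.isnan(X_train))
X_train[nan_rows, nan_cols] = col_means[nan_cols]

# normalise each feature to zero mean and unit variance
X_train = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
```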
I feel there must be something seriously broken with my workflow but can't for the life of me figure out what.
EDITS
- there are 271,116 data points, 1/3 of which are used as test data
- as per Sycorax's suggestion, I reduced the learning rate (see the snippet below); this just slowed the process of arriving at the same values
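For reference, this is roughly how I reduced the learning rate (the exact value of 1e-4 is illustrative):

```python
from keras import optimizers

# same model as above, compiled with a smaller learning rate for RMSprop
model.compile(optimizer=optimizers.RMSprop(lr=1e-4),  # Keras default is 1e-3
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```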