I am currently learning LSTM-RNN models and I have run some tests to see how they work. As with most neural networks, overfitting and underfitting are common problems in ML. I have read articles such as this one: https://machinelearningmastery.com/learning-curves-for-diagnosing-machine-learning-model-performance/ this: https://towardsdatascience.com/learning-curve-to-identify-overfitting-underfitting-problems-133177f38df5 and this: Dealing with LSTM overfitting. All of them talk about detecting overfitting and underfitting using loss curves: the training loss and the test/validation loss. However, in papers I find on Google, I see plots of the real dataset + predictions on the training data + predictions on unseen data; I haven't seen anyone plotting loss curves. So, my question is: how can I tell whether an LSTM-RNN model works well and doesn't overfit/underfit from a plot of (real dataset + predictions on the training data + predictions on unseen data)? Is it possible?
-
You can use k-fold cross-validation to see how well the model generalizes across your dataset. A low cross-validation score means that the particular model isn't correctly learning the trend in your data, either because it is overfitting or because it is underfitting. – Jay Ekosanmi Jan 21 '22 at 16:31
-
I am looking into it, thanks! However, from a plot of real dataset + predictions on unseen data + predictions on training data, can someone tell if the model is overfitting (or underfitting)? – just_learning Jan 21 '22 at 16:40
-
I believe someone who has experience with LSTM-RNN can help me... – just_learning Jan 22 '22 at 00:00
-
So to be clear: you want to know how to detect under-/overfitting, and you want to be able to do this by plotting what exactly? You say you want to plot "real dataset + predictions on trained dataset + prediction on unseen data", but I'm not sure I understand the question. Are you saying this is a regression task, and you want to plot true values (of train and test set), train set predictions and test set predictions, all on one plot, and be able to tell under-/overfitting from that plot? – Vladimir Belik Jan 24 '22 at 16:50
-
@Vladimir Belik Yes, I have built an LSTM-RNN model with 10,000 measurements (split into train and test sets) and I have plotted, in one plot: "real dataset + predictions on the training data + predictions on unseen data". Can I figure out from that plot whether I have underfitting (or overfitting)? Or can I only see that from loss function plots? – just_learning Jan 24 '22 at 17:15
1 Answer
For future readers: I clarified my understanding of the question in the comments.
EDIT: This answer is not specific to LSTMs or neural networks; it applies to any predictive algorithm.
Response: In general, you probably can tell overfitting/underfitting from a single plot of true values (all, train and test) + training data predictions + testing data predictions. However, there are some pretty big issues with doing this, and I don't see why you wouldn't just use more objective methods.
How to do it from the plot: It's pretty straightforward. You know that the definition of overfitting is that the model does much better in training than in testing. Visually, from a plot, you will detect this by seeing that the model's predictions match the true values very closely in the training-set section of the plot, but are noticeably worse/farther away/messier-looking in the test-set section.
For underfitting, you will see in the plot that the predictions are bad/messy/far from the true values in both the training-set section of the plot and the test-set section. As a general note, it is pretty unlikely that you are underfitting with a neural network.
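Just to make that picture concrete, here is a minimal plotting sketch with synthetic placeholder data (the arrays below are made up purely for illustration; in practice they would come from your dataset and `model.predict()`):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder arrays standing in for your true values and model predictions.
rng = np.random.default_rng(0)
y_train = np.sin(np.linspace(0, 20, 300)) + 0.05 * rng.normal(size=300)
y_test = np.sin(np.linspace(20, 25, 75)) + 0.05 * rng.normal(size=75)
train_pred = y_train + 0.02 * rng.normal(size=300)  # fits train closely
test_pred = y_test + 0.30 * rng.normal(size=75)     # much worse on test -> overfitting look

t_train = np.arange(len(y_train))
t_test = np.arange(len(y_train), len(y_train) + len(y_test))

plt.plot(np.concatenate([t_train, t_test]),
         np.concatenate([y_train, y_test]), color="black", label="true values")
plt.plot(t_train, train_pred, label="predictions on training data")
plt.plot(t_test, test_pred, label="predictions on unseen data")
plt.axvline(len(y_train), linestyle="--", color="gray")  # train/test boundary
plt.legend()
plt.show()
```

With data like this, the predictions hug the true curve on the left of the dashed line and scatter widely on the right, which is exactly the visual signature of overfitting described above.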
The problems with doing that (please read!):
- You are working with 10,000 measurements. Visually detecting over-/underfitting from a plot of 10,000 points is going to be very difficult. As in, you'll have to zoom in a ton to be able to tell what's going on. I literally mean that there aren't enough pixels on your screen to easily distinguish what's happening between the train and test sets on a single plot, so unless you zoom in a lot and scroll side to side (annoying and difficult), this will be a pain.
- This method of eyeballing it from the plot is pretty subjective. If it's truly extremely obvious overfitting, you will be able to tell. But besides that, why do this subjective method when you can use an objective one?
My recommendations:
- The most straightforward and objective way to tell if you're overfitting is to compare the error on your training (better yet, cross-validation) set vs. the error on your test set. That is, compute the average error across all points in both sets. If the training/cross-validation error is significantly lower than the test error, there's overfitting. If they're about equal, but both are bad, there's underfitting. (See the sketch after this list.)
- If you insist on having a plot, I would recommend that you plot the errors (prediction minus true value at every point), NOT the true values vs. predictions as you are suggesting (because, again, it's visually hard to tell what's going on). Plot the errors and maybe even run a simple moving average over them to make them even easier to interpret visually (so you don't have what looks like crazy white noise). If you plot this as I'm describing (perhaps one color for train-set error, one color for test-set error), you will probably be able to visually compare the error (and performance) between the two sets. However, why not just do option 1 and have a quantifiable result? (The sketch below shows both approaches.)
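Here is a minimal sketch of both recommendations; the `y_*` and `*_pred` arrays are placeholders for your true values and model predictions, and the moving-average window of 50 is an arbitrary choice:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholders for your true values and predictions (replace with real arrays).
rng = np.random.default_rng(1)
y_train, y_test = rng.normal(size=8000), rng.normal(size=2000)
train_pred = y_train + 0.1 * rng.normal(size=8000)
test_pred = y_test + 0.4 * rng.normal(size=2000)

# Recommendation 1: compare average error on train vs. test.
train_mae = np.mean(np.abs(train_pred - y_train))
test_mae = np.mean(np.abs(test_pred - y_test))
print(f"train MAE: {train_mae:.4f}, test MAE: {test_mae:.4f}")
# test MAE >> train MAE -> overfitting; both high and similar -> underfitting

# Recommendation 2: plot the errors themselves, smoothed with a moving average.
def moving_average(x, w=50):
    return np.convolve(np.abs(x), np.ones(w) / w, mode="valid")

plt.plot(moving_average(train_pred - y_train), label="train error (smoothed)")
plt.plot(moving_average(test_pred - y_test), label="test error (smoothed)")
plt.ylabel("absolute error")
plt.legend()
plt.show()
```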
Best of luck!

-
Now things are much clearer! Thanks!! I have been trying for many days to make k-fold work with LSTM-RNN; it does not seem to be suitable for regression (LSTM-RNN). – just_learning Jan 24 '22 at 19:18
-
@just_learning If you share your code (or just explain what you're doing), I can help you get k-fold CV working. K-fold cross-validation works for any algorithm, including LSTM-RNN; you just need to implement it correctly. As a side note, it seems like you're pretty new to machine learning? So here's a tip I learned when I was starting: while neural networks are super hyped up and exciting, you might consider starting out with RandomForest or XGBoost. They're easier/faster to implement and often give comparable results to NNs, depending on the type of data you're working with. – Vladimir Belik Jan 24 '22 at 19:34
-
I use this: https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/ and more specifically the paragraph: "Manual k-Fold Cross Validation" and as ML model I use this: https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/ and more specifically the paragraph: "LSTM Network for Regression". Also the problems I face: https://datascience.stackexchange.com/questions/107376/typeerror-float-object-is-not-subscriptable – just_learning Jan 24 '22 at 19:53
-
@just_learning That site is a great resource, so I suspect you are implementing it incorrectly. It might help you to try to conceptually understand k-fold cross-validation, so that you can implement it logically in your code (w/o copy-pasting code, etc.). On Google or YouTube, look up "k-fold cross validation explained", and make sure to find diagrams. You should be able to explain to yourself why/how it works, and explain k-fold step by step. This reading/watching shouldn't take more than 30 minutes or an hour. Once you've done that, I think you'll find the error in your code. – Vladimir Belik Jan 24 '22 at 19:59
-
@just_learning If you're confident you understand k-fold CV, but still can't get it to work, feel free to make another question where you provide your code (or maybe some pseudo-code), diagrams, etc. and give me the link. I'll try to take a look and see what's going on, but again - please try to actually understand k-fold CV first. Side note: you are using LSTMs, so I assume you're working on time series prediction? Are you working on a finance dataset? – Vladimir Belik Jan 24 '22 at 20:06
-
I use "distance with ultrasound (in centi-meters)" dataset. Yes, time-series prediction. – just_learning Jan 24 '22 at 20:20
-
This guy here: https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/ under the paragraph "Manual k-Fold Cross Validation" calculates 10 "accuracies", one for each fold of the 10-fold cross-validation, and then calculates the MEAN and STD. If instead of "accuracy" I put "mape", "mae" or "rmse" and do the same thing with MEAN and STD, is this ok? Can I get something that tells me: "OK, the LSTM-RNN model works well"? As I understand it, in k-fold cross-validation we need each calculated "accuracy" to be close to the others... For instance: 80%, 81%, 78%... – just_learning Jan 24 '22 at 20:58
-
@just_learning Yes, it's exactly right to replace "accuracy" with your error metric (MAE, RMSE, etc.). Unfortunately, there's no "one size fits all" way to just say "okay, this is good/bad". It always depends on the situation and it's on a spectrum. Generally, yes, you want the error of the folds to be 1. good/low (in the context of your problem) and 2. relatively consistent. If you are seeing huge differences in error between folds, there's something strange going on. – Vladimir Belik Jan 24 '22 at 21:18
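For what it's worth, here is a rough sketch of what that manual k-fold loop could look like with an LSTM and MAE instead of accuracy. The tiny network, the look-back window of 3, and the synthetic sine series are placeholders, not code from the linked articles; for a time series you may also want `sklearn.model_selection.TimeSeriesSplit` instead of plain `KFold`.

```python
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

# Synthetic univariate series turned into (samples, timesteps, features) windows
# -- a stand-in for the real ultrasound-distance data.
series = np.sin(np.linspace(0, 50, 1000))
look_back = 3
X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
y = series[look_back:]
X = X.reshape((X.shape[0], look_back, 1))

fold_maes = []
for train_idx, val_idx in KFold(n_splits=10, shuffle=False).split(X):
    model = Sequential([Input(shape=(look_back, 1)), LSTM(4), Dense(1)])
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mae"])
    model.fit(X[train_idx], y[train_idx], epochs=20, batch_size=32, verbose=0)
    loss, mae = model.evaluate(X[val_idx], y[val_idx], verbose=0)  # MAE on this fold's held-out data
    fold_maes.append(mae)

print(f"MAE per fold: {np.round(fold_maes, 4)}")
print(f"mean: {np.mean(fold_maes):.4f}, std: {np.std(fold_maes):.4f}")
```

The mean tells you how good the error is overall, and the std tells you how consistent it is across folds, which is exactly the "80%, 81%, 78%" intuition above, just with an error metric instead of accuracy.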
-
I use this command: `cvscores.append(scores[1])` and this: `model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])` and it reads `mae`. This is the output: `loss: 8.6468e-04 - mae: 0.0135 - val_loss: 0.0210 - val_mae: 0.0400 - 22s/epoch - 2ms/step`. How should I interpret `val_mae`? – just_learning Jan 25 '22 at 17:44
-
Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/133507/discussion-between-vladimir-belik-and-just-learning). – Vladimir Belik Jan 25 '22 at 17:47
-
I'm not very familiar with the package you're using and its outputs, I'm sorry. If MAE is just for training, then maybe "val_mae" is for validation? Did you give it a validation set? – Vladimir Belik Jan 27 '22 at 22:22
-
No, I do not give any validation set. Please look at the commands I posted 4 comments up... :-) – just_learning Jan 27 '22 at 22:25
-
You're running cross-validation though, right? In k-fold cross-validation, you have a total of k validation sets/folds, so maybe that metric is just reporting the average MAE from those k validation sets. If this still doesn't make sense, please post a separate question for this, because I don't think I have enough information about your code and packages to answer it. Lastly: whatever package you're using, just type "__(package name)__ __(function name)__ documentation" into Google and you should quickly find the meaning of the val_mae output :) – Vladimir Belik Jan 27 '22 at 22:29
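A note for future readers: in Keras, `val_loss` / `val_mae` only show up when `model.fit()` is given validation data, e.g. via `validation_split` or `validation_data`. This is standard Keras behaviour rather than anything specific to the linked tutorials; the toy data and tiny network below are placeholders purely to make the snippet runnable:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

# Toy windowed series, purely for illustration.
series = np.sin(np.linspace(0, 50, 1000))
look_back = 3
X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
y = series[look_back:]
X = X.reshape((X.shape[0], look_back, 1))

model = Sequential([Input(shape=(look_back, 1)), LSTM(4), Dense(1)])
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mae"])

# val_loss / val_mae appear in the per-epoch log only because fit() is
# given validation data here (the last 10% of the samples).
history = model.fit(X, y, validation_split=0.1, epochs=5, batch_size=32, verbose=2)

print(history.history["mae"][-1])      # training MAE, last epoch
print(history.history["val_mae"][-1])  # MAE on the held-out 10% validation slice
```

So in the output pasted a few comments up, `mae` is the error on the data the model was fit on, and `val_mae` is the error on the held-out validation slice for that epoch; comparing the two is the same train-vs-unseen comparison discussed in the answer.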