I am trying to implement leave-one-out cross-validation (LOOCV) for my time series LSTM model, but I am not sure how to go about it given the structure of my dataset.
My dataset consists of flights with IDs 1-279, each flown on one of five routes labelled R1 - R5. The data for each flight is recorded sequentially in time, and each new flight ID marks the start of a new flight. The table below shows the layout.
flight | time | ... | route |
---|---|---|---|
1 | 0 | ... | R1 |
1 | 0.2 | ... | R1 |
1 | ... | ... | R1 |
1 | 100 | ... | R1 |
2 | 0 | ... | R5 |
2 | 0.2 | ... | R5 |
2 | ... | ... | R5 |
2 | 120 | ... | R5 |
Different flights share the same route; for example, flights 8, 10, 12, etc. all use R5.
What would be the best way to implement LOOCV? Should I train the LSTM on all flights but one, holding out each flight ID in turn, or should the flights be grouped by the route they take?
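To make the two options concrete, here is a rough sketch of the splits I have in mind. The toy `flight` and `route` lists and the helper `leave_one_group_out` are made up for illustration and are not my real data or code. As far as I know, scikit-learn's `LeaveOneGroupOut` produces the same kind of split when given a `groups` array. I am also assuming that grouping by route would mean holding out one whole route per fold.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one group label per fold.

    `groups` has one label per row (a flight ID or a route name).
    Rows keep their original order, so each flight's time series stays intact.
    """
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: 4 flights, 3 time steps each, 2 routes.
flight = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
route = ["R1", "R1", "R1", "R5", "R5", "R5",
         "R1", "R1", "R1", "R5", "R5", "R5"]

# Option A: one fold per flight ID (the held-out flight's route is still seen in training).
folds_by_flight = list(leave_one_group_out(flight))

# Option B: one fold per route (no flight on the held-out route is seen in training).
folds_by_route = list(leave_one_group_out(route))

for held_out, train_idx, test_idx in folds_by_flight:
    print(f"flight {held_out}: train rows={len(train_idx)}, test rows={len(test_idx)}")

for held_out, train_idx, test_idx in folds_by_route:
    print(f"route {held_out}: train rows={len(train_idx)}, test rows={len(test_idx)}")
```

With my real data, option A would give 279 folds (one LSTM fit per flight) and option B would give 5 folds (one per route). In both cases all rows of a flight stay on the same side of the split, so no sequence is cut in the middle.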