Predictor With Lower Mean Absolute Error Ends Up Worse

Question

I have recently been working on estimating vehicle ETAs using ensemble techniques such as LightGBM. As expected, the distance the vehicle travels along its route to the destination is a powerful predictor, but it can only be estimated, since we don't know the vehicle's path until it actually takes it. My previous approach to this estimate was entirely algorithmic and produced decent results, but I thought we could also leverage historical route data to come up with a better one. I utilised another LightGBM model, with features such as the starting point, the destination, and our previous algorithmic distance, to create this better estimate. It ended up having a smaller MAE and a smaller residual variance than our previous solution, which looked highly promising.
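To make the comparison concrete, here is a minimal sketch of how the two distance estimates could be scored against the realised route distance. The numbers are toy values invented for illustration (not my data), kilometres are an assumed unit, and the helper names `mae` and `residual_variance` are hypothetical.

```python
from statistics import mean, pvariance


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def residual_variance(actual, predicted):
    """Population variance of the residuals (actual - predicted)."""
    return pvariance([a - p for a, p in zip(actual, predicted)])


# Toy values only: realised route distance and the two estimates, in km.
actual_km = [10.0, 20.0, 30.0, 40.0]
algorithmic_km = [12.0, 17.0, 34.0, 37.0]  # stand-in for the algorithmic estimate
learned_km = [11.0, 19.0, 31.0, 41.0]      # stand-in for the LightGBM estimate

for name, estimate in [("algorithmic", algorithmic_km), ("learned", learned_km)]:
    print(name, mae(actual_km, estimate), residual_variance(actual_km, estimate))
```

Both metrics only describe how close each estimate is to the true distance on held-out routes; they say nothing yet about how the downstream ETA model uses the feature.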

However, when the previous ETA prediction model was trained on this new feature, its predicted ETAs became significantly worse. I tried reshuffling my train/test splits (thinking it could be a sampling issue), to no avail. Furthermore, when I take the historical average speed of the vehicles (again unobtainable in practice) and divide my distance estimates by it, I also find that the new model's ETAs are better. It's just that, oddly, when the new estimate is put into LightGBM, it performs worse. Is there any reason this could be occurring? Extreme care was taken to avoid information leakage: two entirely separate train/test splits were created, one for the distance model and one for the ETA model.
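The speed-based sanity check above can be sketched roughly as follows. Again the values are invented for illustration, the units (km, km/h, minutes) are assumptions, and `eta_minutes` is a hypothetical helper rather than anything from my actual pipeline.

```python
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def eta_minutes(distance_km, speed_kmh):
    """Naive ETA: estimated distance divided by average speed, in minutes."""
    return 60.0 * distance_km / speed_kmh


# Toy values only: observed trip durations and historical average speeds.
actual_eta_min = [15.0, 30.0, 45.0]
avg_speed_kmh = [40.0, 40.0, 40.0]

# The two competing distance estimates for the same trips, in km.
algorithmic_km = [12.0, 18.0, 33.0]
learned_km = [10.5, 19.5, 30.5]

for name, distances in [("algorithmic", algorithmic_km), ("learned", learned_km)]:
    etas = [eta_minutes(d, s) for d, s in zip(distances, avg_speed_kmh)]
    print(name, mae(actual_eta_min, etas))
```

In this check the better distance estimate translates directly into a better ETA, which is why the degradation inside the LightGBM ETA model is surprising.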

It is difficult to follow this post. You use indirect phrases like "I utilised *another* LightGBM model", "the *previous* ETA prediction model was trained on this new feature", " the *new* model's ETAs are better" and I can't keep up which is which. — Sextus Empiricus, Nov 19 '21 at 20:42
And what do you mean by 'lower mean absolute error ends up worse'? Do you measure something else than the mean absolute error, or do you change the data? What changes? — Sextus Empiricus, Nov 19 '21 at 20:48
This question was worded poorly, however I have found the solution. I will edit the question and outline my findings when I find time ASAP. — James Balajan, Nov 20 '21 at 01:55

0 Answers