I am trying to do time series forecasting with machine learning. I want to engineer lag features, but I'm wondering about the best way to generate these features for the test set (or validation folds). I'm fairly sure I can't just use the test data to engineer these features - should I instead be using the predictions, and generating the lags iteratively?
For example, if I'm only using a 1-period lag, I would use the last entry in my training set as the lag for the first entry in my test set. I make my prediction, then use that as the 1-period lag for the second value in my test set, and so on. Is this the correct way to use lag features with machine learning? And is there a function in sklearn or another library that automates this process?
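To make the iterative approach concrete, here is a rough sketch of what I mean with a 1-period lag, using plain pandas and a basic sklearn model. All of the variable names and the toy data are just mine for illustration; I'm not assuming any particular library helper for building the lags.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy series: a trend plus noise, split chronologically into train/test.
y = pd.Series(np.arange(100, dtype=float) + np.random.normal(scale=2.0, size=100))
y_train, y_test = y[:80], y[80:]

# Train on (lag_1 -> y) pairs built only from the training data.
X_train = y_train.shift(1).dropna().to_frame(name="lag_1")
model = LinearRegression().fit(X_train, y_train[1:])

# Forecast the test period recursively: seed with the last training value,
# then feed each prediction back in as the lag for the next step.
preds = []
last_value = y_train.iloc[-1]
for _ in range(len(y_test)):
    next_pred = model.predict(pd.DataFrame({"lag_1": [last_value]}))[0]
    preds.append(next_pred)
    last_value = next_pred  # use the prediction, not the true test value
```

Is this loop the correct way to do it, or is there a standard function in sklearn or another library that handles this recursive forecasting for me?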