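import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# sp500 is a single-column DataFrame (DATE index, SP500 column)
scaler = MinMaxScaler()
sp500_scaled = pd.Series(scaler.fit_transform(sp500).squeeze(),
                         index=sp500.index)
sp500_scaled.describe()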
This code is from the book Machine Learning for Algorithmic Trading.
It uses a two-layer RNN to predict the S&P 500, a stock market index, so the overall task is time-series prediction. The model takes the data from 63 consecutive time steps as input and predicts the value at the next time step.
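To make the 63-step setup concrete, here is a minimal sketch of how I understand the windowing. The helper name make_windows and the toy series are my own, not the book's code, and the (samples, time steps, features) shape is an assumption about what the RNN layer expects.

```python
import numpy as np

def make_windows(values, window_size=63):
    """Split a 1-D series into (inputs, targets) for one-step-ahead prediction."""
    values = np.asarray(values, dtype=float)
    n_samples = len(values) - window_size
    if n_samples <= 0:
        raise ValueError("series must be longer than window_size")
    # Row i holds values[i : i + window_size]; its target is values[i + window_size].
    X = np.stack([values[i:i + window_size] for i in range(n_samples)])
    y = values[window_size:]
    # Assumed input layout for an RNN: (samples, time steps, features).
    return X.reshape(n_samples, window_size, 1), y

# Toy series with the same length as the S&P 500 data in this question (2011 rows).
series = np.arange(2011, dtype=float)
X, y = make_windows(series)
print(X.shape, y.shape)  # (1948, 63, 1) (1948,)
```

With 2011 observations and a window of 63, this gives 2011 - 63 = 1948 training samples.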
It uses MinMaxScaler to transform all S&P 500 data to the range [0, 1] before feeding it into the RNN model. Before scaling, the data looks like this:
DATE SP500
2012-01-04 1277.30
2012-01-05 1281.06
2012-01-06 1277.81
2012-01-09 1280.70
2012-01-10 1292.08
... ...
2019-12-24 3223.38
2019-12-26 3239.91
2019-12-27 3240.02
2019-12-30 3221.29
2019-12-31 3230.78
2011 rows × 1 columns
After scaling with MinMaxScaler, it looks like this:
DATE
2012-01-04 0.000000
2012-01-05 0.001916
2012-01-06 0.000260
2012-01-09 0.001732
2012-01-10 0.007530
...
2019-12-24 0.991522
2019-12-26 0.999944
2019-12-27 1.000000
2019-12-30 0.990457
2019-12-31 0.995292
Length: 2011, dtype: float64
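As far as I can tell, the scaled values follow the usual min-max formula, (x - min) / (max - min). The sketch below checks this by hand on a few rows from the tables above. The function name min_max_scale is mine, and I am assuming that the rows that scale to exactly 0 and 1 (2012-01-04 and 2019-12-27) hold the minimum and maximum of the series.

```python
def min_max_scale(x, data_min, data_max):
    """Map x linearly so that data_min -> 0 and data_max -> 1."""
    return (x - data_min) / (data_max - data_min)

# Assumption: the rows that scale to 0 and 1 are the series minimum and maximum.
data_min, data_max = 1277.30, 3240.02

rows = [("2012-01-05", 1281.06), ("2012-01-10", 1292.08), ("2019-12-24", 3223.38)]
for date, value in rows:
    print(date, f"{min_max_scale(value, data_min, data_max):.6f}")
# 2012-01-05 0.001916
# 2012-01-10 0.007530
# 2019-12-24 0.991522
```

These match the scaled output above, so the transform is just a linear shift and rescale of the original index values.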
I don't understand why it is necessary to preprocess the time-series data with MinMaxScaler. After training the model, the predictions are of course scaled back to the original range to compute the real prediction error.
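For reference, this is roughly what I mean by scaling the predictions back. It is only a sketch: the predicted_scaled values are made up for illustration rather than real model output, the scaler is fitted on just four index values taken from the tables above, and RMSE is my choice of error metric, not necessarily the book's.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A few index values from the question, shaped (n_samples, 1) as scikit-learn expects.
actual = np.array([[1277.30], [1281.06], [3223.38], [3240.02]])
scaler = MinMaxScaler().fit(actual)

# Hypothetical model outputs in the scaled [0, 1] space (made up for illustration).
predicted_scaled = np.array([[0.0010], [0.0030], [0.9900], [0.9950]])

# Undo the scaling, then measure the error in index points.
predicted = scaler.inverse_transform(predicted_scaled)
rmse = float(np.sqrt(np.mean((predicted - actual) ** 2)))

print(predicted.ravel())
print(f"RMSE in index points: {rmse:.2f}")
```

The inverse transform is the same linear map run backwards, so the error ends up in the same units as the original index.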