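import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# sp500: single-column DataFrame of S&P 500 closes, loaded earlier in the notebook
scaler = MinMaxScaler()
sp500_scaled = pd.Series(scaler.fit_transform(sp500).squeeze(),
                         index=sp500.index)
sp500_scaled.describe()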

This code is from the book Machine Learning for Algorithmic Trading.

https://github.com/stefan-jansen/machine-learning-for-trading/blob/main/19_recurrent_neural_nets/01_univariate_time_series_regression.ipynb

The notebook uses a two-layer RNN to predict the S&P 500 index, a stock market index recorded as a time series. It is a time-series prediction task: the model takes a window of 63 time steps as input and predicts the value at the next time step.
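The 63-step windowing described above can be sketched as follows. This is only an illustration on a toy array; the helper name `make_windows` is hypothetical and is not taken from the notebook, which may build its windows differently.

```python
import numpy as np

def make_windows(series, window_size=63):
    """Return (X, y): each row of X holds `window_size` consecutive values,
    and the matching entry of y is the value that immediately follows."""
    values = np.asarray(series, dtype=float)
    n_samples = len(values) - window_size
    X = np.stack([values[i:i + window_size] for i in range(n_samples)])
    y = values[window_size:]
    return X, y

# Toy stand-in for the scaled series: 100 consecutive numbers
toy = np.arange(100, dtype=float)
X, y = make_windows(toy, window_size=63)
print(X.shape, y.shape)
```

With 100 observations and a window of 63, this yields 37 input windows, each paired with the single value that follows it.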

It uses MinMaxScaler to transform all S&P 500 data to the range [0, 1] before feeding it into the RNN model. Before scaling, the data looks like this:

DATE    SP500
2012-01-04  1277.30
2012-01-05  1281.06
2012-01-06  1277.81
2012-01-09  1280.70
2012-01-10  1292.08
... ...
2019-12-24  3223.38
2019-12-26  3239.91
2019-12-27  3240.02
2019-12-30  3221.29
2019-12-31  3230.78
2011 rows × 1 columns

After scaling with MinMaxScaler, it looks like this:

DATE
2012-01-04    0.000000
2012-01-05    0.001916
2012-01-06    0.000260
2012-01-09    0.001732
2012-01-10    0.007530
                ...   
2019-12-24    0.991522
2019-12-26    0.999944
2019-12-27    1.000000
2019-12-30    0.990457
2019-12-31    0.995292
Length: 2011, dtype: float64
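For reference, the scaled values above are consistent with the usual min-max formula, (x - min) / (max - min). The sketch below checks this on a few closes copied from the table, assuming the series minimum and maximum are the two values that map to 0 and 1 (1277.30 and 3240.02). The variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A few closes from the table above, including the rows that scale to 0 and 1
prices = np.array([1277.30, 1281.06, 1277.81, 3240.02, 3230.78])

# Manual min-max scaling
manual = (prices - prices.min()) / (prices.max() - prices.min())

# Same computation with scikit-learn (expects a 2-D array of shape [n_samples, n_features])
sk_scaled = MinMaxScaler().fit_transform(prices.reshape(-1, 1)).squeeze()

print(np.round(manual, 6))
print(np.allclose(manual, sk_scaled))
```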

I don't understand why it is necessary to preprocess the time-series data with MinMaxScaler. After the model is trained, the predictions are of course scaled back to the original range to compute the real prediction error.
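The scale-back step mentioned above might look roughly like this. It is a sketch under stated assumptions: `pred_scaled` is a made-up stand-in for model output (the true scaled values plus a constant error), not a result from the notebook.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A few closes from the question, as a single-feature column
prices = np.array([[1277.30], [1281.06], [1292.08], [3240.02], [3230.78]])
scaler = MinMaxScaler().fit(prices)
scaled = scaler.transform(prices)

# Hypothetical predictions in scaled units: truth plus a constant error of 0.001
pred_scaled = scaled + 0.001

# Map predictions back to index points before measuring the error
pred_prices = scaler.inverse_transform(pred_scaled)

rmse_scaled = np.sqrt(np.mean((pred_scaled - scaled) ** 2))
rmse_prices = np.sqrt(np.mean((pred_prices - prices) ** 2))
print(rmse_scaled, rmse_prices)
```

Because the transform is linear, an error in scaled units corresponds to that error multiplied by (max - min) in index points, which is why the error is only interpretable after `inverse_transform`.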

user900476
  • Scalers are used to remove discrepancies between the units of different features. Also, Stack Overflow might be a better venue for software debugging questions. – msuzen Jan 05 '22 at 00:08
  • Scaling also avoids biasing dimension reduction techniques such as PCA. – Mayeul sgc Jan 05 '22 at 02:19
