I'm fitting some data with a Gaussian process (GP) in scikit-learn. As I understand it, the GP requires scaling both X (input features) and Y (outputs) to a standard normal distribution (mean = 0, std = 1).
I usually use the following code to scale my data:
from sklearn.preprocessing import StandardScaler

x_scaler = StandardScaler()
y_scaler = StandardScaler()
X_train_scale = x_scaler.fit_transform(X_train)
# StandardScaler expects 2-D input, so a 1-D Y_train needs Y_train.reshape(-1, 1)
Y_train_scale = y_scaler.fit_transform(Y_train)
After fitting and predicting, I use the following code to rescale the results back to the original scale that a human can understand:
gp = GaussianProcessRegressor()  # kernel is defined specifically for each task
gp.fit(X_train_scale, Y_train_scale)
X_test_scale = x_scaler.transform(X_test)
Y_test, std = gp.predict(X_test_scale, return_std=True)
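For reference, here is a minimal self-contained version of the workflow above. The data and the RBF kernel are made up just to get a runnable reproduction; in my real code the kernel and data are different:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(40, 1))
Y_train = np.sin(X_train) * 3.0 + 100.0  # large offset so scaling matters
X_test = np.linspace(0, 10, 20).reshape(-1, 1)

x_scaler = StandardScaler()
y_scaler = StandardScaler()
X_train_scale = x_scaler.fit_transform(X_train)
# StandardScaler wants 2-D input; ravel afterwards so the GP sees a 1-D target
Y_train_scale = y_scaler.fit_transform(Y_train).ravel()

gp = GaussianProcessRegressor(kernel=RBF())
gp.fit(X_train_scale, Y_train_scale)
X_test_scale = x_scaler.transform(X_test)
Y_test, std = gp.predict(X_test_scale, return_std=True)
print(Y_test.shape, std.shape)  # (20,) (20,)
```

Both the predicted mean and the std come back in the scaled Y space here.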
I think we can rescale the Y values using the following line:
Y_test_real = y_scaler.inverse_transform(Y_test)
But I don't know the right way to rescale std. My question is: how do I rescale the std if Y was scaled to a normal distribution at the beginning? This value is very important to me because it gives the confidence interval. Currently, I am using the following line:
std_real = y_scaler.inverse_transform(std)
But the std_real calculated by the line above is very large and not reasonable. I'm not sure whether we could instead scale Y to [0, 1] rather than to a normal distribution. Or, when Y is small (say < 10), can we skip scaling Y altogether?
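To show concretely what I mean by "very large": StandardScaler.inverse_transform computes x * scale_ + mean_, so applied to the std it also adds the training mean of Y, even though a standard deviation is a spread rather than a location. A minimal sketch with made-up numbers (the y_scaler/std names follow the snippets above):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# toy target with a deliberately large mean (~100) and spread (~5)
Y_train = rng.normal(loc=100.0, scale=5.0, size=(50, 1))

y_scaler = StandardScaler()
y_scaler.fit(Y_train)

std = np.array([0.1, 0.2])  # pretend predictive std devs in the scaled space

# inverse_transform applies x * scale_ + mean_, so the training mean (~100)
# gets added to each std value:
std_via_inverse = y_scaler.inverse_transform(std.reshape(-1, 1)).ravel()

# multiplying by scale_ alone rescales the spread without the offset:
std_times_scale = std * y_scaler.scale_[0]

print(std_via_inverse)   # roughly [100.5, 101.0] -- shifted by the mean
print(std_times_scale)   # roughly [0.5, 1.0]
```

So the inflated std_real I observe comes from the mean_ offset that inverse_transform adds.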
Thanks!