First, I simulated some data that satisfy the usual OLS assumptions:
library(nnet)  # provides nnet(), used below

set.seed(1)    # optional: fix the RNG so the simulation is reproducible
n <- 500
x.ols <- runif(n, min = 0, max = 50)
y.ols <- (1/3) * x.ols + rnorm(n, 0, 1)  # true line y = x/3 plus N(0, 1) noise
train <- data.frame(x = x.ols, y = y.ols)
test <- data.frame(x = runif(100, min = 0, max = 50))
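As a quick sanity check (not part of the pipeline itself), a plain OLS fit recovers the simulated line:
summary(lm(y ~ x, data = train))  # estimated slope should be near 1/3, intercept near 0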
Then, I scaled the response and fit a neural network:
range11 <- function(x) 2 * (x - min(x)) / (max(x) - min(x)) - 1  # map to [-1, 1]
unrange <- function(x, train) 0.5 * (min(train) * (-x) + min(train) + max(train) * x + max(train))  # inverse of range11()

anntrain <- train
anntrain$y <- range11(anntrain$y)  # scale only the response

d <- ncol(train) - 1         # number of predictors (here, 1)
annSize <- ceiling(2*d/3)    # rule-of-thumb hidden-layer size (here, 1)

ann.mod <- nnet(y ~ x, anntrain, size = annSize)
ann.pred <- predict(ann.mod, newdata = test)
ann.pred <- unrange(ann.pred, train$y)  # back to the original scale of y
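For what it's worth, the two scaling helpers do invert each other; a quick round-trip check:
y.scaled <- range11(train$y)
all.equal(unrange(y.scaled, train$y), train$y)  # TRUE: unrange() undoes range11()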
Unfortunately, the predictions are flat and don't capture the line. This code works really well for a non-linear pattern, so I'm confused about why it won't work in this simpler case.
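For reference, here is roughly how I'm looking at the fit (the plotting code is just a sketch):
plot(train$x, train$y, col = "grey", xlab = "x", ylab = "y")
abline(0, 1/3, lty = 2)                          # true mean function y = x/3
points(test$x, ann.pred, col = "red", pch = 16)  # predictions sit in a flat band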
Interestingly, if I scale differently, it works like a charm here but is awful in the non-linear case:
anntrain <- train
anntrain$y <- anntrain$y / max(anntrain$y)  # scale the response by its maximum

d <- ncol(train) - 1
annSize <- ceiling(2*d/3)

ann.mod <- nnet(y ~ x, anntrain, size = annSize)
ann.pred <- predict(ann.mod, newdata = test)
ann.pred <- ann.pred * max(train$y)  # back-transform; note max(anntrain$y) is 1 after scaling, so it must be max(train$y) here
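To quantify "works like a charm", the predictions can be compared against the known mean function:
truth <- (1/3) * test$x
sqrt(mean((ann.pred - truth)^2))  # a small RMSE means the fit tracks the true line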
The reason I scaled to (-1, 1) in the first place is that I ran into an issue in a non-linear case and found this post helpful. Scaling to (-1, 1) did help in that case, but it hurts in this one. Is there a consistent way to scale that works "well" for most cases, or did I just happen upon a weird case?