You would like a function $f(x)$ which maps a range $[min,max]$ to $[a,b]$. As taken from this question on SO,
$f(min) = a$
$f(max) = b$
Following the intuition in that question, we end up here for arbitrary $a$ and $b$, given that $a \ne b$ and $b > a$:
$f(x) = \frac{(b-a)(x-min)}{max-min} + a$
Moving this into R, we can construct a function for this, given an input vector r
scale <- function(vector = NULL, lower_bound = NULL, upper_bound = NULL){
if(is.null(vector)){
stop("Please provide input")
} else if(is.null(lower_bound)){
stop("Please provide lower bound")
} else if(is.null(upper_bound)){
stop("Please provide upper bound")
}
min <- min(vector)
max <- max(vector)
a <- lower_bound
b <- upper_bound
new <- lapply(vector, function(x) ((b-a)*(x-min)/(max-min)) + a)
return(unlist(new))
}
Testing:
r <- c(1,2,3,4,5,6,7,8,9,10)
scale(r, -1, 1)
[1] -1.0000000 -0.7777778 -0.5555556 -0.3333333 -0.1111111
[6] 0.1111111 0.3333333 0.5555556 0.7777778 1.0000000
scale(r, 10, 100)
[1] 10 20 30 40 50 60 70 80 90 100
System timing:
For input vector length 10
ptm <- proc.time()
scale(r, -1, 1)
proc.time() - ptm
user system elapsed
0.000 0.001 0.002
For input vector length 10000
r <- runif(10000)
ptm <- proc.time()
scale(r, -1, 1)
proc.time() - ptm
user system elapsed
0.174 0.023 0.169
And for input vector length 1,000,000
r <- runif(1000000)
ptm <- proc.time()
scale(r, -1, 1)
proc.time() - ptm
user system elapsed
3.824 0.063 3.862
So it slows down a little when you get larger, but is still fairly speedy and accurate.
EDIT: Find below a similar function which first separates data into positive and negative portions in order to keep 0 neutral.
scale2 <- function(vector = NULL, lower_bound = NULL, upper_bound = NULL){
if(is.null(vector)){
stop("Please provide input")
} else if(is.null(lower_bound)){
stop("Please provide lower bound")
} else if(is.null(upper_bound)){
stop("Please provide upper bound")
}
min <- min(vector)
max <- max(vector)
a <- lower_bound
b <- upper_bound
positive <- vector[vector >= 0]
negative <- vector[vector <= 0]
p <- lapply(positive, function(x) ((b-0)*(x-min(positive))/(max(positive)-min(positive))) + 0)
n <- lapply(negative, function(x) ((0-a)*(x-min(negative))/(max(negative)-min(negative))) + a)
#delete duplicates
p[p == 0] <- NULL
return(unlist(list(n,p)))
}
Example:
t <- c(-100, -20, -5, 0, 0, 0, 10, 42, 904)
scale2(t, upper_bound = 1, lower_bound = -1)
[1] -1.00000000 -0.20000000 -0.05000000 0.00000000 0.00000000 0.00000000 0.01106195
[8] 0.04646018 1.00000000