
I have a large, sparse dgCMatrix matrix in R:

  • ~200,000 rows
  • ~150,000 columns
  • ~1,000,000,000 non-zero entries

R code to generate the matrix:

library(Matrix)

nrows <- 2e5
ncols <- 1.5e5
nnz <- 1e9
set.seed(42)
i <- sample(1:nrows, nnz, replace = TRUE)
j <- sample(1:ncols, nnz, replace = TRUE)
x <- sparseMatrix(i = i, j = j, x = 1, dims = c(nrows, ncols))

I also have a vector y of length 200,000:

cf <- rnorm(ncols)
cf[sample(1:ncols, ncols/2)] <- 0
y <- (x %*% cf)[, 1] + rnorm(nrows) * 100

I'd like to find a set of weights w that minimizes the mean absolute error (MAE): mean(abs(y - x %*% w)).

I can find weights that minimize RMSE (root mean squared error) using glmnet:

model <- cv.glmnet(x, y, family = 'gaussian', nfolds=5)

But so far I can't find a similar package for minimizing MAE/LAD/L1 error. The closest thing I've found is the flare package, but that doesn't support sparse input.

Does anyone know of any R package (or have good tricks) for trying to solve a large-scale, sparse MAE/LAD/L1 regression problem?

Richard Hardy
Zach
  • As soon as you start generating a fit and computing residuals, the matrix will no longer be sparse, so sparseness appears to be of little consequence or use. A quick and dirty way to approximate a fit like the one you seek is Tukey's *median polish*: see http://stats.stackexchange.com/questions/8251, for instance. – whuber Dec 06 '16 at 23:16
  • It looks like the flare package is using a sparse matrix internally, but I'm not 100% sure. Also, what makes MAE regression different than RMSE regression? It seems possible to fit a good RMSE model while maintaining sparsity, why is that harder than with MAE? Thanks a lot. – Zach Dec 07 '16 at 00:51
  • You created two new tags without Wikis. I think it would be a cleaner approach to avoid that. Also, what about quantile regression, doesn't it fit here? – Richard Hardy Dec 07 '16 at 06:27
  • @RichardHardy Oops. How do I delete the new tags/wikis? Also, I think quantile regression would work here, but I can't find a package for quantile regression that works with such a large, sparse matrix. – Zach Dec 07 '16 at 14:24
  • I think they get deleted automatically if there are no questions tagged. So once I removed the tags from your post, they should disappear from the tag list as well (after a while). About sparse quantile stuff -- unfortunately, I don't have any suggestions. – Richard Hardy Dec 07 '16 at 15:27
  • Have you by any chance come across an answer to your question? It is quite interesting (+1). – usεr11852 Jun 28 '17 at 21:33
  • @usεr11852 I answered my own question. Not perfect, but it works: https://stats.stackexchange.com/a/288223/2817 – Zach Jun 30 '17 at 18:10
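Following up on the quantile-regression suggestion in the comments: median regression (quantile regression at $\tau = 0.5$) minimizes exactly the MAE objective above, and the quantreg package includes a sparse interior-point solver, rq.fit.sfn, that takes a SparseM::matrix.csr design matrix. A minimal sketch (the interface details here are my assumption; a small dense matrix is converted purely for illustration):

```r
# Sketch: LAD fit via sparse quantile regression at tau = 0.5.
# Assumes quantreg's rq.fit.sfn accepts a SparseM::matrix.csr design matrix.
library(SparseM)
library(quantreg)

set.seed(1)
n <- 1000; p <- 10
X <- matrix(rnorm(n * p), n, p)
b <- rnorm(p)
y <- as.numeric(X %*% b) + rnorm(n)

Xs <- as.matrix.csr(X)              # compressed sparse row format
fit <- rq.fit.sfn(Xs, y, tau = 0.5) # median regression = LAD
print(fit$coef)
```

For a truly large dgCMatrix you would need to build the matrix.csr representation directly rather than densifying first.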

2 Answers


It's not perfect, but I ended up rolling my own solution with the optim function. There's probably a lot of room for improvement, but it works:

library(Matrix)
set.seed(42)

# Smaller simulated problem: 1e6 x 100 sparse design at 1% density
X <- rsparsematrix(1e6, 1e2, density = 0.01)
CF <- rnorm(ncol(X))
Y <- as.numeric(X %*% CF) + rnorm(nrow(X))

# Objective: mean absolute error between predictions and targets
mae <- function(p, a) mean(abs(p - a))
optim_fun <- function(p) mae(as.numeric(X %*% p), Y)

# Start from zero weights; BFGS approximates gradients numerically here
init <- rep(0, ncol(X))
model <- optim(init, optim_fun, method = "BFGS")
print(model$value)
plot(model$par ~ CF)  # estimated vs. true coefficients
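One likely speed-up, sketched here assuming X, Y, init, and optim_fun from the block above are in scope: pass optim an analytic subgradient so BFGS does not need ncol(X) finite-difference evaluations per iteration. The MAE is non-differentiable where a residual is exactly zero; sign() yields a valid subgradient there, which BFGS usually tolerates in practice.

```r
# Subgradient of mean(abs(X %*% p - Y)) with respect to p:
# (1/n) * t(X) %*% sign(X %*% p - Y)
mae_subgrad <- function(p) {
  resid <- as.numeric(X %*% p) - Y
  as.numeric(Matrix::crossprod(X, sign(resid))) / length(Y)
}

model <- optim(init, optim_fun, gr = mae_subgrad, method = "BFGS")
```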
Zach

Let $X$ be an $n\times p$ real matrix. The problem is

$$\min_{b} \Vert y - Xb\Vert_1 + \lambda\Vert b\Vert_1$$

This can be reformulated as a linear program (the Rglpk package accepts sparse constraint matrices):

$$\min_{b, r, c} \vec 1\cdot r + \lambda\vec 1\cdot c$$ s.t.

$$Xb - r \le y \le Xb + r$$

$$b - c \le 0 \le b + c$$

$$r, c \ge 0$$

You can implement a cross-validation routine for finding $\lambda$.
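The LP above can be sketched with Rglpk as follows, on a small dense example for clarity (the decision vector is ordered $(b, r, c)$; the problem sizes and $\lambda$ here are illustrative):

```r
# Sketch: L1 (LAD) regression with an L1 penalty as a linear program.
library(Rglpk)

set.seed(1)
n <- 50; p <- 5
X <- matrix(rnorm(n * p), n, p)
b_true <- c(2, -1, 0, 0, 1)
y <- as.numeric(X %*% b_true) + rnorm(n) * 0.1
lambda <- 0.1

# Objective: sum(r) + lambda * sum(c); b itself carries no cost.
obj <- c(rep(0, p), rep(1, n), rep(lambda, p))

I_n <- diag(n); I_p <- diag(p)
Z_np <- matrix(0, n, p); Z_pn <- matrix(0, p, n)

mat <- rbind(
  cbind( X,  -I_n, Z_np),   #  Xb - r <= y
  cbind(-X,  -I_n, Z_np),   # -Xb - r <= -y
  cbind( I_p, Z_pn, -I_p),  #  b - c <= 0
  cbind(-I_p, Z_pn, -I_p)   # -b - c <= 0
)
dir <- rep("<=", nrow(mat))
rhs <- c(y, -y, rep(0, 2 * p))

# b is free; r and c keep Rglpk's default lower bound of 0.
bounds <- list(lower = list(ind = 1:p, val = rep(-Inf, p)))

sol <- Rglpk_solve_LP(obj, mat, dir, rhs, bounds = bounds)
b_hat <- sol$solution[1:p]
print(round(b_hat, 2))
```

For the sparse case, `mat` would be assembled as a sparse matrix (e.g. via slam's simple_triplet_matrix) instead of with dense rbind/cbind.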

user231919