12

Here is the least absolute deviation problem under concerned: $ \underset{\textbf{w}}{\arg\min} L(w)=\sum_{i=1}^{n}|y_{i}-\textbf{w}^T\textbf{x}|$. I know it can be rearranged as LP problem in following way:

$\min \sum_{i=1}^{n}u_{i}$

$u_i \geq \textbf{x}^T\textbf{w}- y_{i} \; i = 1,\ldots,n$

$u_i \geq -\left(\textbf{x}^T\textbf{w}-y_{i}\right) \; i = 1,\ldots,n$

But I have no idea to solve it step by step, as I am a newbie to LP. Do you have any idea? Thanks in advance!

EDIT:

Here is the latest stage I have reached at this problem. I am trying to solve the problem following this note:

Step 1: Formulating it to a standard form

$\min Z=\sum_{i=1}^{n}u_{i}$

$ \textbf{x}^T\textbf{w} -u_i+s_1=y_{i} \; i = 1,\ldots,n$

$ \textbf{x}^T\textbf{w} +u_i+s_2=-y_{i} \; i = 1,\ldots,n$

subject to $s_1 \ge 0; s_2\ge 0; u_i \ge 0 \ i=1,...,n$

Step 2: Construct a initial tableau

           |      |    0      |    1   |  0  |   0   |   0    
 basic var | coef |  $p_0$    |  $u_i$ |  W  | $s_1$ | $s_2$ 
      $s_1$| 0    |  $y_i$    |   -1   |  x  |   1   |   0
      $s_2 | 0    |  $-y_i$   |    1   |  x  |   0   |   1
      z    |      |    0      |    -1  |  0  |   0   |   0

Step 3: Choose basic variables

$u_i$ is chosen as input base variable. Here comes a problem. When choosing the output base variable, it is obvious $y_i/-1=-y_i/1=-y_i$. According to the note, if $y_i\ge0$, the problem has unbounded solution.

I am totally lost here. I wonder if there is anything wrong and how should I continue the following steps.

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
southdoor
  • 121
  • 1
  • 5
  • 2
    Pragmatically, you use a linear program solver instead of writing your own. I reccomend gurobi. – Matthew Drury Feb 07 '17 at 15:25
  • 1
    @MatthewDrury Thanks for you reply. But I want to know exactly how LP works in this problem, instead of just taking the answer. – southdoor Feb 07 '17 at 15:26
  • 1
    Do you know or did you Google 'simplex method'? –  Feb 07 '17 at 15:32
  • 2
    Linear program is just a formulation of your problem in terms of maximizing (or minimizing) of linear goal function subject to some linear constraints. It doesn't "solve" itself. There are a bunch of algorithms that solve these specially formulated programs, one of most commonly used is Simplex – Łukasz Grad Feb 07 '17 at 15:33
  • 1
    @fcop Yes, indeed I have read some notes of simplex method. But I have no idea how to generate it to this problem. As the examples in those notes are very simple and specific. I can not find one begin with general problems. I have already spent two nights in this problem, but still being confused. Sorry. – southdoor Feb 07 '17 at 15:34
  • @ŁukaszGrad Thanks for your reply. Do you have any idea how to solve it with simplex method? I have read some notes about simplex algorithm but still have no idea. – southdoor Feb 07 '17 at 15:36
  • "Solving" a problem with LP is (generally) simply formulating it (correctly) in terms of those linear constraints, so i don't know what you mean by "solving with simplex method". – Łukasz Grad Feb 07 '17 at 15:47
  • @ŁukaszGrad, but simplex **is** an algorithm that can be used for solving a formulated problem, isn't it? – Richard Hardy Feb 07 '17 at 16:38
  • @RichardHardy Yes i stated that in my previous comment. But the author apparently wasn't satisfied with that answer :), so I had no clue what he/she meant. – Łukasz Grad Feb 07 '17 at 16:42
  • @ŁukaszGrad Sorry I wasn't clear enough. What I want indeed is to find the solution via say, simplex method. I will clarify it. Thanks. – southdoor Feb 07 '17 at 16:47

2 Answers2

6

You want an example for solving least absolute deviation by linear programming. I will show you an simple implementation in R. Quantile regression is a generalization of least absolute deviation, which is the case of the quantile 0.5, so I will show a solution for quantile regression. Then you can check the results with the R quantreg package:

    rq_LP  <-  function(x, Y, r=0.5, intercept=TRUE) {
        require("lpSolve")
        if (intercept) X  <-  cbind(1, x) else X <-  cbind(x)
        N   <-  length(Y)
        n  <-  nrow(X)
        stopifnot(n == N)
        p  <-  ncol(X)
        c  <-  c(rep(r, n), rep(1-r, n), rep(0, 2*p))  
                 # cost coefficient vector
        A  <- cbind(diag(n), -diag(n), X, -X)
        res  <-  lp("min", c, A, "=", Y, compute.sens=1)
            ### Desempaquetar los coefs:
        sol <- res$solution
        coef1  <-  sol[(2*n+1):(2*n+2*p)]
        coef <- numeric(length=p)
        for (i in seq(along=coef)) {
             coef[i] <- (if(coef1[i]<=0)-1 else +1) *  
                  max(coef1[i], coef1[i+p])
        }
        return(coef)
        }

Then we use it in a simple example:

    library(robustbase)
    data(starsCYG)
    Y  <- starsCYG[, 2]
    x  <- starsCYG[, 1]
    rq_LP(x, Y)
    [1]  8.1492045 -0.6931818

then you yourself can do the check with quantreg.

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
  • 3
    +1 I am a big fan of doing things manually and differently then compare! – Haitao Du Mar 27 '17 at 16:47
  • 3
    For a post with a little more explanation see [quantile regression](https://stats.stackexchange.com/questions/384909/formulating-quantile-regression-as-linear-programming-problem) – Jesper for President Dec 29 '18 at 22:05
3

Linear Programming can be generalized with convex optimization, where in addition to simplex, many more reliable algorithms are available.

I would suggest you to check The Convex Optimization Book and the CVX toolbox they provided. Where you can easily formulate least absolute deviation with regularization.

https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf

http://cvxr.com/cvx/

Haitao Du
  • 32,885
  • 17
  • 118
  • 213
  • 2
    Thanks for your answer. But when I try to search the term "simplex method" in the book, I can't find any. And the CVX toolbox is just a tool for take input as the LP problem and run the algorithm. But what I really want is how the algorithm works in this problem. Neither the final result, nor how to formulate the problem. But the step to get the result. thanks – southdoor Feb 07 '17 at 17:03