4

Having a "measured" vector $\mathbf{y}$ with its statistics (counts or variance per element), one can use weighted least squares approach to solve the linear system $$\mathbf{A}\mathbf{x} = \mathbf{y}$$ by minimizing $$(\mathbf{y} - \mathbf{A}\mathbf{x})^T{\rm diag}(\mathbf{c})(\mathbf{y} - \mathbf{A}\mathbf{x}),$$ where $\mathbf{c}$ contains either counts or inverse variances.

When this linear problem is ill-posed but the model is sparse I can use the Basis Pursuit Denoising approach looking for a sparse solution:
$$ \mathbf{x}^{\ast} = \arg \min_{\mathbf{x}} \left\{ \frac{1}{2} {\left\| \mathbf{A} \mathbf{x} - \mathbf{y} \right\|}_{2}^{2} + \lambda {\left\| \mathbf{x} \right\|}_{1} \right\} $$

Logically i am tempted to just modify the $\mathbf{A}$ term to $\mathbf{A}^T {\rm diag}(\mathbf{c}) \mathbf{A}$ to include the weights for each element, but I am not sure this is a sound approach for the $L_1$ case. Is this approach valid or is there a different way to include the measurement statistic for such regularization?

Royi
  • 33,983
  • 4
  • 72
  • 179
bla
  • 576
  • 1
  • 3
  • 14
  • Why can't you just replace ${\left\| \mathbf{A} \mathbf{x} - \mathbf{y} \right\|}_{2}^{2}$ with the term using $\mathbf{c}$? It just changes the problem to a weighted least squares. – Peter K. Nov 10 '21 at 23:27
  • that is what I ask more or less... I am not sure it makes sense. – bla Nov 11 '21 at 22:06
  • I think it would relate to https://dsp.stackexchange.com/questions/30147. – Thomas Nov 12 '21 at 06:37

1 Answers1

6

Your formulation:

$$ \arg \min_{\boldsymbol{x}} \frac{1}{2} {\left\| A \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} + \lambda {\left\| \boldsymbol{x} \right\|}_{1} $$

Has 2 elements:

  1. The Fidelity Term
    This is basically measurements term with the model of AWGN with IID noise.
  2. The Regularization Term
    This is a sparse promoting model by using the Laplace Distribution as a prior.

Since your measurement model is not IID but with different variance per measurement what you need is use Mahalanobis Distance in the measurement model (Which matches the Multivariate Gaussian Distribution).

Using some variation of the norm (See Norm with Symmetric Positive Definite Matrix) which is defined as:

$$ {\left\| \boldsymbol{x} \right\|}_{W}^{2} = \boldsymbol{x}^{T} W \boldsymbol{x}, \; \boldsymbol{x} \in \mathbb{R}^{n}, W \in \mathbb{S}_{++}^{n} $$

You may formulate your problem as:

$$ \arg \min_{\boldsymbol{x}} \frac{1}{2} {\left\| A \boldsymbol{x} - \boldsymbol{y} \right\|}_{ {C}^{-1} }^{2} + \lambda {\left\| \boldsymbol{x} \right\|}_{1} $$

Where $ C $ is your covariance matrix (Basically $ \operatorname{diag} \left( \boldsymbol{c} \right) $ on your question).
The above is easy to solve using many methods as it is basically transformed LS by using Cholesky Decomposition of the Covariance Matrix.

Royi
  • 33,983
  • 4
  • 72
  • 179
  • I am trying to understand if I can articulate your answer in CVX (does adding weight render the problem to be non-convex?)... First, if I understand correctly the norm to be minimized $C^{-1}$ is the inverse(?) of the covariance matrix given by $C= (y−Ax)^T diag(c) (y−Ax)$ ?and to obtain the inverse I can use Cholesky Decomposition or etc? – bla Nov 11 '21 at 20:36
  • @bla Adding the weighting does not make the problem non-convex. – Peter K. Nov 11 '21 at 22:39
  • 1
    @bla, Since $ {C}^{-1} = \operatorname{diag} \left( \boldsymbol{c}^{-1} \right) $ you can easily define $ {C}^{-1} $ and use `quad_form()` from `CVX`. You may mark this as answered and open a question specifically how to solve my form in CVX and I will post code. – Royi Nov 12 '21 at 05:48
  • @Royi, thank you will do. – bla Nov 12 '21 at 08:42
  • https://dsp.stackexchange.com/questions/79087/cvx-for-covariance-matrix-norm – bla Nov 12 '21 at 08:54