
I have an $M \times M$ square matrix $A$, an $(M \times N)$ matrix $x$, and an $(M \times N)$ matrix $b$. Knowing $A$ and $b$, I would like to recover $x$ from the equation $Ax=b$. Since $N = p \times q$, consider $x$ as $M$ images of $p \times q$ pixels each. These images are sparse under the total variation $TV(x)$. I would like to solve the following minimization problem for $x$:

Define an operator $g(x)=\parallel x^T\parallel_2$ or $g(x)=\parallel x^T\parallel_1$ (the norm is taken row-wise on $x^T$, i.e., per column of $x$) that maps $M \times N \mapsto N \times 1$.

$$\min \frac{1}{2}\parallel Ax-b\parallel^2_2+k\parallel z\parallel_1,$$ subject to $Fg(x)-z=0$.

$F$ is the finite-difference matrix that takes the numerical gradient of the pixels.
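As a concrete illustration of such a difference matrix, here is a minimal Python/NumPy sketch of a 1D forward-difference operator (the 2D TV operator stacks such matrices for rows and columns; the sizes and names here are hypothetical, not from my actual setup):

```python
import numpy as np

def diff_matrix(n):
    """Forward-difference matrix D of shape (n-1, n): (D v)[i] = v[i+1] - v[i]."""
    D = np.zeros((n - 1, n))
    for i in range(n - 1):
        D[i, i] = -1.0
        D[i, i + 1] = 1.0
    return D

D = diff_matrix(5)
v = np.array([1.0, 3.0, 6.0, 10.0, 15.0])
print(D @ v)  # consecutive differences: [2. 3. 4. 5.]
```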

Unlike in the deblurring + denoising problem, my matrices $A$ and $x$ have different sizes.

The ADMM (Alternating Direction Method of Multipliers) solution to my minimization problem is given as:

$$x^{k+1}=(A^TA+pF^TF)^{-1}(A^Tb+pF^T(z^k-u^k))$$

$$z^{k+1}=S_{k/p}(Fx^{k+1}+u^{k})$$

$$u^{k+1}=u^k+Fx^{k+1}-z^{k+1}$$
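For reference, the shrinkage operator $S_{\kappa}$ in the $z$-update is element-wise soft-thresholding. A minimal Python/NumPy sketch (the function name and test values are my own, for illustration only):

```python
import numpy as np

def soft_threshold(v, kappa):
    """Element-wise soft-thresholding: S_kappa(v) = sign(v) * max(|v| - kappa, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

out = soft_threshold(np.array([-2.0, -1.2, 0.0, 0.4, 1.5]), 1.0)
print(out)  # values inside [-1, 1] are zeroed, the rest shrink toward 0
```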

When calculating $x^{k+1}$, the matrix dimensions are not compatible in my case. The gradient matrix should be applied to the total number of pixels $N$, so $F$ must have $N$ columns. Hence the sum $(A^TA+pF^TF)$ is not well defined.

How can I overcome this and find a solution to my problem?

Royi
johanson
  • Which matrix multiplication exactly causes the dimension mismatch? At first glance they do look okay. $p$ is a scalar, right? – Florian Mar 03 '20 at 07:51
  • Yes, $p$ is a scalar. The dimension mismatch is in the addition of $A^TA$ and $F^TF$. $F$ is a scalar gradient operator which should be applied to all pixels of $x$. Since the total number of pixels in $x$ is $N$, $F$ should be $N \times N$. – johanson Mar 03 '20 at 11:10
  • @Florian what do you think about my question? – johanson Mar 04 '20 at 19:24
  • I don't understand the source of mismatch. $A$ multiplies $x$ in your cost function and $F$ multiplies $x$ in your constraint. Hence $A$ and $F$ have the same number of columns. Hence $A^T A$ and $F^T F$ are exactly the same size. – Florian Mar 05 '20 at 08:11
  • $A$ is not applied to the pixels in $x$. That's why it is $M \times M$. @Florian – johanson Mar 05 '20 at 13:44
  • Huh, weird, your problem formulation looks like $x$ should be a vector, not a matrix. Anyways: If $A$ is $M \times M$ then $F$ needs to have $M$ columns as well. Otherwise your constraint $Fx - z = 0$ makes no sense. – Florian Mar 05 '20 at 13:56
  • I edited my question. Now we apply another operator to x before F. Can you check my question again? @Florian – johanson Mar 05 '20 at 14:07
  • I don't understand your operator $g(x)$. How does it map $M \times N$ to $N \times 1$? Do you mean $(M\cdot N) \times 1$ maybe (i.e., a vectorization)? – Florian Mar 05 '20 at 14:44
  • $g(x)= \parallel x^T\parallel_2$, norm is taken for every row. @Florian – johanson Mar 05 '20 at 14:54
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/105196/discussion-between-johanson-and-florian). – johanson Mar 05 '20 at 15:05

1 Answer


The Error in the Model

The problem is in the dimensions of the linear operator $ A $ in your model compared to the data matrix $ X $: the number of columns of $ A $ must match the number of pixels in each column of $ X $ (each image), while in your case it matches the number of images.

Matrix Form of 2D Linear Operator

Let's try to build the model correctly. Assume our data (images) is given by the set of 2D matrices $ {\left\{ {X}_{i} \right\}}_{i = 1}^{m} $ where $ {X}_{i} \in \mathbb{R}^{p \times q} $. Given a linear operator $ \mathcal{A}: \mathbb{R}^{p \times q} \to \mathbb{R}^{k \times l} $, we have $ {B}_{i} = \mathcal{A} \left( {X}_{i} \right) $.

In order to create the matrix form, we should represent the linear operator by the matrix $ A \in \mathbb{R}^{\left( k l \right) \times \left( p q \right)} $, i.e., the matrix that applies the operator on the column-stacked (see the Vectorization operator) vectors. So we have $ \boldsymbol{x}_{i} = \operatorname{Vec} \left( {X}_{i} \right) $ and $ \boldsymbol{b}_{i} = \operatorname{Vec} \left( {B}_{i} \right) $.

By defining $ X = \left[ \boldsymbol{x}_{1}, \boldsymbol{x}_{2}, \ldots, \boldsymbol{x}_{m} \right] $ and $ B = \left[ \boldsymbol{b}_{1}, \boldsymbol{b}_{2}, \ldots, \boldsymbol{b}_{m} \right] $ we have $ A X = B $.

With this definition everything will work as expected. The same logic works for $ F $.
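A small Python/NumPy sketch of this column stacking (all sizes are hypothetical and $A$ is a random stand-in for the actual operator):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, m = 4, 3, 5            # hypothetical: m images of size p x q
n = p * q

A = rng.standard_normal((n, n))                   # matrix form of the operator
images = [rng.standard_normal((p, q)) for _ in range(m)]

# Vec(X_i): column-stack each image, then gather the vectors as columns of X.
X = np.column_stack([Xi.flatten(order="F") for Xi in images])  # n x m
B = A @ X                                                      # n x m

# Applying A image-by-image gives exactly the columns of B.
b0 = A @ images[0].flatten(order="F")
print(X.shape, B.shape)  # (12, 5) (12, 5)
```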

Remark

If you want $ F $ to work along the rows of $ X $ then set $ Z = F {X}^{T} $.

ADMM for the Vector Case

First, pay attention that the ADMM above works on vectors (you can work with matrices, but then you need to update the Prox operations accordingly).

So $ \boldsymbol{x} \in \mathbb{R}^{n}, A \in \mathbb{R}^{m \times n}, \boldsymbol{b} \in \mathbb{R}^{m} $ and $ F \in \mathbb{R}^{o \times n} $ where $ o $ is set to match the operation of TV on the image columns and rows which $ \boldsymbol{x} $ is a column stack of.

So now the terms do align properly: $ {\left( {A}^{T} A + p {F}^{T} F \right)}^{-1} \in \mathbb{R}^{n \times n} $ and $ \left( {A}^{T} \boldsymbol{b} + p {F}^{T} \left( \boldsymbol{z} - \boldsymbol{u} \right) \right) \in \mathbb{R}^{n} $. So the calculation of $ \boldsymbol{x}^{k} $ is well defined.

Given that $ \boldsymbol{z} \in \mathbb{R}^{o} $ and $ \boldsymbol{u} \in \mathbb{R}^{o} $ their calculation is also well defined.

I guess something in your code doesn't match. But if the code has the terms $ A \boldsymbol{x} - \boldsymbol{b} $ and $ F \boldsymbol{x} $ well defined, then everything else will work.
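A minimal Python/NumPy sketch of the vector-case iteration with consistent dimensions (the sizes, $p$, the regularization weight, and the random $A$, $F$ are all hypothetical stand-ins, not your actual operators):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m_rows, o = 12, 12, 11      # hypothetical: x in R^n, b in R^m, F x in R^o
rho, lam = 1.0, 0.1            # penalty parameter p and l1 weight

A = rng.standard_normal((m_rows, n))   # A and F share n columns ...
F = rng.standard_normal((o, n))        # ... so A^T A + rho F^T F is n x n
b = rng.standard_normal(m_rows)

def shrink(v, kappa):
    """Element-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

x, z, u = np.zeros(n), np.zeros(o), np.zeros(o)
M = A.T @ A + rho * F.T @ F            # well defined: both terms are n x n

for _ in range(50):
    x = np.linalg.solve(M, A.T @ b + rho * F.T @ (z - u))  # x-update
    z = shrink(F @ x + u, lam / rho)                       # z-update
    u = u + F @ x - z                                      # dual update

obj = 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(F @ x).sum()
print(obj)
```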

Royi
  • Yes with dimensions you gave it should work. But I have different dimensions. So how can I solve the problem with matrices that i stated in my question? – johanson Mar 03 '20 at 10:48
  • It works with your dimensions the same way. Again, once you have $ A X - B $ working and $ F X $ well defined all the rest will work. – Royi Mar 03 '20 at 10:51
  • The total number of pixels in $x$ is $N$, and I need to apply $F$ to take the gradient of the pixel values in $x$. So $F$ should be an $N \times N$ matrix, which cannot be added to $A^TA$ because it is $M \times M$. – johanson Mar 03 '20 at 10:58
  • This is what you miss. The right way to do this is column stack the image. If the image is $ p \times q $ then your vector $ \boldsymbol{x} $ has the length $ n = p \times q $. From there follow what's in my answer. – Royi Mar 03 '20 at 11:27
  • If the image is $p \times q$, my $x$ is $M \times p \times q$. Then, taking $N = p \times q$, I can write $x$ as $M \times N$. Am I missing anything? – johanson Mar 03 '20 at 11:37
  • Maybe I didn't get your data set. You have $ M $ images with the same size. What exactly are you after? Applying for each the Total Variation Deblurring? So in order to be efficient you stack all images (Each as a column vector) into a matrix? – Royi Mar 03 '20 at 11:43
  • Please do not consider my problem as an image deblurring + denoising problem, because it is not. I know $A$, $b$, and the relationship $Ax=b$. I also know the gradient of $x$ is sparse. That's why I would like to approach the problem as TV minimization. When I get the $M \times p \times q$ 3D matrix $x$, I will sum the $M$ 'images' and get the final image. I am not sure if I can say I have $M$ images. – johanson Mar 03 '20 at 12:00
  • Again, we need to understand the data structure of your problem. Do you have Matrices where you want to operate on each column or what? – Royi Mar 03 '20 at 12:36
  • I can't see what is unclear in my problem. Could you please be more specific? – johanson Mar 03 '20 at 13:16
  • I want to write MATLAB Code for the problem. Should I generate $ M $ 2D Matrices of $ p \times q $? Now, the matrix $ A $ should be working on each image in a column stack form? – Royi Mar 03 '20 at 13:18
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/105121/discussion-between-royi-and-johanson). – Royi Mar 03 '20 at 13:18
  • @johanson, Anything missing in my answer? Could you please mark it? – Royi Dec 11 '21 at 19:21