
I am reading the following section of the book Deep Learning.

[screenshot of the book's passage on condition numbers]

Can you provide an intuitive explanation of the above section? I don't quite understand the statements "When this number is large, matrix inversion is particularly sensitive to error in the input" and "This sensitivity is an intrinsic property of the matrix itself, not the result of rounding error during matrix inversion. Poorly conditioned matrices amplify pre-existing errors when we multiply by the true matrix inverse."

Sycorax
  • The Wikipedia article on this is quite accessible: https://en.wikipedia.org/wiki/Condition_number#Matrices – Alex R. Dec 19 '17 at 08:20
  • The answer to https://stats.stackexchange.com/questions/371249/matrices-system-that-is-computationally-singular-versus-exactly-singular/371267#371267 may help. – jbowman Mar 25 '19 at 03:07
  • You will find answers in this list: https://stats.stackexchange.com/search?q=condition+number+answers%3A1 – kjetil b halvorsen Mar 25 '19 at 09:13
  • Possible duplicate of [Differing definitions of Matrix Condition Number](https://stats.stackexchange.com/questions/84179/differing-definitions-of-matrix-condition-number) – kjetil b halvorsen Mar 25 '19 at 09:15
  • Other possible dups target: https://stats.stackexchange.com/questions/91653/condition-number-of-covariance-matrix, https://stats.stackexchange.com/questions/258710/multicolinearity-and-condition-number-of-logistic-regresison, https://stats.stackexchange.com/questions/120826/condition-number-of-data-matrix-and-stability-of-ols-estimates, https://stats.stackexchange.com/questions/168259/how-do-you-interpret-the-condition-number-of-a-correlation-matrix, https://stats.stackexchange.com/questions/332483/why-does-matrix-condition-number-change-drastically-when-a-constant-is-added – kjetil b halvorsen Mar 25 '19 at 09:23
  • It may be a duplicate, but I think there is a twist in the question that warrants keeping it open: the word _input_ in the passage is confusing, since they have the expression $A^{-1}x$, where in deep learning you'd assume $x$ is the input they discuss, while in fact it is $A$ that they mean – Aksakal Jul 24 '19 at 15:33
  • Does this answer your question? [What is a consequence of an ill-conditioned Hessian matrix?](https://stats.stackexchange.com/questions/391951/what-is-a-consequence-of-an-ill-conditioned-hessian-matrix) – jpmuc Dec 28 '19 at 11:16

2 Answers


I think you're confused by the usage of the word _input_. Naturally, in a deep learning context we mean a vector $x$ by input. However, in this passage it is the matrix $\textbf A$ that is referred to as the input.

Think of the matrix $\textbf A$ not as a constant, predetermined matrix, but as a parameter that is estimated. Maybe you estimate $\textbf A$ from training data, etc. So, in a way, it is a random quantity itself: a random matrix. Hopefully, your estimation routine is consistent, so that you can improve the precision of the estimate by increasing the amount of training data.

Now, what the passage states is that if the matrix property called the "condition number" is very large, then computing the inverse $\textbf A^{-1}$ is very sensitive to the input $\textbf A$. Any random variation (noise) in the estimate $\hat{\textbf A}$ will result in a widely different outcome of the matrix inversion routine, $\hat{\textbf A}^{-1}$. The condition number tells you how much the input noise is amplified in the output of the inversion routine.
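
To make this concrete, here is a small illustrative sketch (the matrix, vector, and noise level below are made up for demonstration): a tiny perturbation of an ill-conditioned $\textbf A$ produces a large change in $\textbf A^{-1}x$.

```python
import numpy as np

# An ill-conditioned matrix: its columns are nearly linearly dependent.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
x = np.array([1.0, 2.0])

print(np.linalg.cond(A))   # roughly 4e4 -- a large condition number

# Pretend A was estimated with a tiny amount of noise in one entry.
A_hat = A.copy()
A_hat[1, 1] += 1e-4

y     = np.linalg.inv(A)     @ x   # result using the "true" A
y_hat = np.linalg.inv(A_hat) @ x   # result using the noisy estimate

print(y)      # [-9999. 10000.]
print(y_hat)  # [-4999.  5000.]  -- a ~50% change from a ~0.005% change in A
```

Here a perturbation of a single entry by $10^{-4}$ changes the result of the inversion by about 50%, which is exactly the amplification the condition number warns you about.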

Aksakal

In mathematics, a condition number measures how much the output of a function changes in proportion to a change in its input. For example, if a small change in the input results in a small change in the output, the function has a small condition number and is said to be well-conditioned. Conversely, if a small change in the input results in a large change in the output, the function has a large condition number and is said to be ill-conditioned.
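
As an illustrative sketch (the two matrices below are arbitrary examples, not from the book), you can compute the condition number of a matrix with `numpy.linalg.cond` and compare a well-conditioned and an ill-conditioned case:

```python
import numpy as np

# A well-conditioned matrix: its columns point in clearly different directions.
well = np.array([[2.0, 0.0],
                 [0.0, 1.0]])

# An ill-conditioned matrix: its columns are almost parallel.
ill = np.array([[1.0, 1.0],
                [1.0, 1.000001]])

print(np.linalg.cond(well))  # 2.0   -- small input changes cause comparably small output changes
print(np.linalg.cond(ill))   # ~4e6  -- small input changes can cause enormous output changes
```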