TL;DR: Subspaces are low-dimensional, linear portions of the whole signal space that are expected to contain (or lie close to) most of the observable, useful signals, or transformations thereof. They come with additional tools that allow us to compute interesting things on the data.
We are given a set of data. To manipulate it more easily, it is common to embed or represent it in a well-adapted mathematical structure (chosen from the many structures available in algebra or geometry), so as to perform operations, prove things, develop algorithms, etc. For instance, in channel coding, group or ring structures can be better adapted. In a domain called mathematical morphology, one uses lattices.
Here, for standard signals or images, we often assume a linear structure: signals can be weighted and added, as in $\alpha x + \beta y$. This is the basis for linear systems, such as traditional windowing, filtering (convolution), differentiation, etc.
So, a mathematical structure of choice is the vector space: a vector space equipped with tools, namely a dot product (which can be used to compare data) and a norm (to measure distances). These tools help us compute. Indeed, energy minimization and linearity are strongly related.
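As a minimal sketch (with made-up toy signals, nothing taken from the text above), the dot product and the norm already give us concrete ways to compare data:

```python
import numpy as np

# Two toy signals (purely illustrative values).
x = np.array([1.0, 2.0, 0.5, -1.0])
y = np.array([0.9, 2.1, 0.4, -1.2])

dot = np.dot(x, y)                        # inner product: a similarity measure
dist = np.linalg.norm(x - y)              # norm of the difference: a distance
cos_sim = dot / (np.linalg.norm(x) * np.linalg.norm(y))  # normalized comparison

print(f"inner product = {dot:.3f}, distance = {dist:.3f}, cosine = {cos_sim:.3f}")
```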
Then, a signal of $N$ samples naturally lives in the classical linear space of dimension $N$. This space is quite big (think of million-pixel images). It contains an awful lot of other, "uninteresting" data: any $N$-dimensional "random" vector. Most of these have never been and will never be observed, have no meaning, etc.
The reasonable quantity of signals that you can record, up to variations, is very small relative to this big space.
Moreover, we are often interested in structured information. So if you subtract noise effects and unimportant variations, the proportion of useful signals within the whole potential signal space is vanishingly small.
One very useful hypothesis (a heuristic, to help discovery) is that these interesting signals live close together, or at least along regions of the space that "make sense". An example: suppose that some extraterrestrial intelligence has no other detection system than a very precise dog detector. Across the Solar system, they will record almost nothing, except many points located on something vaguely resembling a sphere, with large empty spaces (oceans) and some very concentrated clusters (urban areas). And the point cloud revolves around a center with a constant period, while rotating on itself. These aliens have discovered something!
Anyway, the partial-sphere-looking point cloud is interpretable... maybe a planet?
So, our dog point cloud could have filled the full 3D space, but it is concentrated on a 2D surface (a lower dimension) that seems relatively regular in altitude and smooth: most dogs live at intermediate altitudes.
These smooth, low-dimensional parts of space are sometimes called smooth manifolds or varieties. Their structure and operators allow us to compute things: for instance, distances, distributions, etc. Inter-dog distances make more sense when computed along the Earth's surface (in spherical 2D coordinates) than straight through the planet with the standard 3D norm! But this can still be complicated to deal with. Let us simplify a bit more.
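As an illustrative sketch (the coordinates below are approximate and purely hypothetical), here is the difference between the geodesic distance along the sphere and the straight-line 3D distance through it:

```python
import numpy as np

R = 6371.0  # mean Earth radius in km

def to_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude (degrees) to 3D Cartesian coordinates on the sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return R * np.array([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)])

dog_a = to_xyz(48.85, 2.35)     # a dog in Paris (approximate coordinates)
dog_b = to_xyz(-33.87, 151.21)  # a dog in Sydney (approximate coordinates)

chord = np.linalg.norm(dog_a - dog_b)  # straight line through the planet (3D norm)
angle = np.arccos(np.clip(np.dot(dog_a, dog_b) / R**2, -1.0, 1.0))
great_circle = R * angle               # distance along the surface (geodesic)

print(f"through the planet: {chord:.0f} km, along the surface: {great_circle:.0f} km")
```

The geodesic distance, computed on the 2D manifold, is the one that "makes sense" for dogs walking on the surface.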
Looking a little closer, the dog points almost lie on close-to-flat surfaces: countries, even continents. Those flat surfaces are portions of linear (or affine) subspaces. There, you can compute inter-dog distances even more easily, and design a dog-matching algorithm that will make you rich.
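A minimal sketch of this idea on synthetic points (the data and the rank-2 assumption are made up for illustration): fit the flat patch with an SVD of the centered cloud, then work directly in its low-dimensional coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
basis_true = rng.standard_normal((3, 2))                       # a hypothetical 2D patch in 3D
points = rng.standard_normal((200, 2)) @ basis_true.T + 5.0    # flat cloud, affine offset
points += 0.01 * rng.standard_normal(points.shape)             # small "thickness"

mean = points.mean(axis=0)                  # affine offset of the patch
_, _, vt = np.linalg.svd(points - mean, full_matrices=False)
basis = vt[:2]                              # orthonormal basis of the fitted plane

coords = (points - mean) @ basis.T          # 2D coordinates within the subspace
d_flat = np.linalg.norm(coords[0] - coords[1])   # distance computed in 2D
d_full = np.linalg.norm(points[0] - points[1])   # distance in the ambient 3D space
print(f"in-plane distance {d_flat:.3f} vs ambient distance {d_full:.3f}")
```

Because the cloud is nearly flat, the two distances almost coincide, but the subspace coordinates are smaller, cheaper, and easier to feed to further algorithms.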
The story continues a bit. Sometimes, natural data does not assemble around a clear structure directly. Unveiling this inherent structure is at the core of DSP. To help us in this direction, we can resort to data transformations that concentrate it better (Fourier, time-frequency, wavelets), or to filtering.
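A small illustration of this concentration effect (the signal below is made up; any smooth, structured signal behaves similarly):

```python
import numpy as np

N = 1024
t = np.arange(N) / N
# A structured signal: energy is spread over all N time samples...
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# ...but the Fourier transform concentrates it on a handful of coefficients.
spectrum = np.fft.rfft(signal)
energy = np.abs(spectrum) ** 2
top = np.sort(energy)[::-1]

print(f"fraction of energy in the 4 largest coefficients: {top[:4].sum() / energy.sum():.4f}")
```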
And if we find a suitable subspace, most algorithms become simpler, more tractable, and so on: adaptive filtering, denoising, matching.
[ADDITION] A typical use is the following: a signal can be better concentrated with a well-chosen orthogonal transform. Meanwhile, zero-mean Gaussian noise remains Gaussian under an orthogonal transformation. Typically, the data covariance matrix can be diagonalized. If you sort its eigenvalues in decreasing order, the smallest ones tend to flatten out (they correspond to the noise), while the largest ones more or less correspond to the signal. Hence, by thresholding the eigenvalues, it becomes possible to remove the noise.
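Here is a minimal sketch of that recipe on synthetic data (the threshold rule below is an arbitrary choice for illustration, not a prescribed one):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, rank = 64, 500, 3                              # dimension, realizations, signal rank
basis = np.linalg.qr(rng.standard_normal((N, rank)))[0]   # a low-dimensional signal subspace
clean = rng.standard_normal((M, rank)) @ basis.T           # clean signals living in it
noisy = clean + 0.3 * rng.standard_normal((M, N))          # add zero-mean Gaussian noise

cov = np.cov(noisy, rowvar=False)                    # sample covariance (N x N)
eigvals, eigvecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]                    # re-sort in decreasing order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

keep = eigvals > 2 * np.median(eigvals)              # crude threshold above the noise floor
P = eigvecs[:, keep]                                 # basis of the retained signal subspace
denoised = noisy @ P @ P.T                           # project the data onto that subspace

err_before = np.linalg.norm(noisy - clean) / np.linalg.norm(clean)
err_after = np.linalg.norm(denoised - clean) / np.linalg.norm(clean)
print(f"relative error before {err_before:.3f}, after {err_after:.3f}")
```

The small eigenvalues form the flat "noise floor"; keeping only the dominant eigenvectors and projecting onto their span removes the noise living in the discarded directions while preserving the signal.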