
I am using the SVMStruct function in MATLAB (with an RBF kernel) to classify my data, and it works great. But now I need to compare the distances from the data points to the hyperplane, or to find the data point that is closest to the hyperplane. I can't find a MATLAB function for this, or any description of how it can be done. Could someone please suggest an approach?

chl
AshX

2 Answers


What about just computing it explicitly? If a hyperplane is defined as $\langle \vec a, \vec x \rangle = 0$, then the distance is $$ d(\vec x_0) = \frac{|\langle \vec a, \vec x_0 \rangle|}{\| \vec a \|} $$ (drop the absolute value if you want the signed distance, which also tells you on which side of the hyperplane the point lies). Programming it in MATLAB is easy.
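As a minimal sketch of this formula (in Python rather than MATLAB, with a hypothetical helper name; `b` is an optional offset for hyperplanes of the form $\langle \vec a, \vec x\rangle + b = 0$):

```python
import math

def distance_to_hyperplane(a, x0, b=0.0):
    """Distance from point x0 to the hyperplane <a, x> + b = 0."""
    dot = sum(ai * xi for ai, xi in zip(a, x0)) + b
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    return abs(dot) / norm_a

# The point (3, 4) and the hyperplane x = 0, i.e. a = (1, 0):
print(distance_to_hyperplane([1.0, 0.0], [3.0, 4.0]))  # → 3.0
```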

agronskiy

You can recover the hyperplane explicitly only in the linear-kernel (i.e. dot-product) case. Here, the inputs to the computation are (based on what I could interpret from the documentation and a helpful thread):

  1. SVMStruct.Bias (call it $b$)
  2. SVMStruct.SupportVectors (call it $\{x_j\}$) (Note: These are data points closest to the hyperplane)
  3. SVMStruct.Alpha (call it $\{\alpha_j\}$)

The output is: $w^T = [(\sum_{j}\alpha_jx_j)^T\;\; b]$. The distance of every training point $x_i$ to the hyperplane specified by this vector $w$ is $w^T[x_i;\; 1]/||w||_2$, where $x_i$ is augmented with a 1 so that the bias term $b$ is picked up. (Note that this normalizes by the augmented vector; for the exact Euclidean distance, divide by $\|\sum_j \alpha_j x_j\|_2$ instead, which does not change the ranking of the points.)
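A minimal sketch of this computation (in Python with plain lists; `alpha`, `sv`, and `bias` stand in for `SVMStruct.Alpha`, `SVMStruct.SupportVectors`, and `SVMStruct.Bias`, assuming the alphas already carry the label signs as MATLAB returns them):

```python
import math

def linear_svm_distances(alpha, sv, bias, points):
    """Signed distances of `points` to the hyperplane of a linear SVM."""
    dim = len(sv[0])
    # w = sum_j alpha_j * x_j
    w = [sum(a * x[d] for a, x in zip(alpha, sv)) for d in range(dim)]
    norm_w = math.sqrt(sum(wd * wd for wd in w))
    # decision value w.x + b, normalized by ||w||
    return [(sum(wd * xd for wd, xd in zip(w, x)) + bias) / norm_w
            for x in points]

# Toy example: support vectors (1,0) and (-1,0) with alphas +1 and -1
# give w = (2, 0); the point (3, 0) is then at distance 3.
print(linear_svm_distances([1.0, -1.0], [[1.0, 0.0], [-1.0, 0.0]], 0.0,
                           [[3.0, 0.0]]))  # → [3.0]
```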

For the RBF kernel, the classifier or regressor has the form $\sum_{i=1}^n \alpha_i K(x_i,x) + b$, where $n$ is the number of training examples, $K$ is the chosen kernel, and $\{x_i\}$ are the training data points. The hyperplane lives in a possibly higher-dimensional (even infinite-dimensional) feature space. This hyperplane is of course different from the decision boundary (which is non-linear), which you can visualize when you have only 2-dimensional features.
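Even without an explicit $w$, you can still evaluate the signed decision value $f(x) = \sum_i \alpha_i K(x_i,x) + b$, which ranks points by how far they are from the boundary; to turn it into a feature-space distance, divide by $\|w\|_{\mathcal{H}} = \sqrt{\sum_{i,j}\alpha_i\alpha_j K(x_i,x_j)}$. A sketch with a Gaussian kernel, using the same hypothetical variable names as above:

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq)

def rbf_decision_value(alpha, sv, bias, x, gamma=1.0):
    """Signed decision value f(x) = sum_i alpha_i K(x_i, x) + bias."""
    return sum(a * rbf_kernel(s, x, gamma) for a, s in zip(alpha, sv)) + bias

# A point far from every support vector gets f(x) ≈ bias,
# because all kernel terms decay to (numerically) zero.
print(rbf_decision_value([1.0, -1.0], [[0.0], [1.0]], 0.5, [100.0]))  # → 0.5
```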

Notation: vectors are in column format.

  • Thanks @Theja, that really helps. The thread you linked is also very helpful. I just have one question: in the equation $w^T = [(\sum_{j}\alpha_jx_j)^T\;\; b]$, is it supposed to be $w^T = (\sum_{j}\alpha_jx_j)^T + b$? – AshX Oct 04 '13 at 15:58
  • No: $w$ is a vector whose first $d$ coordinates are $\sum_j\alpha_j x_j$ and whose $(d+1)$-th coordinate is $b$. Here $d$ is the dimension of the feature vector. – Theja Tulabandhula Oct 05 '13 at 16:44