I am a beginner in neural networks and machine learning. I am working with a neural network with 1 hidden layer. I took a spiral data set and I am trying to overfit the data.

I applied a neural network to it and I am getting 98% accuracy. But I am getting the classification boundary shown in the 2nd figure. I mean, why am I getting red color on the yellow side and blue color on the red side?

I should get a boundary like the one in the right figure.

[Figure: the decision boundary obtained vs. the expected boundary]

Is there a reason why I am not getting such a boundary even though I am achieving high accuracy? Or can you tell me what precautions I should take to avoid such problems?

1 Answer

The reason is that you are NOT asking the model to provide "a desired boundary", but simply asking it to correctly classify your data.

There exist infinitely many decision boundaries that achieve the same classification task with the same accuracy.

When we use a neural network, the model can choose whatever boundary it wants. In addition, the model does not know the shape of the data (the ground truth / generative model / spiral shape in your example). It will just select one "working" decision boundary, not one that is "really optimal for the true distribution" (as indicated in your figure 3).
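To see this concretely, here is a minimal sketch (assuming scikit-learn; not your actual code, and the data generation is made up): two networks with the same architecture both fit the training data, yet can disagree in regions far from the data, where the training objective imposes no constraint at all.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Two-arm spiral: class 1 is class 0 rotated by 180 degrees, plus noise.
rng = np.random.default_rng(0)
n = 200
theta = np.sqrt(rng.uniform(0, 1, n)) * 2 * np.pi
arm = np.c_[theta * np.cos(theta), theta * np.sin(theta)]
X = np.vstack([arm, -arm]) + rng.normal(scale=0.3, size=(2 * n, 2))
y = np.r_[np.zeros(n), np.ones(n)]

# Same architecture and data, different random initializations.
net_a = MLPClassifier(hidden_layer_sizes=(50,), solver="lbfgs",
                      max_iter=5000, random_state=1).fit(X, y)
net_b = MLPClassifier(hidden_layer_sizes=(50,), solver="lbfgs",
                      max_iter=5000, random_state=2).fit(X, y)
print(net_a.score(X, y), net_b.score(X, y))  # both fit the training data well

# Probe a grid far outside the data: nothing in the loss constrains the
# boundary there, so two "equally good" nets may disagree.
gx, gy = np.mgrid[-20:20:60j, -20:20:60j]
grid = np.c_[gx.ravel(), gy.ravel()]
disagree = np.mean(net_a.predict(grid) != net_b.predict(grid))
print(f"fraction of the grid where the nets disagree: {disagree:.2f}")
```

Both nets satisfy the "classify the training points" objective equally well; any disagreement between them away from the data is exactly the freedom the model has in choosing a boundary.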

If you want more control over the decision boundary, please check out support vector machines. In fact, even if you use an SVM, the decision boundary may not be what you expected, because it will maximize the "margin" while still having no idea about the true distribution (spiral or any other shape).
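A quick sketch of that caveat (again assuming scikit-learn, with made-up spiral data and arbitrary example values of `gamma`): the maximum-margin boundary an RBF-kernel SVM finds depends on the kernel parameter, and none of the resulting boundaries is derived from the spiral's generative process.

```python
import numpy as np
from sklearn.svm import SVC

# Same kind of two-arm spiral data as in the question.
rng = np.random.default_rng(0)
n = 200
theta = np.sqrt(rng.uniform(0, 1, n)) * 2 * np.pi
arm = np.c_[theta * np.cos(theta), theta * np.sin(theta)]
X = np.vstack([arm, -arm]) + rng.normal(scale=0.3, size=(2 * n, 2))
y = np.r_[np.zeros(n), np.ones(n)]

# gamma controls how tightly the RBF margin bends around the support
# vectors; each value gives a different "maximum margin" boundary.
for gamma in (0.1, 1.0, 10.0):
    svm = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    print(gamma, svm.score(X, y), len(svm.support_))
```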


As mentioned in the comments, different types of models have different decision boundaries. For example, logistic regression and linear discriminant analysis (LDA) will have a line (or a hyperplane in a high-dimensional space) as the decision boundary, and quadratic discriminant analysis (QDA) will have a quadratic curve.
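A small sketch makes that restriction concrete (assuming scikit-learn, with made-up disc-vs-ring data): LDA can only draw a line, so it fails when the true boundary is a circle, while QDA's quadratic boundary handles it.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(0)
# Class 0 inside a disc, class 1 in a ring around it: the true boundary
# is a circle (quadratic), which no line can represent.
r = np.r_[rng.uniform(0, 1, 200), rng.uniform(2, 3, 200)]
ang = rng.uniform(0, 2 * np.pi, 400)
X = np.c_[r * np.cos(ang), r * np.sin(ang)]
y = np.r_[np.zeros(200), np.ones(200)]

lda = LinearDiscriminantAnalysis().fit(X, y)
qda = QuadraticDiscriminantAnalysis().fit(X, y)
print("LDA accuracy:", lda.score(X, y))  # near chance: a line cannot separate a ring
print("QDA accuracy:", qda.score(X, y))  # high: a quadratic boundary can
```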

[Figure: decision boundaries of LDA (linear) and QDA (quadratic)]

Finally, my answer to another question gives some examples of different models' decision boundaries:

Do all machine learning algorithms separate data linearly?

Haitao Du
  • So, this means we can't control such things with a neural network? – Shubham Sharma Dec 30 '16 at 05:18
  • The short answer is no. But I bet you can design an "objective function" that describes the boundary; note this is totally different from the "correctly classify all data points" objective. Then use a NN to optimize such an objective. – Haitao Du Dec 30 '16 at 05:31
  • Is SVM really intrinsically better here? I think it would depend completely on the features/kernel? I remember playing [here](http://playground.tensorflow.org) and for their (2-class) spiral, a deeper network was needed. (Particularly if only $x$ and $y$ were input as "features"). Probably regularization would also be important? – GeoMatt22 Dec 30 '16 at 05:56
  • @GeoMatt22 I was trying to say that the NN has little to do with the boundary, while the SVM has something to do with the boundary, but it may not give the "true boundary". – Haitao Du Dec 30 '16 at 06:18
  • OK. I believe SVM is commonly applied to "features" derived from a deep NN? In that context the "default" top-level classifier for NN would be softmax (?) So then the "feature-independent" comparison would be SVM vs. softmax? – GeoMatt22 Dec 30 '16 at 06:44
  • The OP mentioned overfitting. Isn't this a problem that affects all techniques? I know more about discriminant analysis. Should I tell the OP that discriminant analysis will cure the problem? (rhetorical question). – Michael R. Chernick Dec 30 '16 at 14:37
  • Some methods presuppose certain shapes for the boundary (e.g., linear discriminant analysis presupposes linear boundaries). The main objective of classification is to make predictions on unlabeled data. Overfitting tends to hurt prediction. It is more realistic to assume some classes have overlapping boundaries, making perfect prediction impossible. – Michael R. Chernick Dec 30 '16 at 14:45
  • @MichaelChernick Great comment! From what I read from the OP, I am not sure how helpful a really detailed answer in technical terms would be. But it is definitely worth covering logistic regression and LDA in a concise way. I will try to revise my answer when I get time. – Haitao Du Jan 04 '17 at 18:01