SVM is a transformation-based classifier. It transforms your data into a space where it can find a hyperplane that best separates examples (instances) of different classes.
In your graph, each point represents an example. They are scattered according to the values of their features in the space found by SVM (which can be the space of the original data).
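For concreteness, here is a minimal sketch in Python of fitting such a classifier; the data are synthetic (I don't have your actual points), and the cluster locations and labels are just placeholders:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two clusters of 2-D points standing in for the two classes in the plot
# (cluster centres and spread are made up).
class_a = rng.normal(loc=[-3.0, 3.0], scale=1.0, size=(20, 2))
class_b = rng.normal(loc=[3.0, -3.0], scale=1.0, size=(20, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 20 + [1] * 20)

# A linear kernel means the "space" the SVM separates in is the original one.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
```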
The solid line is the optimal separating hyperplane, the dashed lines are the margins, and the points lying on the dashed lines are called support vectors. They are all related:
- the hyperplane is the one that best separates the classes, i.e. the one that maximizes the margin between them;
- the margins are equidistant from the hyperplane. They are labelled +1 and -1 because the SVM's decision function evaluates to exactly +1 or -1 for instances lying on them;
- the support vectors are the "hardest instances" of your problem. They are the ones closest to the optimal hyperplane: you cannot find another hyperplane with a larger margin without pushing some of them past the margin (see the sketch after this list).
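Here is a sketch of how those three ingredients can be read off a fitted model in scikit-learn; again, the data are synthetic and only stand in for the points in your graph:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3.0, 3.0], 1.0, (20, 2)),
               rng.normal([3.0, -3.0], 1.0, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]          # hyperplane: w . x + b = 0
print("hyperplane normal:", w, "intercept:", b)
print("support vectors:\n", clf.support_vectors_)
# The margins are the parallel lines w . x + b = +1 and w . x + b = -1;
# the decision function is exactly +/-1 at points lying on them.
print("decision values at the support vectors:",
      clf.decision_function(clf.support_vectors_))
```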
When you want to classify new data, the only training instances you need are the support vectors. Suppose you want to classify a new instance whose features are $x=8$ and $y=-8$. The SVM only has to compare this example with the support vectors (through the kernel) to know which side of the hyperplane it falls on. Our instance $(8,-8)$ falls on the side of the orange instances, so the SVM classifies it as the "orange class".
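As a sketch (again with synthetic data and made-up class labels), classifying $(8,-8)$ looks like this:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3.0, 3.0], 1.0, (20, 2)),   # "blue" cluster
               rng.normal([3.0, -3.0], 1.0, (20, 2))])  # "orange" cluster
y = np.array(["blue"] * 20 + ["orange"] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

new_point = np.array([[8.0, -8.0]])
# The decision function is a weighted sum of kernel values between the new
# point and the support vectors; its sign gives the side of the hyperplane.
print(clf.decision_function(new_point))
print(clf.predict(new_point))   # -> "orange" for this synthetic layout
```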
Notice that the SVM never has to compute the transformed feature values explicitly. Instead, it uses a function called a kernel, which returns the inner product (a similarity) between instances in the feature space without actually transforming them. The transformation is implicit. This is what makes it possible for the SVM to work in very complex (even infinite-dimensional) spaces.
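For example, the RBF kernel gives the similarity between two points in an implicit, very high-dimensional feature space while working entirely in the original two dimensions; the points and the gamma value below are arbitrary:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[8.0, -8.0]])
b = np.array([[3.0, -3.0]])

# Similarity in the implicit feature space, computed directly in the input
# space: exp(-gamma * ||a - b||^2). No explicit transformation is ever built.
print(rbf_kernel(a, b, gamma=0.1))
```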
Also, this looks like an example of a hard margin SVM. The hard margin SVM is obtained by solving an optimization problem in which all instances of one class must fall on one side of the margin and all instances of the other class on the opposite side. This is a very strict constraint, so in practice we use the soft margin SVM, whose cost function tolerates a few instances on the "wrong side" of the margin. This reduces the variance (overfitting) of the model and in turn helps it generalize better to unseen data.
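In scikit-learn the softness of the margin is controlled by the parameter C (a small C tolerates more margin violations, a large C approaches the hard margin). The sketch below, on synthetic overlapping data, shows how the number of support vectors changes with C:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Overlapping clusters, so a hard margin would be impossible or very fragile.
X = np.vstack([rng.normal([-1.0, 1.0], 1.5, (50, 2)),
               rng.normal([1.0, -1.0], 1.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C}: {len(clf.support_vectors_)} support vectors")
```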