Cyborg's answer is basically right, but perhaps a little too strong.
Classification and PCA are not particularly strongly linked. People sometimes use PCA to reduce the dimensionality of their data set before classification. This is optional, and it's often done for practical reasons: either they know a priori that the first few PCs contain the signal and the rest are noise, or their computing platform simply can't handle the full-rank matrix (less of an issue now, but maybe you're using a huge data set or are on embedded hardware).
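To make the "reduce first" step concrete, here is a minimal sketch using plain NumPy. The data are synthetic and the helper name `pca_reduce` is made up for illustration; the assumption baked in is that only three latent directions carry signal and everything else is small noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 50 features,
# where only 3 latent directions carry signal and the rest is small noise.
n_samples, n_features, n_latent = 200, 50, 3
latent = rng.normal(size=(n_samples, n_latent)) * np.array([10.0, 6.0, 4.0])
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))


def pca_reduce(X, k):
    """Project X onto its first k principal components (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return X_centered @ Vt[:k].T, explained[:k]


X_reduced, explained = pca_reduce(X, k=3)
print(X_reduced.shape)   # (200, 3)
print(explained.sum())   # fraction of total variance kept by 3 PCs
```

On data like this, the 50-column matrix shrinks to 3 columns while keeping nearly all of the variance. Real data are rarely this clean, so how many PCs to keep is a judgment call.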
Having done PCA, you're still free to do whatever you want next. Your choice of classification or clustering algorithm is totally unconstrained. Throw it into an SVM, pipe it through a naive Bayes classifier, whatever; it's your choice. That said, some algorithms make a little less sense on PC-transformed data. For example, one of the major strengths of decision trees is that they produce an interpretable model, and this might be trickier on the PC-transformed data, where each split is on a linear combination of the original features rather than on a single one.
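The interpretability point can be illustrated with a small NumPy sketch. The data and the feature names `x1`..`x4` are hypothetical; the assumption is that the four features share one common factor. A tree split of the form "PC1 <= t" then turns into a threshold on a weighted sum of every original feature.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 4 features driven by one shared factor plus noise.
feature_names = ["x1", "x2", "x3", "x4"]
factor = rng.normal(size=(300, 1))
weights = np.array([1.0, 0.8, 0.6, 0.9])
X = factor * weights + 0.3 * rng.normal(size=(300, 4))

# PCA via SVD of the centered data; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1_loadings = Vt[0]

# A split "PC1 <= t" on the transformed data reads, in original terms, as
#   sum_j loading_j * (x_j - mean_j) <= t
terms = [f"{w:+.2f}*{name}" for w, name in zip(pc1_loadings, feature_names)]
print("PC1 =", " ".join(terms))
```

Every feature gets a non-trivial weight here, so a rule that would read "x2 <= 1.5" on the raw data becomes a statement about a blend of all four columns. The tree still works; it's just harder to explain.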