Consider the following popular data mining methods: SVM, neural networks, logistic regression, random forests, classification trees, the Naïve Bayes classifier, and regression.

  • How do they compare in terms of their respective advantages, disadvantages, limits of applicability, performance, etc.?
  • What are their underlying relationships?
Has QUIT--Anony-Mousse
user785099
  • Did you take a look at this question: [What is a good resource that includes a comparison of the pros and cons of different classifiers?](http://stats.stackexchange.com/q/17066/930) – chl Dec 06 '11 at 07:18
  • Note that these are all **machine learning** methods. To many, "data mining" refers mostly to "unsupervised" knowledge discovery methods with a strong connection to database management for efficiency. I know that the Weka book is called "Data Mining". Did you know it was supposed to be called "Practical Machine Learning", and that it was renamed to the buzzword "data mining" for sales reasons? I recommend making this distinction to avoid further confusion. ML is a much more precise term. – Has QUIT--Anony-Mousse Dec 06 '11 at 12:09

1 Answer

Some of the ways in which they differ:

1) Some of them (regression, back-propagation neural networks) require "supervision", i.e. the training set must include both the inputs and the desired output. Other methods, such as clustering, can learn in "unsupervised" mode, meaning that given a mass of different cases they can divide them into sensible categories without being told the right answer.

2) Some of them are better at dealing with non-linear relationships than others. Knowledge of the basic physics of your system of interest can tell you whether this matters for your purpose.

3) Some of them scale better to very large systems than others; what works best for a few thousand cases may not scale to trillions of cases in an economical manner.
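
To make point 2 concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data (y = x² on a symmetric range), the helper names `fit_line`, `fit_tree` and `mse`, and the choice of a hand-rolled one-feature regression tree as the non-linear method. It compares training error only, so it shows how well each model family can represent the relationship, not how well it would generalize.

```python
# Illustrative sketch only: toy data and hand-rolled models, not a benchmark.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b, using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def sum_squared_error(values):
    """Sum of squared deviations from the mean of `values`."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(xs, ys, depth):
    """Tiny one-feature regression tree: greedy best split, constant leaves."""
    mean_y = sum(ys) / len(ys)
    distinct = sorted(set(xs))
    if depth == 0 or len(distinct) < 2:
        return lambda x: mean_y

    # Try every midpoint between neighbouring x values as a threshold.
    best_cost, best_t = None, None
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t

    left_pairs = [(x, y) for x, y in zip(xs, ys) if x <= best_t]
    right_pairs = [(x, y) for x, y in zip(xs, ys) if x > best_t]
    left_model = fit_tree([p[0] for p in left_pairs],
                          [p[1] for p in left_pairs], depth - 1)
    right_model = fit_tree([p[0] for p in right_pairs],
                           [p[1] for p in right_pairs], depth - 1)
    return lambda x: left_model(x) if x <= best_t else right_model(x)


def mse(model, xs, ys):
    """Mean squared error of `model` on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy non-linear relationship: y = x^2 for x in -3.0, -2.5, ..., 3.0.
xs = [i / 2 for i in range(-6, 7)]
ys = [x * x for x in xs]

line_model = fit_line(xs, ys)
tree_model = fit_tree(xs, ys, depth=3)

line_mse = mse(line_model, xs, ys)
tree_mse = mse(tree_model, xs, ys)
print("straight line, training MSE:", line_mse)
print("depth-3 tree,  training MSE:", tree_mse)
```

On this symmetric data the best straight line is flat, so it captures none of the curvature, while the tree's piecewise-constant fit follows the shape much more closely. If the underlying relationship were close to linear, the comparison could easily go the other way.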

So, perhaps you could add details such as whether you need supervised or unsupervised learning, how big the dataset is, and what kind of mathematical relationships you expect; then we could give you a better answer to this question.

rossdavidh