Machine learning framework for SVM, Random Forest

Question

I need a library, or something ready-made, for the SVM and Random Forest algorithms. Can you give me some ideas? I don't have experience and I don't know what to choose.

The constraints of my classification problem are: 27 dimensions, 9 classes, 50,000 entries in the training set, and 150,000 in the test set.

The e1071, randomForest, and caret packages in R are pretty easy to use. The difficult part, whichever library you use, is tuning the model parameters. – D.Castro, Apr 11 '15 at 09:01

score 2 · Answer 1 · edited May 23 '17 at 12:39

I, too, would suggest the 'caret' package in R.

You can build many models and compare their performance:

http://topepo.github.io/caret/training.html
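
As a rough sketch of what that comparison can look like (not a tuned solution): the snippet below fits a random forest and an RBF-kernel SVM through caret's `train()` and compares their cross-validated accuracy with `resamples()`. The built-in `iris` data is only a stand-in for the asker's 27-dimensional, 9-class data, and the 5-fold cross-validation, the seed, and `tuneLength = 3` are arbitrary assumptions chosen for illustration. It assumes the caret, randomForest, and kernlab packages are installed.

```r
# Sketch: compare a random forest and an RBF-kernel SVM with caret.
# Needs the caret, randomForest (method = "rf") and kernlab
# (method = "svmRadial") packages.
library(caret)

set.seed(42)  # arbitrary seed, for reproducibility only

# Stand-in data (assumption): iris has 4 predictors and 3 classes,
# not the 27 predictors and 9 classes from the question.
x <- iris[, 1:4]
y <- iris$Species

# Fix the cross-validation folds so both models see the same splits.
folds <- createFolds(y, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", number = 5, index = folds)

# Random forest: caret tries 3 candidate values of mtry.
rf_fit <- train(x, y, method = "rf",
                trControl = ctrl, tuneLength = 3)

# SVM with radial kernel: predictors are centered and scaled first;
# caret tries 3 candidate values of the cost parameter C.
svm_fit <- train(x, y, method = "svmRadial",
                 preProcess = c("center", "scale"),
                 trControl = ctrl, tuneLength = 3)

# Side-by-side summary of the resampled accuracy and kappa.
results <- resamples(list(RF = rf_fit, SVM = svm_fit))
print(summary(results))
```

On the real data, the same calls would take the asker's predictor matrix and class labels in place of `x` and `y`. With 50,000 training rows the SVM fit can be slow, so a smaller `tuneLength` or a subsample may be a sensible first pass.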

By the way, the ratio of training set to test set is usually higher than the one you have.

Have a look at this discussion: https://stackoverflow.com/questions/13610074/is-there-a-rule-of-thumb-for-how-to-divide-a-dataset-into-training-and-validatio
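
For concreteness, the split described in the question works out as follows. This is plain arithmetic on the numbers given there; the symbol $n_{test}$ is introduced here only for the calculation:

$$\frac{n_{train}}{n_{train} + n_{test}} = \frac{50000}{50000 + 150000} = 0.25$$

So 25% of the cases are used for training, a train:test ratio of 1:3.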

answered Apr 11 '15 at 15:19

user3875022

But few people have the luxury of such a large data set. $n_{train} = 50000 : p = 27$ is a pretty comfortable training set size, and 1.5e5 test cases is nice as well (assuming these are independent cases...) – cbeleites unhappy with SX Apr 11 '15 at 16:42
I understand your point, but in the machine learning community, usually the training set is suggested to be larger than the test set, not the other way around. – user3875022 Apr 12 '15 at 18:14
There are several discussions on this point: http://stats.stackexchange.com/questions/23331/why-is-there-an-asymmetry-between-the-training-step-and-evaluation-step – user3875022 Apr 12 '15 at 18:15
Also see this: http://www.quora.com/Can-the-validation-set-be-larger-than-the-training-set – user3875022 Apr 12 '15 at 18:15
