Questions tagged [caret]

Caret is an R package containing a set of functions that attempt to streamline the process of creating predictive models.

Caret is an R package whose name is short for Classification And REgression Training. It is a set of functions that attempt to streamline the process of creating predictive models, providing a standardized interface to several of R's machine-learning packages along with utilities for working with training data and for plotting.
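
As a minimal sketch of what the standardized interface looks like in practice, the snippet below fits two different model types through the same train() call. The built-in iris data set, the two methods chosen ("knn" and "rpart"), and all variable names are illustrative assumptions, not part of the tag description; it also assumes the caret and rpart packages are installed.

```r
library(caret)

set.seed(42)  # make the resampling reproducible

# 5-fold cross-validation settings, reused for both models
ctrl <- trainControl(method = "cv", number = 5)

# Same train() call for both models; only the `method` string changes
fit_knn  <- train(Species ~ ., data = iris, method = "knn",   trControl = ctrl)
fit_tree <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)

# Both results are `train` objects, so predict() is called the same way
pred_knn  <- predict(fit_knn,  newdata = head(iris))
pred_tree <- predict(fit_tree, newdata = head(iris))
```

Printing either fitted object (for example, print(fit_knn)) shows the resampled performance for each candidate tuning value, whichever underlying package did the fitting.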

The official description of the package is:

Misc functions for training and plotting classification and regression models.

More information about caret is available at RDocumentation and at RBloggers. Stack Overflow also has a longer list of material and resources.

Important note: The Stack Overflow tag for the R package caret is r-caret, while the Cross Validated tag is caret.

485 questions
34 votes, 3 answers

In caret what is the real difference between cv and repeatedcv?

This is similar to question Caret re-sampling methods, although that really never answered this part of the question in an agreed upon way. caret's train function offers cv and repeatedcv. What is the difference in say…
Brian Feeny
30 votes, 3 answers

R: Random Forest throwing NaN/Inf in "foreign function call" error despite no NaN's in dataset

I'm using caret to run a cross validated random forest over a dataset. The Y variable is a factor. There are no NaN's, Inf's, or NA's in my dataset. However when running the random forest, I get Error in randomForest.default(m, y, ...) : …
Info5ek
30 votes, 3 answers

R caret and NAs

I very much prefer caret for its parameter tuning ability and uniform interface, but I have observed that it always requires complete datasets (i.e. without NAs) even if the applied "naked" model allows NAs. That is very bothersome, regarding that…
Fredrik
24 votes, 2 answers

Does caret train function for glmnet cross-validate for both alpha and lambda?

Does the R caret package cross-validate over both alpha and lambda for the glmnet model? Running this code, eGrid <- expand.grid(.alpha = (1:10) * 0.1, .lambda = (1:10) * 0.1) Control <- trainControl(method =…
mrquestion
24 votes, 4 answers

Caret and randomForest number of trees

I am puzzled as to why the caret package in R does not allow tuning on the number of trees (ntree) in a random forest (specifically in the randomForest package)? I can't imagine this is an oversight on the part of the package author - so there must…
B_Miner
23 votes, 1 answer

Caret and coefficients (glmnet)

I am interested in utilizing caret for making inferences on a particular data set. Is it possible to do the following: produce coefficients of a glmnet model I trained in caret. I would like to use glmnet because of the inherent feature selection…
user2300643
23 votes, 2 answers

Caret re-sampling methods

I am using the library caret in R to test various modelling procedures. The trainControl object allows one to specify a re-sampling method. The methods are described in the documentation section 2.3 and include: boot, boot632, cv, LOOCV, LGOCV,…
Ram Ahluwalia
21 votes, 3 answers

Stacking/ensembling models with caret

I often find myself training several different predictive models using caret in R. I'll train them all on the same cross validation folds, using caret:::createFolds, then choose the best model based on cross-validated error. However, the median…
Zach
21 votes, 7 answers

Overfitting: No silver bullet?

My understanding is that even when following proper cross validation and model selection procedures, overfitting will happen if one searches for a model hard enough, unless one imposes restrictions on model complexity, period. Moreover, often times…
20 votes, 2 answers

PCA and k-fold cross-validation in caret package in R

I just re-watched a lecture from the Machine Learning course on Coursera. In the section where the professor discusses PCA for pre-processing data in supervised learning applications, he says PCA should only be performed on the training data and…
mchangun
18 votes, 0 answers

R - how to let glmnet choose lambda range when using caret?

To fit a lasso model using glmnet, you can simply do the following and glmnet will automatically calculate a reasonable range of lambda values appropriate for the data set: glmnet(x, y, alpha = 1) I know I can also do cross validation natively…
user13587
17 votes, 1 answer

Caret - Repeated K-fold cross-validation vs Nested K-fold cross validation, repeated n-times

The caret package is a brilliant R library for building multiple machine learning models, and has several functions for model building and evaluation. For parameter tuning and model training, the caret package offers ‘repeatedcv’ as one of the…
Mani
16 votes, 1 answer

Caret glmnet vs cv.glmnet

There seems to be a lot of confusion in the comparison of using glmnet within caret to search for an optimal lambda and using cv.glmnet to do the same task. Many questions were posed, e.g.: Classification model train.glmnet vs. cv.glmnet? What is…
Jogi
16 votes, 4 answers

Gradient boosting machine accuracy decreases as number of iterations increases

I'm experimenting with the gradient boosting machine algorithm via the caret package in R. Using a small college admissions dataset, I ran the following code: library(caret) ### Load admissions dataset. ### mydata <-…
RobertF
16 votes, 2 answers

Using the caret package is it possible to obtain confusion matrices for specific threshold values?

I've obtained a logistic regression model (via train) for a binary response, and I've obtained the logistic confusion matrix via confusionMatrix in caret. It gives me the logistic model confusion matrix, though I'm not sure what threshold is being…
Black Milk