I'm trying to use the step() function in R for variable selection in a logistic regression model. My model looks like this:

#I have some code first that randomly samples part of iris.

step_mod <- step(glm(Species~., iris_sample, family = binomial), 
                 direction = "both")

The step() function picks the combination of variables that minimizes AIC. However, I care much more about the percentage of correct predictions for one particular species (its "sensitivity") in the final predictions than I do about the AIC.

Is there a way to change the method of selection to "sensitivity" in the step() function? Or is there a different function/package that you can use where you can change the selection criterion?

Here's the code I'm using to make predictions after I get the model, since it might be helpful:

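library(caret)  # provides confusionMatrix()

# Predicted probability of the second factor level (versicolor)
probs <- predict(step_mod, newdata = iris_unused, type = "response")
class_predictions <- ifelse(probs < 0.50, "setosa", "versicolor")
final_check <- cbind(iris_unused, class_predictions)
# positive = "setosa" makes the reported sensitivity refer to setosa
tablecheck <- confusionMatrix(data = as.factor(final_check$class_predictions),
                              reference = as.factor(final_check$Species),
                              positive = "setosa")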

In the resulting confusion matrix, the proportion of setosa correct is the number I'd like to maximize when doing model selection.

kjetil b halvorsen
pfadenhw

1 Answer

Asking how to do this in R is off-topic here, but the way to do it in R is probably to program it yourself, since what you propose is not considered a good method. To see why, look through the answers by user Frank Harrell ...
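For anyone who still wants to program it for comparison purposes, here is a minimal sketch of a greedy forward selection that scores candidate models by held-out sensitivity instead of AIC. It is only an illustration under stated assumptions, not a recommended method: the helper names sensitivity_of() and forward_by_sensitivity() are hypothetical, the 70/30 split and 0.5 cutoff are arbitrary, and versicolor vs. virginica is used instead of the question's setosa vs. versicolor because setosa is perfectly separable.

```r
set.seed(1)

# Two-class data. Assumption: versicolor vs. virginica replaces the
# question's setosa vs. versicolor, since setosa is perfectly separable
# and glm() would warn about fitted probabilities of 0 or 1.
dat <- droplevels(subset(iris, Species != "setosa"))
train_idx <- sample(nrow(dat), 70)
train <- dat[train_idx, ]
valid <- dat[-train_idx, ]

target <- "versicolor"  # class whose sensitivity is tracked (first factor level)

# Share of true `target` rows predicted as `target` at a 0.5 cutoff.
sensitivity_of <- function(fit, newdata, target) {
  probs <- predict(fit, newdata = newdata, type = "response")
  lv <- levels(newdata$Species)
  # glm() models P(second level), so a low probability means the first level
  pred <- ifelse(probs < 0.5, lv[1], lv[2])
  mean(pred[newdata$Species == target] == target)
}

# Greedy forward selection: add the predictor that raises validation
# sensitivity the most; stop when no remaining predictor improves it.
forward_by_sensitivity <- function(train, valid, target) {
  candidates <- setdiff(names(train), "Species")
  selected <- character(0)
  best <- -Inf
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    scores <- sapply(remaining, function(v) {
      f <- reformulate(c(selected, v), response = "Species")
      fit <- glm(f, data = train, family = binomial)
      sensitivity_of(fit, valid, target)
    })
    if (max(scores) <= best) break
    best <- max(scores)
    selected <- c(selected, names(which.max(scores)))
  }
  list(variables = selected, sensitivity = best)
}

result <- forward_by_sensitivity(train, valid, target)
print(result)
```

Because the score is computed on a single validation split with a fixed cutoff, ties are common and the chosen variables can change with the random seed, which is part of why this criterion is discouraged.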

You propose to use accuracy to choose models. Accuracy is not a proper scoring rule; see "Is accuracy an improper scoring rule in a binary classification setting?". As for stepwise methods in general, they are much disliked around here; see "Are there any circumstances where stepwise regression should be used?", the advice from Stata people, and "What are modern, easily used alternatives to stepwise regression?".
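To make the scoring-rule point concrete, here is a small sketch in base R. The labels and the two sets of predicted probabilities are made up for illustration, and the function names are hypothetical. Both sets classify every case correctly at a 0.5 cutoff, so accuracy cannot tell them apart, while the Brier score and log loss (both proper scoring rules) favor the better-separated probabilities.

```r
# True labels (1 = event) and two hypothetical sets of predicted probabilities.
y           <- c(1, 1, 1, 0, 0, 0)
p_confident <- c(0.95, 0.90, 0.85, 0.10, 0.05, 0.15)
p_vague     <- c(0.55, 0.60, 0.51, 0.45, 0.49, 0.40)

# Accuracy only looks at which side of the cutoff each probability falls on.
accuracy <- function(y, p, cutoff = 0.5) mean((p >= cutoff) == (y == 1))
# Brier score: mean squared difference between probability and outcome.
brier <- function(y, p) mean((p - y)^2)
# Log loss: negative mean Bernoulli log-likelihood.
log_loss <- function(y, p) -mean(y * log(p) + (1 - y) * log(1 - p))

score_all <- function(y, p) {
  c(accuracy = accuracy(y, p), brier = brier(y, p), log_loss = log_loss(y, p))
}

scores <- rbind(confident = score_all(y, p_confident),
                vague     = score_all(y, p_vague))
print(round(scores, 3))
```

Lower is better for the Brier score and log loss, so a selection criterion built on either one rewards well-calibrated probabilities and not only correct classifications at one threshold.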

kjetil b halvorsen
  • Thanks, and sorry for the off topic post. I should clarify that I'm not only using stepwise selection for my analyses, but I wanted to compare the results of stepwise to some of the other methods that are referenced in the links you provided. – pfadenhw Jan 29 '21 at 14:35