Questions tagged [decision]

This tag is unclear; consider tagging with decision-theory instead, or find a more specific tag.

53 questions
19
votes
1 answer

What is the difference between the decision_function, predict_proba, and predict functions for a logistic regression problem?

I have been going through the sklearn documentation but I am not able to understand the purpose of these functions in the context of logistic regression. For decision_function it says that it's the distance between the hyperplane and the test…
Sameed
  • 415
  • 1
  • 4
  • 10
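For the question above, here is a minimal sketch of how the three methods relate, assuming scikit-learn's `LogisticRegression` with default settings on a synthetic binary dataset (the dataset and variable names are illustrative, not from the question). In the binary case, `decision_function` returns the signed linear score, `predict_proba` applies the logistic sigmoid to that score, and `predict` thresholds the score at 0, which is the same as a probability of 0.5.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem (illustrative assumption, not from the question).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

scores = clf.decision_function(X)    # signed linear score w.x + b
proba = clf.predict_proba(X)[:, 1]   # estimated P(y = 1 | x)
labels = clf.predict(X)              # hard class labels

# Recompute each quantity by hand to show the relationship.
manual_scores = X @ clf.coef_.ravel() + clf.intercept_[0]
sigmoid_of_scores = 1.0 / (1.0 + np.exp(-scores))
labels_from_scores = (scores > 0).astype(int)
```

Dividing `scores` by the norm of `clf.coef_` gives the signed geometric distance to the separating hyperplane; the raw score is only proportional to it.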
9
votes
1 answer

Decision rule as a hyper-parameter in LASSO

I have a question that is related to the following: Is decision threshold a hyperparameter in logistic regression? but would like some clarification. The general consensus is that the decision rule is not a hyper-parameter in the strictest sense…
astel
  • 1,388
  • 5
  • 17
4
votes
1 answer

Can we use any learners in gradient boosting instead of trees?

As we are simply trying to predict residuals from weak learners and aggregating them, can we use any weak learners in gradient boosting machines instead of trees? If so, why do all the GBM implementations, like xgboost and lightgbm, use trees?
tjt
  • 687
  • 4
  • 13
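To make the "fit residuals and aggregate" idea in the question above concrete, here is a minimal sketch of gradient boosting for squared loss with an arbitrary scikit-learn regressor as the base learner. The helper names `boost` and `boosted_predict`, the choice of `Ridge`, and the synthetic data are all assumptions made for illustration; this is not how xgboost or lightgbm are implemented.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge

def boost(base, X, y, n_rounds=50, lr=0.1):
    """Hypothetical helper: boosting for squared loss with any regressor."""
    init = y.mean()
    pred = np.full(len(y), init)
    learners = []
    for _ in range(n_rounds):
        residual = y - pred                    # negative gradient of squared loss
        learner = clone(base).fit(X, residual)
        pred += lr * learner.predict(X)        # shrunken additive update
        learners.append(learner)
    return init, learners

def boosted_predict(init, learners, X, lr=0.1):
    """Hypothetical helper: sum the shrunken predictions of all rounds."""
    return init + lr * sum(l.predict(X) for l in learners)

# Synthetic regression data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

init, learners = boost(Ridge(alpha=1.0), X, y)
train_mse = np.mean((y - boosted_predict(init, learners, X)) ** 2)
baseline_mse = np.mean((y - y.mean()) ** 2)
```

Note that a sum of linear models is still a linear model, so boosting a linear base learner cannot capture nonlinear structure or interactions; this is one commonly cited reason trees are the usual choice.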
3
votes
2 answers

Identify the confidence of the impact of a proposed improvement initiative

Assume I have a manufacturing process that involves a moving train. It has failures of certain types, such as brakes and steering, and also the weather. However, we cannot do anything about the weather. Furthermore, there are more brake failures…
3
votes
1 answer

Overview of the main methods to prune decision trees

Could someone explain the main pruning techniques for decision trees? Something like the three most common techniques with a short explanation of how they work. I have looked online but this, surprisingly, doesn't seem to have been covered anywhere. A…
Trajan
  • 369
  • 4
  • 17
3
votes
1 answer

Scikit's permuted features in decision tree implementation

In scikit-learn's documentation of decision trees I found a note: "The features are always randomly permuted at each split. Therefore, the best found split may vary, even with the same training data and max_features=n_features." However, according to…
PyFox
  • 133
  • 4
3
votes
1 answer

Advertisement decision making based on customer past behaviour

Problem description: Every 3 weeks a fashion company sends out an expensive booklet with descriptions of clothes to each customer on their electronic records. There exists a purchase history of what each customer bought in the 3 weeks after receiving…
MyCatsHat
  • 161
  • 8
2
votes
0 answers

Are binary splits always better than multi-way splits in decision trees?

I'm trying to devise a decision tree for classification with multi-way split at an attribute but even though calculating the entropy for a multi-way split gives better information gain than a binary split, the decision tree in code never tries to…
2
votes
0 answers

Determining the decision boundary for Naive Bayes

I'd like to know if this is a sensible idea and if there are any established methods to do this (I'm new to the data science area). Essentially, I have used Naive Bayes to accurately classify three types of food, based on their nutritional value…
Dan Savage
  • 23
  • 3
2
votes
0 answers

Calculating Probability for Decision Tree Model

I came across a calculation of probability for a decision tree model which I do not understand. As I plan to do CEA of some health interventions I would not like to mess it up. The method (calculation) used seems rather strange to me. Could it be…
user215953
  • 21
  • 1
2
votes
2 answers

Contributing predictors to a response variable

I have a dataset with two tables that look like the following:
District ID | Crime Rate | Violent Crime Rate | Annual Police Funding
97 | 437 | 148 | 36
96 | 819 | 369 | 30
83 | 799 | 693 | 35
81 | 548 | 226 | 31
74 | 432 | 98 | 23
71 | 989 | 1375 | 22
68 | 494…
1
vote
0 answers

What is the relation between Linear Classifier and Linear Decision Boundary (or Non Linear Decision Boundary)?

As we know (Wikipedia definition), a linear classifier makes a classification decision based on the linear combination of the feature vectors. Mathematically: $y = f(\sum w_i x_i)$. So, $f$ is our linear classifier (which may be logistic or any…
1
vote
0 answers

Cost complexity pruning decision trees

I am trying to understand cost complexity pruning in classification trees. I found that DecisionTree in sklearn has a function called cost_complexity_pruning_path, which gives the effective alphas of subtrees during pruning. What does effective…
Andreas Zaras
  • 741
  • 11
  • 21
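As a small illustration of the function mentioned in the question above, here is a sketch that calls `cost_complexity_pruning_path` and then refits one tree per returned alpha. The iris dataset and the variable names are assumptions chosen for the example. As described in the scikit-learn user guide, the effective alpha of an internal node $t$ is the value of $\alpha$ at which pruning the subtree $T_t$ rooted there becomes worthwhile, $\alpha_{eff}(t) = (R(t) - R(T_t)) / (|T_t| - 1)$, where $|T_t|$ is the number of leaves in the subtree.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset (assumption, not from the question).
X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
path = tree.cost_complexity_pruning_path(X, y)
ccp_alphas = path.ccp_alphas      # effective alphas, in increasing order
impurities = path.impurities      # total leaf impurity at each pruning step

# Refit with each alpha: larger alphas prune more, leaving fewer leaves.
leaf_counts = [
    DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    .fit(X, y)
    .get_n_leaves()
    for alpha in ccp_alphas
]
```

Each entry of `ccp_alphas` marks a point at which the weakest remaining subtree is collapsed, so plotting `leaf_counts` against `ccp_alphas` shows the tree shrinking step by step.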
1
vote
1 answer

Two questions about a decision tree algorithm I found online

I am trying to learn decision trees but it has been difficult because the examples are extremely long and tedious and everybody seems to have a different algorithm in mind. After some digging I found a reliable set of notes online. However, I have…
Norman
  • 297
  • 2
  • 11
1
vote
1 answer

How does a regression tree split the y variable

How does a regression tree split the y variable? Is it just a case of even chunking of the range or are the chunks of variable size?
Trajan
  • 369
  • 4
  • 17
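For the question above, a minimal sketch, assuming scikit-learn's `DecisionTreeRegressor` with its default squared-error criterion and made-up step-function data. A regression tree does not chunk the range of y directly: it chooses a threshold on a feature so that y is as homogeneous as possible within each child, which means the resulting groups can be of very different sizes.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Made-up data: y jumps from 0 to 10 at x = 0.2, so the groups are uneven.
x = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
y = np.where(x.ravel() < 0.2, 0.0, 10.0)

# A depth-1 tree makes a single split on x.
stump = DecisionTreeRegressor(max_depth=1).fit(x, y)
threshold = stump.tree_.threshold[0]   # split point chosen at the root

# Count how many training points fall in each leaf.
leaf_ids = stump.apply(x)
leaf_sizes = np.bincount(leaf_ids)[np.unique(leaf_ids)]
```

Here the split lands near the jump in y rather than at the midpoint of the x range, and the two leaves hold 20 and 80 points respectively.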