Most Popular

1500 questions
72 votes, 7 answers

Do all interaction terms need their individual terms in a regression model?

I am actually reviewing a manuscript where the authors compare 5-6 logit regression models with AIC. However, some of the models have interaction terms without including the individual covariate terms. Does it ever make sense to do this? For example…
djhocking
72 votes, 11 answers

Is there any *mathematical* basis for the Bayesian vs frequentist debate?

It says on Wikipedia that: the mathematics [of probability] is largely independent of any interpretation of probability. Question: Then if we want to be mathematically correct, shouldn't we disallow any interpretation of probability? I.e., are…
72 votes, 5 answers

Intuition on the Kullback–Leibler (KL) Divergence

I have learned about the intuition behind the KL Divergence as how much a model distribution function differs from the theoretical/true distribution of the data. The source I am reading goes on to say that the intuitive understanding of 'distance'…
cgo
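The "distance" intuition in the excerpt above can be checked numerically. The following is a minimal sketch for discrete distributions, not taken from the question: the function name `kl_divergence` and the two example distributions are hypothetical, and it assumes the second distribution is nonzero wherever the first is.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions (illustration only).
true_dist = [0.5, 0.5]
model_dist = [0.9, 0.1]

kl_pq = kl_divergence(true_dist, model_dist)  # model judged against the true distribution
kl_qp = kl_divergence(model_dist, true_dist)  # arguments swapped

print(f"KL(true || model) = {kl_pq:.4f}")
print(f"KL(model || true) = {kl_qp:.4f}")
```

Swapping the arguments gives a different value, which is one reason the KL divergence is not a distance in the metric sense.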
72 votes, 6 answers

What are i.i.d. random variables?

How would you go about explaining i.i.d. (independent and identically distributed) to non-technical people?
user333
72 votes, 12 answers

Hold-out validation vs. cross-validation

To me, it seems that hold-out validation is useless. That is, splitting the original dataset into two parts (training and testing) and using the testing score as a generalization measure is somewhat useless. K-fold cross-validation seems to give…
user46925
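To make the two schemes in this question concrete, here is a rough sketch of how each one partitions row indices. The helper names `holdout_split` and `kfold_splits` are hypothetical, no shuffling is done, and real libraries offer more options; it is only meant to show that hold-out tests on one fixed slice while K-fold tests on every row exactly once.

```python
def holdout_split(n, test_fraction=0.25):
    """Single split: the first part trains, the last part tests."""
    cut = int(n * (1 - test_fraction))
    return list(range(cut)), list(range(cut, n))

def kfold_splits(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread the remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

train_idx, test_idx = holdout_split(10)
folds = list(kfold_splits(10, 3))

print("hold-out test rows:", test_idx)
for i, (train, test) in enumerate(folds):
    print(f"fold {i} test rows:", test)
```

With hold-out, only the rows in the single test slice ever contribute to the score; with K-fold, each row appears in exactly one test fold and the K scores are typically averaged.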
71 votes, 2 answers

Performance metrics to evaluate unsupervised learning

With respect to the unsupervised learning (like clustering), are there any metrics to evaluate performance?
user3125
71 votes, 2 answers

Removing duplicated rows data frame in R

How can I remove duplicate rows from this example data frame? (A, 1), (A, 1), (A, 2), (B, 4), (B, 1), (B, 1), (C, 2), (C, 2). I would like to remove the duplicates based on both the columns: (A, 1), (A, 2), (B, 4), (B, 1), (C, 2). Order is not important.
Jana
71 votes, 10 answers

How to interpret F-measure values?

I would like to know how to interpret a difference in F-measure values. I know that the F-measure is a balanced mean between precision and recall, but I am asking about the practical meaning of a difference in F-measures. For example, if a classifier C1…
AM2
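For reference, the "balanced mean" mentioned in this question is, for the usual F1 score, the harmonic mean of precision and recall. The sketch below uses a hypothetical helper `f_measure` and made-up precision and recall values; it is not taken from the question, whose own example is truncated.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives F1, the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # convention chosen here to avoid dividing by zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical classifiers with the same arithmetic mean of precision and recall.
f1_a = f_measure(0.8, 0.6)
f1_b = f_measure(0.7, 0.7)

print(f"F1 with precision 0.8, recall 0.6: {f1_a:.4f}")
print(f"F1 with precision 0.7, recall 0.7: {f1_b:.4f}")
```

Both pairs average to 0.7 arithmetically, but the harmonic mean penalizes the imbalanced pair, so a gap between two F-measures can reflect imbalance as well as overall quality.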
71 votes, 19 answers

What are some valuable Statistical Analysis open source projects?

What are some valuable Statistical Analysis open source projects available right now? Edit: as pointed out by Sharpie, valuable could mean helping you get things done faster or more cheaply.
grokus
71 votes, 5 answers

How can adding a 2nd IV make the 1st IV significant?

I have what is probably a simple question, but it is baffling me right now, so I am hoping you can help me out. I have a least squares regression model, with one independent variable and one dependent variable. The relationship is not significant.…
EvKohl
71 votes, 5 answers

Using principal component analysis (PCA) for feature selection

I'm new to feature selection and I was wondering how you would use PCA to perform feature selection. Does PCA compute a relative score for each input variable that you can use to filter out noninformative input variables? Basically, I want to be…
Michael
71 votes, 4 answers

What is translation invariance in computer vision and convolutional neural networks?

I don't have a computer vision background, yet when I read articles and papers on image processing and convolutional neural networks, I constantly come across the term translation invariance, or translation invariant. Or I read a lot that the…
71 votes, 3 answers

Is this the solution to the p-value problem?

In February 2016, the American Statistical Association released a formal statement on statistical significance and p-values. Our thread about it discusses these issues extensively. However, no authority has come forth to offer a universally…
whuber
71 votes, 4 answers

What's the difference between momentum based gradient descent and Nesterov's accelerated gradient descent?

So momentum based gradient descent works as follows: $v=\beta m-\eta g$ where $m$ is the previous weight update, and $g$ is the current gradient with respect to the parameters $p$, $\eta$ is the learning rate, and $\beta$ is a constant. $p_{new} = p…
applecider
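The momentum rule quoted in this question, $v=\beta m-\eta g$, can be sketched in a few lines. The excerpt cuts off at the parameter update, so the code below assumes the common form p_new = p + v, which the visible text does not confirm. The function name `momentum_step`, the test objective f(p) = p^2, and the values of the learning rate and beta are all hypothetical. Nesterov's variant is not shown because its description is not visible in the excerpt.

```python
def momentum_step(p, m, grad, lr=0.1, beta=0.9):
    """One momentum update for a scalar parameter.

    v = beta * m - lr * grad follows the formula quoted in the question.
    p + v is an ASSUMED parameter update; the excerpt is truncated there.
    """
    v = beta * m - lr * grad
    return p + v, v

# Hypothetical objective f(p) = p**2, whose gradient is 2 * p.
p, m = 1.0, 0.0
for _ in range(200):
    p, m = momentum_step(p, m, grad=2 * p)

print(f"parameter after 200 steps: {p:.2e}")
```

On this toy objective the iterates oscillate around the minimum at 0 while shrinking, which is the typical behaviour of a momentum term with these settings.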
71 votes, 4 answers

How to tune hyperparameters of xgboost trees?

I have class-imbalanced data and I want to tune the hyperparameters of the boosted trees using xgboost. Questions: Is there an equivalent of gridsearchcv or randomsearchcv for xgboost? If not, what is the recommended approach to tune the parameters…
GeorgeOfTheRF