Highest Voted Questions - Statistical Analysis Stack Exchange

72

votes

7 answers

Do all interaction terms need their individual terms in a regression model?

I am actually reviewing a manuscript where the authors compare 5-6 logit regression models with AIC. However, some of the models have interaction terms without including the individual covariate terms. Does it ever make sense to do this? For example…

regression modeling interaction aic

asked May 04 '12 at 02:10

djhocking

1,701
3
17
21

72

votes

11 answers

Is there any mathematical basis for the Bayesian vs frequentist debate?

It says on Wikipedia that: the mathematics [of probability] is largely independent of any interpretation of probability. Question: Then if we want to be mathematically correct, shouldn't we disallow any interpretation of probability? I.e., are…

probability bayesian frequentist philosophical kolmogorov-axioms

asked Aug 18 '16 at 03:40

Chill2Macht

5,639
4
25
51

72

votes

5 answers

Intuition on the Kullback–Leibler (KL) Divergence

I have learned about the intuition behind the KL Divergence as how much a model distribution function differs from the theoretical/true distribution of the data. The source I am reading goes on to say that the intuitive understanding of 'distance'…

distributions distance intuition kullback-leibler

asked Jan 01 '16 at 17:03

cgo

7,445
10
42
61

72

votes

6 answers

What are i.i.d. random variables?

How would you go about explaining i.i.d. (independent and identically distributed) to non-technical people?

random-variable intuition iid

asked Feb 07 '11 at 13:59

user333

6,621
17
44
54

72

votes

12 answers

Hold-out validation vs. cross-validation

To me, it seems that hold-out validation is useless. That is, splitting the original dataset into two parts (training and testing) and using the testing score as a generalization measure is somewhat useless. K-fold cross-validation seems to give…

machine-learning cross-validation validation

asked Jun 25 '14 at 13:41

user46925

71

votes

2 answers

Performance metrics to evaluate unsupervised learning

With respect to unsupervised learning (such as clustering), are there any metrics to evaluate performance?

machine-learning clustering data-mining unsupervised-learning

asked Dec 09 '13 at 03:00

user3125

2,617
4
25
33

71

votes

2 answers

Removing duplicated rows from a data frame in R

How can I remove duplicate rows from this example data frame?

A 1
A 1
A 2
B 4
B 1
B 1
C 2
C 2

I would like to remove the duplicates based on both the columns:

A 1
A 2
B 4
B 1
C 2

Order is not important.

r

asked Jan 31 '11 at 19:58

Jana

969
1
8
13

71

votes

10 answers

How to interpret F-measure values?

I would like to know how to interpret a difference of f-measure values. I know that f-measure is a balanced mean between precision and recall, but I am asking about the practical meaning of a difference in F-measures. For example, if a classifier C1…

classification precision-recall

asked Feb 04 '13 at 11:38

AM2

1,237
2
11
10
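
As a rough illustration of the "balanced mean" mentioned in the question above: the F-measure is usually defined as the weighted harmonic mean of precision and recall. The sketch below assumes that standard definition; the function name `f_measure` and the sample values are made up for illustration and do not come from the question.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        # Convention chosen here: return 0 rather than divide by zero.
        return 0.0
    b2 = beta ** 2
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical classifiers: equal arithmetic mean of precision and recall,
# but the harmonic mean penalizes the unbalanced one.
balanced = f_measure(0.75, 0.75)
unbalanced = f_measure(0.5, 1.0)
print(balanced, unbalanced)
```

Because the harmonic mean is dominated by the smaller of the two inputs, two classifiers with the same average precision and recall can still differ in F-measure, which is one reason a raw difference in F values is hard to interpret without looking at precision and recall separately.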

71

votes

19 answers

What are some valuable Statistical Analysis open source projects?

What are some valuable Statistical Analysis open source projects available right now? Edit: as pointed out by Sharpie, valuable could mean helping you get things done faster or more cheaply.

software open-source

asked Jul 19 '10 at 19:13

grokus

233
5
10

71

votes

5 answers

How can adding a 2nd IV make the 1st IV significant?

I have what is probably a simple question, but it is baffling me right now, so I am hoping you can help me out. I have a least squares regression model, with one independent variable and one dependent variable. The relationship is not significant.…

regression multiple-regression statistical-power suppressor

asked May 14 '12 at 18:02

EvKohl

1,090
2
10
14

71

votes

5 answers

Using principal component analysis (PCA) for feature selection

I'm new to feature selection and I was wondering how you would use PCA to perform feature selection. Does PCA compute a relative score for each input variable that you can use to filter out noninformative input variables? Basically, I want to be…

pca feature-selection

asked Apr 28 '12 at 15:39

Michael

2,180
4
23
32

71

votes

4 answers

What is translation invariance in computer vision and convolutional neural networks?

I don't have a computer vision background, yet when I read articles and papers related to image processing and convolutional neural networks, I constantly come across the term translation invariance, or translation invariant. Or I read a lot that the…

machine-learning conv-neural-network convolution computer-vision

asked Apr 23 '16 at 15:30

Hossein

2,005
3
18
32

71

votes

3 answers

Is this the solution to the p-value problem?

In February 2016, the American Statistical Association released a formal statement on statistical significance and p-values. Our thread about it discusses these issues extensively. However, no authority has come forth to offer a universally…

hypothesis-testing statistical-significance p-value

asked Mar 31 '16 at 22:14

whuber

281,159
54
637
1,101

71

votes

4 answers

What's the difference between momentum based gradient descent and Nesterov's accelerated gradient descent?

So momentum based gradient descent works as follows: $v=\beta m-\eta g$ where $m$ is the previous weight update, and $g$ is the current gradient with respect to the parameters $p$, $\eta$ is the learning rate, and $\beta$ is a constant. $p_{new} = p…

optimization gradient-descent

asked Nov 03 '15 at 06:51

applecider

1,175
2
11
13
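
The momentum update quoted in the question above, $v=\beta m-\eta g$, can be sketched in a few lines. The excerpt is cut off before the parameter update, so the step `p + v` below is an assumption based on the usual form of momentum, not a quotation from the question. The function names, the toy objective, and the values of `eta` and `beta` are also illustrative choices.

```python
def momentum_step(p, m, g, eta=0.1, beta=0.9):
    """One momentum update: v = beta * m - eta * g, then move p by v.

    p: current parameter, m: previous update, g: gradient at p.
    The `p + v` step is the conventional form (assumed here).
    """
    v = beta * m - eta * g
    return p + v, v


def minimize(grad_fn, p0, steps=500, eta=0.1, beta=0.9):
    """Run `steps` momentum updates starting from p0 with zero initial velocity."""
    p, m = p0, 0.0
    for _ in range(steps):
        p, m = momentum_step(p, m, grad_fn(p), eta, beta)
    return p


# Toy objective f(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
p_final = minimize(lambda p: 2.0 * (p - 3.0), p0=0.0)
print(p_final)
```

Nesterov's variant differs in where the gradient is evaluated; this sketch covers only the plain momentum form described in the excerpt.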

71

votes

4 answers

How to tune hyperparameters of xgboost trees?

I have class-imbalanced data and I want to tune the hyperparameters of the boosted trees using xgboost. Questions: Is there an equivalent of gridsearchcv or randomsearchcv for xgboost? If not, what is the recommended approach to tune the parameters…

machine-learning cross-validation boosting

asked Sep 04 '15 at 02:23

GeorgeOfTheRF

5,063
14
42
51
