Highest Voted Questions - Statistical Analysis Stack Exchange

76

votes

8 answers

How and why do normalization and feature scaling work?

I see that lots of machine learning algorithms work better with mean cancellation and covariance equalization. For example, Neural Networks tend to converge faster, and K-Means generally gives better clustering with pre-processed features. I do not…

machine-learning neural-networks covariance normalization

asked Nov 01 '12 at 20:20

erogol

1,427
1
15
26

76

votes

4 answers

Why does including latitude and longitude in a GAM account for spatial autocorrelation?

I have produced generalized additive models for deforestation. To account for spatial autocorrelation, I have included latitude and longitude as a smoothed interaction term (i.e. s(x,y)). I've based this on reading many papers where the authors say…

r modeling spatial autocorrelation generalized-additive-model

asked Sep 01 '12 at 14:00

gisol

943
1
8
10

76

votes

2 answers

Multivariate multiple regression in R

I have 2 dependent variables (DVs) each of whose score may be influenced by the set of 7 independent variables (IVs). DVs are continuous, while the set of IVs consists of a mix of continuous and binary coded variables. (In code below continuous…

r multivariate-analysis manova multiple-regression multivariate-regression

asked May 22 '11 at 18:33

Andrej

2,131
2
18
26

76

votes

1 answer

Understanding ROC curve

I'm having trouble understanding the ROC curve. Is there any advantage / improvement in area under the ROC curve if I build different models from each unique subset of the training set and use it to produce a probability? For example, if $y$ has…

r roc

asked Jul 02 '14 at 07:18

Tay Shin

965
2
7
10

75

votes

1 answer

How to split the dataset for cross validation, learning curve, and final evaluation?

What is an appropriate strategy for splitting the dataset? I ask for feedback on the following approach (not on the individual parameters like test_size or n_iter, but if I used X, y, X_train, y_train, X_test, and y_test appropriately and if the…

machine-learning cross-validation python scikit-learn

asked Apr 30 '14 at 10:44

tobip

1,450
4
14
11

75

votes

4 answers

How should tiny $p$-values be reported? (and why does R put a minimum on 2.22e-16?)

For some tests in R, there is a lower limit on the p-value calculations of $2.22 \cdot 10^{-16}$. I'm not sure why it's this number, if there is a good reason for it or if it's just arbitrary. A lot of other stats packages just go to 0.0001, so this…

r p-value reporting precision

asked Dec 07 '13 at 07:06

paul

1,342
3
11
16
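
For context on the figure quoted in the question above: 2.22e-16 matches the machine epsilon of IEEE 754 double-precision floats, which is one plausible reading of where the limit comes from (the excerpt itself does not confirm this). Below is a minimal sketch in Python, used here only for illustration since the question is about R. The helper `format_p_value` and its `floor` parameter are hypothetical names, not part of any library.

```python
import sys

# Machine epsilon for IEEE 754 doubles: the gap between 1.0 and the
# next representable floating-point number (2**-52).
eps = sys.float_info.epsilon


def format_p_value(p, floor=eps):
    """Hypothetical helper: report p-values below `floor` as '< floor'."""
    if p < floor:
        return f"< {floor:.2e}"
    return f"{p:.3g}"


print(eps)
print(format_p_value(1e-20))
print(format_p_value(0.0123))
```

Reporting a bound rather than a raw tiny number is one common convention; whether it is the right choice depends on the venue and the test.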

75

votes

4 answers

What is the difference between Cross-entropy and KL divergence?

Both the cross-entropy and the KL divergence are tools to measure the distance between two probability distributions, but what is the difference between them? $$ H(P,Q) = -\sum_x P(x)\log Q(x) $$ $$ KL(P \| Q) = \sum_{x} P(x)\log {\frac{P(x)}{Q(x)}}…

entropy kullback-leibler cross-entropy

asked Jul 19 '18 at 13:02

yoyo

979
1
6
9
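
The two formulas in the question above can be compared numerically. The sketch below is illustrative only: the distributions `p` and `q` are made-up values, and it assumes `q` is positive wherever `p` is. It checks the standard identity that the cross-entropy equals the entropy of P plus the KL divergence.

```python
import math


def entropy(p):
    # H(P) = -sum_x P(x) log P(x); terms with P(x) = 0 contribute nothing.
    return -sum(px * math.log(px) for px in p if px > 0)


def cross_entropy(p, q):
    # H(P, Q) = -sum_x P(x) log Q(x); assumes Q(x) > 0 wherever P(x) > 0.
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)


def kl_divergence(p, q):
    # KL(P || Q) = sum_x P(x) log(P(x) / Q(x)); same assumption on Q.
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)


# Made-up example distributions over three outcomes.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]

print("H(P)       =", entropy(p))
print("H(P, Q)    =", cross_entropy(p, q))
print("KL(P || Q) =", kl_divergence(p, q))
print("identity holds:",
      math.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q)))
```

Since H(P) does not depend on Q, minimizing the cross-entropy over Q is equivalent to minimizing the KL divergence over Q for a fixed P.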

75

votes

6 answers

What is an intuitive explanation for how PCA turns from a geometric problem (with distances) to a linear algebra problem (with eigenvectors)?

I've read a lot about PCA, including various tutorials and questions (such as this one, this one, this one, and this one). The geometric problem that PCA is trying to optimize is clear to me: PCA tries to find the first principal component by…

pca optimization linear-algebra intuition

asked Jun 08 '16 at 22:20

stackoverflowuser2010

3,190
5
27
35

75

votes

6 answers

What method can be used to detect seasonality in data?

I want to detect seasonality in data that I receive. I have found some methods, such as the seasonal subseries plot and the autocorrelation plot, but I don't understand how to read the graphs. Could anyone help? The other…

time-series seasonality

asked Sep 27 '11 at 15:00

Danial

751
1
6
3

74

votes

2 answers

Practical questions on tuning Random Forests

My questions are about Random Forests. The concept of this beautiful classifier is clear to me, but still there are a lot of practical usage questions. Unfortunately, I failed to find any practical guide to RF (I've been searching for something like…

random-forest cart

asked Mar 25 '13 at 15:53

lithuak

993
1
8
8

74

votes

5 answers

Understanding stratified cross-validation

I read in Wikipedia: In stratified k-fold cross-validation, the folds are selected so that the mean response value is approximately equal in all the folds. In the case of a dichotomous classification, this means that each fold contains roughly…

cross-validation stratification

asked Feb 07 '13 at 20:58

Amelio Vazquez-Reina

17,546
26
74
110

74

votes

6 answers

Model for predicting number of YouTube views of Gangnam Style

PSY's music video "Gangnam Style" is popular: after a little more than 2 months it has about 540 million viewers. I learned this from my preteen children at dinner last week, and soon the discussion turned to whether it was possible to do…

modeling internet

asked Oct 27 '12 at 05:49

FredrikD

843
7
15

74

votes

12 answers

What are some of the most common misconceptions about linear regression?

I'm curious, for those of you who have extensive experience collaborating with other researchers, what are some of the most common misconceptions about linear regression that you encounter? I think it can be a useful exercise to think about common…

regression multiple-regression

asked Jun 09 '16 at 19:10

ST21

155
4
10

74

votes

4 answers

What makes the Gaussian kernel so magical for PCA, and also in general?

I was reading about kernel PCA (1, 2, 3) with Gaussian and polynomial kernels. How does the Gaussian kernel separate seemingly any sort of nonlinear data exceptionally well? Please give an intuitive analysis, as well as a mathematically involved…

machine-learning pca svm kernel-trick

asked Jan 02 '15 at 08:18

Simon Kuang

2,051
3
17
18

74

votes

5 answers

Unified view on shrinkage: what is the relation (if any) between Stein's paradox, ridge regression, and random effects in mixed models?

Consider the following three phenomena. Stein's paradox: given some data from a multivariate normal distribution in $\mathbb R^n, \: n\ge 3$, the sample mean is not a very good estimator of the true mean. One can obtain an estimator with lower mean…

regression mixed-model ridge-regression regularization steins-phenomenon

asked Oct 30 '14 at 15:08

amoeba

93,463
28
275
317
