Highest Voted Questions - Statistical Analysis Stack Exchange

1229

votes

27 answers

Making sense of principal component analysis, eigenvectors & eigenvalues

In today's pattern recognition class my professor talked about PCA, eigenvectors and eigenvalues. I understood the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like a machine. But I didn't understand it. I didn't…

pca intuition eigenvalues canonical-question

asked Sep 15 '10 at 20:05

claws

12,575
3
15
10

734

votes

11 answers

How to choose the number of hidden layers and nodes in a feedforward neural network?

Is there a standard and accepted method for selecting the number of layers, and the number of nodes in each layer, in a feed-forward neural network? I'm interested in automated ways of building neural networks.

model-selection neural-networks

asked Jul 20 '10 at 00:15

Rob Hyndman

51,928
23
126
178

607

votes

12 answers

What is the difference between "likelihood" and "probability"?

The Wikipedia page claims that likelihood and probability are distinct concepts. In non-technical parlance, "likelihood" is usually a synonym for "probability," but in statistical usage there is a clear distinction in perspective: the number that…

probability terminology likelihood intuition

asked Sep 14 '10 at 03:24

Douglas S. Stones

6,931
4
16
18
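
One way to make the distinction in this question concrete is a minimal sketch using a binomial model. The model, the counts (7 successes in 10 trials), and the grid of candidate values are illustrative assumptions, not taken from the question. The same formula is read as a probability when the parameter is fixed and the outcome varies, and as a likelihood when the outcome is fixed and the parameter varies.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial formula: chance of k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 10  # illustrative number of trials (assumption)

# Probability view: p is fixed, the outcome k varies. The values sum to 1.
prob_total = sum(binom_pmf(k, n, 0.5) for k in range(n + 1))

# Likelihood view: the observed k is fixed, the parameter p varies.
# These values need not sum or integrate to 1 over p.
k_observed = 7  # illustrative observation (assumption)
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: binom_pmf(k_observed, n, p))

print(prob_total)  # close to 1
print(best_p)      # the grid value that maximizes the likelihood
```

On this grid the likelihood peaks at the observed proportion, 7/10.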

530

votes

11 answers

What is the difference between test set and validation set?

I found this confusing when I use the neural network toolbox in Matlab. It divided the raw data set into three parts: training set, validation set, and test set. I notice that in many training or learning algorithms, the data is often divided into 2 parts, the…

machine-learning validation

asked Nov 28 '11 at 11:05

xiaohan2012

6,819
5
18
18
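
The three-part division described in this question can be sketched as follows. This is a minimal illustration, not the behavior of the Matlab toolbox: the function name `three_way_split` and the 70/15/15 proportions are assumptions. In the usual convention, the training set fits the model, the validation set guides choices such as tuning or early stopping, and the test set is held back for a final performance estimate.

```python
import random

def three_way_split(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the rows and cut them into train, validation, and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed so the split is repeatable
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```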

527

votes

15 answers

What is the intuition behind beta distribution?

Disclaimer: I'm not a statistician but a software engineer. Most of my knowledge in statistics comes from self-education, thus I still have many gaps in understanding concepts that may seem trivial for other people here. So I would be very thankful…

distributions beta-distribution intuition beta-binomial-distribution

asked Jan 15 '13 at 15:31

ffriend

9,380
5
24
29

522

votes

23 answers

Why square the difference instead of taking the absolute value in standard deviation?

In the definition of standard deviation, why do we have to square the difference from the mean to get the mean (E) and take the square root back at the end? Can't we just simply take the absolute value of the difference instead and get the expected…

standard-deviation definition absolute-value faq

asked Jul 19 '10 at 21:04

c4il

5,465
4
16
9
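
To see the two candidate measures in this question side by side, here is a small sketch that computes both the standard deviation and the mean absolute deviation for the same data. The data values and function names are illustrative assumptions, and the population form (dividing by n) is used for simplicity.

```python
import math

def std_dev(xs):
    """Population standard deviation: square root of the mean squared deviation."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def mean_abs_dev(xs):
    """Mean absolute deviation: average distance from the mean."""
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values (assumption)
print(std_dev(data))       # 2.0
print(mean_abs_dev(data))  # 1.5
```

Squaring weights large deviations more heavily, which is why the standard deviation here is larger than the mean absolute deviation.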

516

votes

3 answers

Relationship between SVD and PCA. How to use SVD to perform PCA?

Principal component analysis (PCA) is usually explained via an eigen-decomposition of the covariance matrix. However, it can also be performed via singular value decomposition (SVD) of the data matrix $\mathbf X$. How does it work? What is the…

pca dimensionality-reduction matrix svd faq

asked Jan 20 '15 at 23:47

amoeba

93,463
28
275
317
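
The link this question asks about can be summarized in a short derivation. It assumes that $\mathbf X$ is an $n \times p$ data matrix whose columns have been centered; the symbols $\mathbf C$, $\mathbf U$, $\mathbf S$, $\mathbf V$, $s_i$ and $\lambda_i$ are notation introduced here, not taken from the excerpt.

$$
\mathbf C = \frac{\mathbf X^\top \mathbf X}{n-1},
\qquad
\mathbf X = \mathbf U \mathbf S \mathbf V^\top
\;\Longrightarrow\;
\mathbf C = \mathbf V \, \frac{\mathbf S^2}{n-1} \, \mathbf V^\top .
$$

Under these assumptions the columns of $\mathbf V$ are the principal axes, the eigenvalues of the covariance matrix are $\lambda_i = s_i^2/(n-1)$, and the principal component scores are $\mathbf X \mathbf V = \mathbf U \mathbf S$.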

478

votes

20 answers

The Two Cultures: statistics vs. machine learning?

Last year, I read a blog post from Brendan O'Connor entitled "Statistics vs. Machine Learning, fight!" that discussed some of the differences between the two fields. Andrew Gelman responded favorably to this: Simon Blomberg: From R's fortunes …

machine-learning pac-learning

asked Jul 19 '10 at 19:14

Shane

11,961
17
71
89

419

votes

5 answers

How to understand the drawbacks of K-means

K-means is a widely used method in cluster analysis. In my understanding, this method does NOT require ANY assumptions, i.e., give me a dataset and a pre-specified number of clusters, k, and I just apply this algorithm which minimizes the sum of…

machine-learning clustering data-mining k-means

asked Jan 16 '15 at 04:38

KevinKim

6,347
4
21
35

414

votes

14 answers

Bayesian and frequentist reasoning in plain English

How would you describe in plain English the characteristics that distinguish Bayesian from Frequentist reasoning?

bayesian frequentist

asked Jul 19 '10 at 19:25

Daniel Vassallo

4,249
3
15
7

393

votes

11 answers

Explaining to laypeople why bootstrapping works

I recently used bootstrapping to estimate confidence intervals for a project. Someone who doesn't know much about statistics recently asked me to explain why bootstrapping works, i.e., why is it that resampling the same sample over and over gives…

bootstrap intuition communication

asked Apr 08 '12 at 21:04

Alan H.

4,899
4
20
19
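
The resampling idea in this question can be sketched with a percentile bootstrap for the mean. This is one common variant, not necessarily the one the asker used; the sample values, the function name `bootstrap_ci`, and the choice of 2000 resamples are assumptions for illustration.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci(sample, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample with replacement, recompute the
    statistic each time, and read off the empirical quantiles."""
    rng = random.Random(seed)
    n = len(sample)
    stats = sorted(
        stat([rng.choice(sample) for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]  # illustrative data
low, high = bootstrap_ci(sample, mean)
print(low, high)
```

Each resample treats the observed sample as a stand-in for the population, so the spread of the recomputed means approximates the sampling variability of the mean.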

387

votes

18 answers

What happens if the explanatory and response variables are sorted independently before regression?

Suppose we have data set $(X_i,Y_i)$ with $n$ points. We want to perform a linear regression, but first we sort the $X_i$ values and the $Y_i$ values independently of each other, forming data set $(X_i,Y_j)$. Is there any meaningful interpretation…

regression correlation

asked Dec 07 '15 at 17:22

arbitrary user

3,541
3
9
8
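
A quick simulation shows what the procedure in this question does to data with no relationship at all. This is a sketch under stated assumptions: the two variables are independent standard normal draws, the sample size of 200 and the seed are arbitrary, and `pearson` is a helper written here for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [rng.gauss(0, 1) for _ in range(200)]  # generated independently of x

r_raw = pearson(x, y)                      # original pairing
r_sorted = pearson(sorted(x), sorted(y))   # pairing destroyed by sorting

print(r_raw, r_sorted)
```

Sorting each variable separately pairs the smallest $X_i$ with the smallest $Y_j$ and so on, which manufactures a strong positive correlation even though the original pairs were unrelated.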

380

votes

7 answers

When conducting multiple regression, when should you center your predictor variables & when should you standardize them?

In some literature, I have read that in a regression with multiple explanatory variables measured in different units, the variables need to be standardized. (Standardizing consists of subtracting the mean and dividing by the standard deviation.) In which other cases…

multiple-regression standardization centering

asked Jun 04 '12 at 16:32

mathieu_r

4,211
3
14
5
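
The two operations named in this question can be sketched in a few lines. Centering subtracts the mean; standardizing additionally divides by the standard deviation, as the excerpt defines it. The variable `heights_cm` and its values are illustrative assumptions, and the sample standard deviation (dividing by n - 1) is used here.

```python
from statistics import mean, stdev

def center(xs):
    """Subtract the mean so the values average to zero."""
    m = mean(xs)
    return [x - m for x in xs]

def standardize(xs):
    """Subtract the mean, then divide by the sample standard deviation."""
    s = stdev(xs)
    return [c / s for c in center(xs)]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # illustrative predictor
centered = center(heights_cm)
z = standardize(heights_cm)
print(centered)
print(z)
```

After standardizing, the predictor has mean 0 and standard deviation 1 regardless of its original units.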

376

votes

26 answers

Python as a statistics workbench

Lots of people use a main tool like Excel or another spreadsheet, SPSS, Stata, or R for their statistics needs. They might turn to some specific package for very special needs, but a lot of things can be done with a simple spreadsheet or a general…

r spss stata python

asked Aug 12 '10 at 10:46

Fabian Fagerholm

215
3
6
7

372

votes

9 answers

What is the difference between fixed effect, random effect and mixed effect models?

In simple terms, how would you explain (perhaps with simple examples) the difference between fixed effect, random effect and mixed effect models?

mixed-model random-effects-model definition fixed-effects-model

asked Nov 19 '10 at 00:03

Andrew

5,478
5
21
21
