Highest Voted Questions - Statistical Analysis Stack Exchange

85

votes

7 answers

What are the 'big problems' in statistics?

Mathematics has its famous Millennium Problems (and, historically, Hilbert's 23), questions that helped to shape the direction of the field. I have little idea, though, what the Riemann Hypotheses and P vs. NP's of statistics would be. So, what are…

history

asked Sep 05 '10 at 04:16

raegtin

9,090
12
48
53

85

votes

5 answers

Cross-Validation in plain English?

How would you describe cross-validation to someone without a data analysis background?

cross-validation intuition

asked Aug 18 '10 at 13:11

Shane

11,961
17
71
89

85

votes

11 answers

What is the best way to remember the difference between sensitivity, specificity, precision, accuracy, and recall?

Despite having seen these terms 502847894789 times, I cannot for the life of me remember the difference between sensitivity, specificity, precision, accuracy, and recall. They're pretty simple concepts, but the names are highly unintuitive to me,…

terminology accuracy sensitivity-specificity

asked Oct 31 '14 at 19:14

Jessica

1,781
2
15
17

85

votes

14 answers

Why haven't robust (and resistant) statistics replaced classical techniques?

When solving business problems using data, it's common that at least one key assumption that underpins classical statistics is invalid. Most of the time, no one bothers to check those assumptions so you never actually know. For instance, that so…

model-selection nonparametric outliers robust philosophical

asked Aug 03 '10 at 07:49

doug

9,901
1
22
26

84

votes

6 answers

Why does k-means clustering algorithm use only Euclidean distance metric?

Is there a specific purpose in terms of efficiency or functionality why the k-means algorithm does not use for example cosine (dis)similarity as a distance metric, but can only use the Euclidean norm? In general, will K-means method comply and be…

clustering k-means distance-functions euclidean

asked Jan 07 '14 at 11:53

curious

971
1
7
7

84

votes

9 answers

Mathematician wants the equivalent knowledge to a quality stats degree

I know people love to close duplicates so I am not asking for a reference to start learning statistics (as here). I have a doctorate in mathematics but never learned statistics. What is the shortest route to the equivalent knowledge to a top-notch…

references careers

asked Jan 25 '11 at 19:03

John Robertson

973
3
15
25

84

votes

9 answers

Why is it possible to get significant F statistic (p<.001) but non-significant regressor t-tests?

In a multiple linear regression, why is it possible to have a highly significant F statistic (p<.001) but have very high p-values on all the regressors' t-tests? In my model, there are 10 regressors. One has a p-value of 0.1 and the rest are above…

regression hypothesis-testing t-test multicollinearity canonical-question

asked Oct 13 '10 at 09:40

Ηλίας

1,439
3
15
16

84

votes

4 answers

What're the differences between PCA and autoencoder?

Both PCA and autoencoders can do dimension reduction, so what is the difference between them? In what situation should I use one over the other?

machine-learning pca neural-networks autoencoders

asked Oct 15 '14 at 07:26

RockTheStar

11,277
31
63
89

83

votes

5 answers

Mutual information versus correlation

Why and when should we use mutual information over statistical correlation measures such as Pearson's, Spearman's, or Kendall's tau?

correlation mathematical-statistics mutual-information

asked Jan 08 '14 at 20:59

SaZa

975
1
7
6

83

votes

11 answers

What are disadvantages of using the lasso for variable selection for regression?

From what I know, using lasso for variable selection handles the problem of correlated inputs. Also, since it is equivalent to Least Angle Regression, it is not slow computationally. However, many people (for example people I know doing…

regression feature-selection lasso

asked Mar 06 '11 at 23:21

xuexue

2,098
2
16
11

83

votes

4 answers

How to visualize what canonical correlation analysis does (in comparison to what principal component analysis does)?

Canonical correlation analysis (CCA) is a technique related to principal component analysis (PCA). While it is easy to teach PCA or linear regression using a scatter plot (see a few thousand examples on google image search), I have not seen a…

regression data-visualization pca canonical-correlation geometry

asked Jul 26 '13 at 20:28

figure

933
2
7
6

83

votes

28 answers

Examples for teaching: Correlation does not mean causation

There is an old saying: "Correlation does not mean causation". When I teach, I tend to use the following standard examples to illustrate this point: number of storks and birth rate in Denmark; number of priests in America and alcoholism; in the…

correlation teaching

asked Jul 19 '10 at 19:31

csgillespie

11,849
9
56
85

83

votes

14 answers

When (if ever) is a frequentist approach substantively better than a Bayesian?

Background: I do not have any formal training in Bayesian statistics (though I am very interested in learning more), but I know enough--I think--to get the gist of why many feel as though they are preferable to Frequentist statistics. Even the…

bayesian frequentist philosophical

asked Feb 04 '16 at 16:27

jsakaluk

5,006
1
20
45

82

votes

7 answers

How to generate uniformly distributed points on the surface of the 3-d unit sphere?

I am wondering how to generate uniformly distributed points on the surface of the 3-d unit sphere? Also after generating those points, what is the best way to visualize and check whether they are truly uniform on the surface $x^2+y^2+z^2=1$?

random-generation

asked Mar 07 '11 at 22:57

Qiang Li

1,145
2
9
10

82

votes

1 answer

Help me understand Support Vector Machines

I understand the basics of what a Support Vector Machine's aim is in terms of classifying an input set into several different classes, but what I don't understand is some of the nitty-gritty details. For starters, I'm a bit confused by the use of…

machine-learning classification svm

asked Oct 24 '10 at 15:11

rohanbk

1,187
1
10
10
