Most Popular (1500 questions)
38 votes · 8 answers
Is it OK to remove outliers from data?
I looked for a way to remove outliers from a dataset and I found this question.
In some of the comments and answers to this question, however, people mentioned that it is bad practice to remove outliers from the data.
In my dataset I have several…

Sininho (501 rep · badges 1/4/7)
38 votes · 3 answers
What is pre-training a neural network?
Well, the question says it all.
What is meant by "pre-training a neural network"? Can someone explain in plain, simple English?
I can't seem to find any resources on it; it would be great if someone could point me to some.

Machina333 (863 rep · badges 2/9/10)
38 votes · 7 answers
Is there an accepted definition for the median of a sample on the plane, or higher ordered spaces?
If so, what?
If not, why not?
For a sample on the line, the median minimizes the total absolute deviation. It would seem natural to extend the definition to $\mathbb{R}^2$, etc., but I've never seen it. But then, I've been out in left field for a long time.

phv3773 (481 rep · badges 4/4)
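One natural extension the question above is asking about is the geometric (spatial) median: the point minimizing the total Euclidean distance to the sample, which reduces to the ordinary median on the line. A minimal sketch of Weiszfeld's algorithm, the standard iteration for computing it (function and variable names are mine):

```python
import numpy as np

def geometric_median(points, tol=1e-8, max_iter=1000):
    """Weiszfeld's algorithm: an iteratively re-weighted mean that
    converges to the point minimizing the sum of Euclidean distances."""
    pts = np.asarray(points, dtype=float)
    guess = pts.mean(axis=0)  # start from the centroid
    for _ in range(max_iter):
        dists = np.linalg.norm(pts - guess, axis=1)
        if np.any(dists < tol):  # guess coincides with a data point
            return guess
        weights = 1.0 / dists
        new_guess = (weights[:, None] * pts).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_guess - guess) < tol:
            return new_guess
        guess = new_guess
    return guess

# For four points in convex position, the geometric median is the
# intersection of the diagonals; here that is (40/19, 24/19).
gm = geometric_median([[0, 0], [4, 0], [5, 3], [1, 2]])
```

Unlike the coordinate-wise median, this point is rotation-invariant, which is why it is the usual answer to "median in the plane".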
38 votes · 3 answers
Do null and alternative hypotheses have to be exhaustive or not?
I have often seen claims that they must be exhaustive (and the examples in such books were always set up so that they indeed were); on the other hand, I have also often seen books stating that they should be exclusive (for example…

greenoldman (593 rep · badges 5/10)
38 votes · 1 answer
Why do we need to normalize the images before we put them into CNN?
I am not clear on why we normalise images for a CNN as (image - mean_image). Thanks!

Zhi Lu (717 rep · badges 3/8/11)
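The (image - mean_image) step in the question above is plain mean subtraction: zero-centring the inputs so early-layer activations and gradients sit on a comparable scale. A toy sketch with synthetic data (array shapes and names are mine):

```python
import numpy as np

# Toy "dataset": 100 RGB images of size 8x8, pixel values in [0, 255].
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 8, 8, 3)).astype(np.float32)

# mean_image is the pixel-wise average over the training set.
mean_image = images.mean(axis=0)

# Zero-centre every image: inputs become roughly symmetric around 0
# instead of all-positive, which tends to make optimization better behaved.
normalized = images - mean_image
```

The same `mean_image` (computed on the training set only) would also be subtracted from validation and test images.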
38 votes · 7 answers
Why is the null hypothesis often sought to be rejected?
I hope I am making sense with the title. Often, the null hypothesis is formed with the intention of rejecting it. Is there a reason for this, or is it just a convention?

Prometheus (786 rep · badges 8/19)
38 votes · 7 answers
Should parsimony really still be the gold standard?
Just a thought:
Parsimonious models have always been the default go-to in model selection, but to what degree is this approach outdated? I'm curious about how much our tendency toward parsimony is a relic of a time of abaci and slide rules (or, more…

theforestecologist (1,777 rep · badges 3/21/40)
38 votes · 1 answer
Doing principal component analysis or factor analysis on binary data
I have a dataset with a large number of Yes/No responses. Can I use principal component analysis (PCA) or another data-reduction method (such as factor analysis) on this type of data? Please advise on how to do this in SPSS.

Cathy (381 rep · badges 1/4/3)
38 votes · 5 answers
Do working statisticians care about the difference between frequentist and Bayesian inference?
As an outsider, it appears that there are two competing views on how one should perform statistical inference.
Are the two different methods both considered valid by working statisticians?
Is choosing one considered more of a philosophical…

Jonathan Fischoff (231 rep · badges 3/7)
38 votes · 3 answers
Do we need gradient descent to find the coefficients of a linear regression model?
I was trying to learn machine learning using the Coursera material. In this lecture, Andrew Ng uses the gradient descent algorithm to find the coefficients of the linear regression model that minimize the error function (cost function).
For linear…

Victor (5,925 rep · badges 13/43/67)
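The short answer to the question above is no: ordinary least squares has a closed-form solution (the normal equations), and gradient descent is one of several alternatives that scale better to huge or streaming data. A sketch checking that both routes give the same coefficients on simulated data (the data-generating setup is mine):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])  # intercept + 2 features
true_beta = np.array([3.0, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=200)

# Closed form: beta = (X'X)^{-1} X'y; lstsq solves it in a numerically stable way.
beta_closed, *_ = np.linalg.lstsq(X, y, rcond=None)

# Plain batch gradient descent on the mean-squared-error cost.
beta_gd = np.zeros(3)
lr = 0.1
for _ in range(2000):
    grad = X.T @ (X @ beta_gd - y) / len(y)
    beta_gd -= lr * grad

print(np.allclose(beta_closed, beta_gd, atol=1e-4))  # True: the two agree
```

Gradient descent earns its keep when X'X is too large to factor, or when the loss (e.g. in logistic regression or neural networks) has no closed-form minimizer.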
38 votes · 3 answers
Regression coefficients that flip sign after including other predictors
Imagine:
- You run a linear regression with four numeric predictors (IV1, ..., IV4).
- When only IV1 is included as a predictor, the standardised beta is +.20.
- When you also include IV2 to IV4, the sign of the standardised regression coefficient of IV1…

Jeromy Anglim (42,044 rep · badges 23/146/250)
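The sign flip in the question above (a suppression/confounding effect) is easy to reproduce by simulation: let a second predictor drive both IV1 and the outcome, while IV1's direct effect is slightly negative. A sketch with a data-generating process of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# IV2 drives both IV1 and y; IV1 itself has a small *negative* direct effect.
iv2 = rng.normal(size=n)
iv1 = 0.8 * iv2 + rng.normal(scale=0.6, size=n)
y = -0.2 * iv1 + 1.0 * iv2 + rng.normal(scale=1.0, size=n)

def slopes(X, y):
    """OLS slopes (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

b_simple = slopes(iv1.reshape(-1, 1), y)[0]                # IV1 alone: positive
b_full = slopes(np.column_stack([iv1, iv2]), y)[0]         # with IV2: negative
print(b_simple > 0, b_full < 0)
```

Marginally, IV1 inherits IV2's positive effect (cov(IV1, y) = -0.2 + 0.8 = 0.6 > 0); once IV2 is held fixed, only IV1's direct effect of -0.2 remains, so the coefficient flips sign.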
38 votes · 4 answers
Information gain, mutual information and related measures
Andrew More defines information gain as:
$IG(Y|X) = H(Y) - H(Y|X)$
where $H(Y|X)$ is the conditional entropy. However, Wikipedia calls the above quantity mutual information.
Wikipedia on the other hand defines information gain as the…

Amelio Vazquez-Reina (17,546 rep · badges 26/74/110)
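The two quantities in the excerpt above are in fact the same: $H(Y) - H(Y|X)$ equals the symmetric mutual information $\sum_{x,y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)}$. A quick numerical check on a toy joint distribution (the probability table is mine):

```python
import numpy as np

# Toy joint distribution p(x, y) over X in {0,1} (rows), Y in {0,1} (cols).
p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])

def H(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# Information gain: IG(Y|X) = H(Y) - sum_x p(x) H(Y | X = x)
H_y_given_x = sum(p_x[i] * H(p_xy[i] / p_x[i]) for i in range(2))
ig = H(p_y) - H_y_given_x

# Symmetric form: I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
mi = (p_xy * np.log2(p_xy / np.outer(p_x, p_y))).sum()

print(abs(ig - mi) < 1e-12)  # True: the two definitions coincide
```

So "information gain" (the decision-tree term) and "mutual information" (the information-theory term) name the same quantity; the naming just differs by community.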
38 votes · 4 answers
When is the bootstrap estimate of bias valid?
It is often claimed that bootstrapping can provide an estimate of the bias in an estimator.
If $\hat t$ is the estimate for some statistic, and $\tilde t_i$ are the bootstrap replicas (with $i\in\{1,\cdots,N\}$), then the bootstrap estimate of bias…

Bootstrapped (381 rep · badges 1/3/5)
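The bias estimate the excerpt above refers to is $\widehat{\text{bias}} = \frac{1}{N}\sum_i \tilde t_i - \hat t$. A sketch using the plug-in (1/n) variance estimator, whose downward bias the bootstrap should detect (sample size, seed, and N are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=50)

t_hat = np.var(x)  # plug-in (1/n) variance estimator, biased low by a factor (n-1)/n

# Bootstrap replicas: recompute the statistic on resamples drawn with replacement.
N = 5000
replicas = np.array([np.var(rng.choice(x, size=len(x), replace=True))
                     for _ in range(N)])

bias_boot = replicas.mean() - t_hat   # expected to be negative here
t_corrected = t_hat - bias_boot       # bias-corrected estimate
```

For this statistic the bootstrap expectation is exactly ((n-1)/n) t_hat, so the estimated bias is about -t_hat/n (plus Monte Carlo noise), matching the known analytic bias.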
38 votes · 7 answers
How to interpret the coefficient of variation?
I am trying to understand the Coefficient of Variation. When I try to apply it to the following two samples of data I am unable to understand how to interpret the results.
Let's say sample 1 is ${0, 5, 7, 12, 11, 17}$
and sample 2 is ${10, 15, 17…

Durin (964 rep · badges 2/7/17)
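The coefficient of variation is simply CV = s / x̄: the standard deviation expressed as a fraction of the mean, so it is unit-free. Computing it for sample 1 from the excerpt above (sample 2 is truncated, so I only illustrate the key property, shift-sensitivity, with a hypothetical shifted copy of sample 1):

```python
import numpy as np

sample1 = np.array([0, 5, 7, 12, 11, 17], dtype=float)

def cv(x):
    # Sample standard deviation (ddof=1) divided by the mean.
    return x.std(ddof=1) / x.mean()

print(round(cv(sample1), 3))  # 0.687

# CV is not shift-invariant: adding a constant leaves the spread (sd)
# unchanged but increases the mean, so the CV shrinks.
shifted = sample1 + 10  # hypothetical; the question's sample 2 begins 10, 15, 17…
print(cv(shifted) < cv(sample1))  # True: same sd, larger mean
```

This is why the CV only makes sense for ratio-scale data with a meaningful zero: for data that differ only by a location shift, the two samples are "equally spread out" yet their CVs differ.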
38 votes · 4 answers
What are the differences between sparse coding and autoencoder?
Sparse coding is defined as learning an over-complete set of basis vectors to represent input vectors (why would we want this?). What are the differences between sparse coding and an autoencoder? When would we use sparse coding, and when an autoencoder?

RockTheStar (11,277 rep · badges 31/63/89)