Highest Voted Questions - Statistical Analysis Stack Exchange

46

votes

2 answers

When will L1 regularization work better than L2 and vice versa?

Note: I know that L1 has a feature selection property. I am trying to understand which one to choose when feature selection is completely irrelevant. How do I decide which regularization (L1 or L2) to use? What are the pros & cons of each of L1 / L2…

regression lasso regularization ridge-regression

asked Nov 28 '15 at 16:57

GeorgeOfTheRF

5,063
14
42
51

46

votes

4 answers

Would PCA work for boolean (binary) data types?

I want to reduce the dimensionality of higher-order systems and capture most of the covariance on a preferably 2-dimensional or 1-dimensional field. I understand this can be done via principal component analysis, and I have used PCA in many…

pca data-visualization binary-data dimensionality-reduction correspondence-analysis

asked Jul 02 '15 at 21:20

Alvin Nunez

647
1
7
8

46

votes

5 answers

Why is multiple comparison a problem?

I find it hard to understand what the issue with multiple comparisons really is. By a simple analogy, it is said that a person who makes many decisions will make many mistakes. So a very conservative precaution is applied, like Bonferroni…

hypothesis-testing multiple-comparisons

asked Aug 09 '10 at 18:03

AgCl

603
5
6

46

votes

5 answers

Statistical models cheat sheet

I was wondering if there is a statistical model "cheat sheet(s)" that lists any or more of the following information: when to use the model; when not to use the model; required and optional inputs; expected outputs; whether the model has been tested in different fields…

references modeling

asked Aug 04 '10 at 16:39

dassouki

1,219
1
17
25

46

votes

5 answers

Why are regression problems called "regression" problems?

I was just wondering why regression problems are called "regression" problems. What is the story behind the name? One definition for regression: "Relapse to a less perfect or developed state."

regression terminology history etymology

asked May 21 '11 at 18:25

Fabian

1,341
4
12
13

46

votes

22 answers

Are there any good movies involving mathematics or probability?

Can you suggest some good movies which involve math, probabilities etc? One example is 21. I would also be interested in movies that involve algorithms (e.g. text decryption). In general "geeky" movies with famous scientific theories but no science…

probability references

asked May 07 '11 at 11:13

Siato

101
1
2
4

45

votes

5 answers

Statistics published in academic papers

I read a lot of evolutionary/ecological academic papers, sometimes with the specific aim of seeing how statistics are being used 'in the real world' outside of the textbook. I normally take the statistics in papers as gospel and use the papers to…

publication-bias academia

asked Apr 02 '14 at 08:08

luciano

12,197
30
87
119

45

votes

2 answers

PP-plots vs. QQ-plots

What is the difference between probability plots, PP-plots and QQ-plots when trying to analyse a fitted distribution to data?

probability data-visualization goodness-of-fit qq-plot

asked Apr 01 '14 at 14:23

kay

581
1
4
3

45

votes

8 answers

Rigorous definition of an outlier?

People often talk about dealing with outliers in statistics. The thing that bothers me about this is that, as far as I can tell, the definition of an outlier is completely subjective. For example, if the true distribution of some random variable…

outliers definition

asked Feb 13 '11 at 15:07

dsimcha

7,375
7
32
29

45

votes

3 answers

Whether to rescale indicator / binary / dummy predictors for LASSO

For the LASSO (and other model selection procedures) it is crucial to rescale the predictors. The general recommendation I follow is simply to use a 0 mean, 1 standard deviation normalization for continuous variables. But what is there to do with…

predictive-models model-selection lasso normalization standardization

asked Sep 09 '13 at 14:46

László

897
1
7
17

45

votes

1 answer

Comparing two models using anova() function in R

From the documentation for anova(): When given a sequence of objects, ‘anova’ tests the models against one another in the order specified... What does it mean to test the models against one another? And why does the order matter? Here is an…

r anova

asked Mar 26 '13 at 10:01

qed

2,508
3
21
33

45

votes

6 answers

How seriously should I think about the different philosophies of statistics?

I've just finished a module where we covered the different approaches to statistical problems – mainly Bayesian vs frequentist. The lecturer also announced that she is a frequentist. We covered some paradoxes and generally the quirks of each…

bayesian frequentist philosophical

asked Apr 05 '21 at 16:21

DerrYe

403
2
5

45

votes

3 answers

Reviewing statistics in papers

For some of us, refereeing papers is part of the job. When refereeing statistical methodology papers, I think advice from other subject areas is fairly useful, i.e. computer science and Maths. This question concerns reviewing more applied…

references referee

asked Oct 10 '10 at 09:55

csgillespie

11,849
9
56
85

45

votes

2 answers

Determining sample size necessary for bootstrap method / Proposed Method

I know this is a rather hot topic for which no one can really give a simple answer. Nevertheless, I am wondering whether the following approach could be useful. The bootstrap method is only useful if your sample follows more or less (read exactly) the…

bootstrap sample-size methodology

asked Jul 29 '12 at 14:02

siegfried

469
1
5
4

45

votes

4 answers

How to weight KLD loss vs reconstruction loss in variational auto-encoder

In nearly all code examples I've seen of a VAE, the loss functions are defined as follows (this is TensorFlow code, but I've seen similar for Theano, Torch, etc. It's also for a convnet, but that's not too relevant; it just affects the axes the…

machine-learning deep-learning tensorflow autoencoders variational-bayes

asked Mar 07 '18 at 10:19

memo

859
1
7
10
