Highest Voted Questions - Statistical Analysis Stack Exchange

32

votes

5 answers

How should an individual researcher think about the false discovery rate?

I've been trying to wrap my head around how the False Discovery Rate (FDR) should inform the conclusions of the individual researcher. For example, if your study is underpowered, should you discount your results even if they're significant at…

statistical-significance p-value publication-bias false-discovery-rate

asked Apr 28 '15 at 00:52

Richard Border

1,128
9
26

32

votes

3 answers

Outlier Detection on skewed Distributions

Under a classical definition of an outlier as a data point more than 1.5 × IQR beyond the upper or lower quartile, there is an assumption of a non-skewed distribution. For skewed distributions (Exponential, Poisson, Geometric, etc) is the best way to…
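For reference, the classical 1.5 × IQR rule mentioned in this excerpt can be sketched as follows. This is only a minimal illustration using Python's standard library; the function names `iqr_fences` and `flag_outliers`, the sample data, and the choice of the "inclusive" quantile method are assumptions made for the example, not part of the question. As the excerpt notes, the rule implicitly assumes a roughly symmetric distribution, so on skewed data it tends to flag only one tail.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical right-skewed sample: one large value in the upper tail.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
print(iqr_fences(sample))
print(flag_outliers(sample))
```

Different quantile conventions give slightly different fences on small samples, so the exact cut-offs should not be read as canonical.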

distributions outliers skewness exponential-distribution interquartile

asked Dec 16 '14 at 05:40

Eric

321
1
3
3

32

votes

1 answer

Equivalence between least squares and MLE in Gaussian model

I am new to Machine Learning, and am trying to learn it on my own. Recently I was reading through some lecture notes and had a basic question. Slide 13 says that "Least Square Estimate is same as Maximum Likelihood Estimate under a Gaussian model".…
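A brief sketch of why the quoted statement holds, under an assumed notation that does not come from the lecture notes: suppose $y_i = x_i^\top \beta + \varepsilon_i$ with independent $\varepsilon_i \sim N(0, \sigma^2)$ for $i = 1, \dots, n$. The log-likelihood is then

$$\ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i^\top \beta\right)^2 .$$

For any fixed $\sigma^2 > 0$, only the second term depends on $\beta$, so maximizing $\ell$ over $\beta$ is the same as minimizing $\sum_{i=1}^{n}(y_i - x_i^\top \beta)^2$, which is the least squares criterion.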

regression bayesian least-squares

asked Jul 01 '11 at 21:31

Andy

1,583
3
21
19

32

votes

5 answers

When should I apply feature scaling for my data

I had a discussion with a colleague and we started to wonder, when should one apply feature normalization / scaling to the data? Let's say we have a set of features with some of the features having a very broad range of values and some features…

machine-learning classification normalization k-nearest-neighbour scales

asked Oct 29 '14 at 09:00

jjepsuomi

5,207
11
34
47

32

votes

2 answers

Generating data with a given sample covariance matrix

Given a covariance matrix $\boldsymbol \Sigma_s$, how to generate data such that it would have the sample covariance matrix $\hat{\boldsymbol \Sigma} = \boldsymbol \Sigma_s$? More generally: we are often interested in generating data from a density…

correlation sampling random-generation covariance-matrix

asked Oct 15 '14 at 17:35

Kees Mulder

1,414
1
10
10

32

votes

2 answers

Convolutional neural networks: Aren't the central neurons over-represented in the output?

[This question was also posed at Stack Overflow] The question in short: I'm studying convolutional neural networks, and I believe that these networks do not treat every input neuron (pixel/parameter) equivalently. Imagine we have a deep network…

machine-learning neural-networks convolution

asked Oct 13 '14 at 08:39

Koen

421
3
4

32

votes

2 answers

Produce a list of variable names in a for loop, then assign values to them

I wonder if there is a simple way to produce a list of variables using a for loop and assign values to them: for(i in 1:3) { noquote(paste("a",i,sep=""))=i } In the above code, I try to create a1, a2, a3 and assign them the values 1, 2, 3. However,…

r

asked May 16 '11 at 00:17

Han Lin Shang

321
1
4
3

32

votes

2 answers

Does it make sense to combine PCA and LDA?

Assume I have a dataset for a supervised statistical classification task, e.g., via a Bayes' classifier. This dataset consists of 20 features and I want to boil it down to 2 features via dimensionality reduction techniques such as Principal…

classification pca regularization discriminant-analysis overfitting

asked Jul 07 '14 at 23:25

user39663

31

votes

2 answers

Is the exact value of a 'p-value' meaningless?

I had a discussion with a statistician back in 2009 where he stated that the exact value of a p-value is irrelevant: the only thing that is important is whether it is significant or not. I.e. one result cannot be more significant than another; your…

statistical-significance p-value bonferroni

asked Apr 24 '14 at 05:58

Mark Ramotowski

519
6
17

31

votes

3 answers

How can stochastic gradient descent avoid the problem of a local minimum?

I know that stochastic gradient descent has random behavior, but I don't know why. Is there any explanation about this?

machine-learning random-variable gradient-descent

asked Mar 21 '14 at 14:32

SunshineAtNoon

503
1
5
9

31

votes

3 answers

What is the proper name for a "river plot" visualisation

In a famous plot, Charles Minard visualised the losses of the French Army in the Russian campaign of Napoleon: (another nice example is this xkcd plot) Is there a canonical name for this type of visualisation? I'm actually looking for an R package…

data-visualization sankey-diagram

asked Feb 19 '14 at 10:59

January

6,999
1
32
55

31

votes

2 answers

Choosing optimal alpha in elastic net logistic regression

I'm performing an elastic-net logistic regression on a health care dataset using the glmnet package in R by selecting lambda values over a grid of $\alpha$ from 0 to 1. My abbreviated code is below: alphalist <- seq(0,1,by=0.1) elasticnet <-…

machine-learning cross-validation glmnet elastic-net

asked Jan 31 '14 at 17:03

RobertF

4,380
6
29
46

31

votes

2 answers

Why is the Expectation Maximization algorithm guaranteed to converge to a local optimum?

I have read a couple of explanations of the EM algorithm (e.g. from Bishop's Pattern Recognition and Machine Learning and from Rogers and Girolami's A First Course in Machine Learning). The derivation of EM is ok, I understand it. I also understand why the…

missing-data convergence expectation-maximization

asked Jan 26 '14 at 14:09

michal

1,138
3
11
14

31

votes

5 answers

How to test and avoid multicollinearity in mixed linear model?

I am currently running some mixed effect linear models. I am using the package "lme4" in R. My models take the form: model <- lmer(response ~ predictor1 + predictor2 + (1 | random effect)) Before running my models, I checked for possible…

r correlation mixed-model lme4-nlme multicollinearity

asked Jan 22 '14 at 06:24

mjburns

1,077
3
12
16

31

votes

3 answers

Why use Lasso estimates over OLS estimates on the Lasso-identified subset of variables?

For Lasso regression $$L(\beta)=(X\beta-y)'(X\beta-y)+\lambda\|\beta\|_1,$$ suppose the best solution (minimum testing error for example) selects $k$ features, so that…

regression feature-selection lasso regularization

asked Jan 16 '14 at 14:19

yliueagle

755
2
6
10
