Most Popular
1500 questions
32
votes
5 answers
How should an individual researcher think about the false discovery rate?
I've been trying to wrap my head around how the False Discovery Rate (FDR) should inform the conclusions of the individual researcher. For example, if your study is underpowered, should you discount your results even if they're significant at…

Richard Border
- 1,128
- 9
- 26
32
votes
3 answers
Outlier Detection on skewed Distributions
Under a classical definition of an outlier as a data point outside 1.5*IQR from the upper or lower quartile, there is an assumption of a non-skewed distribution. For skewed distributions (Exponential, Poisson, Geometric, etc.) is the best way to…

Eric
- 321
- 1
- 3
- 3
32
votes
1 answer
Equivalence between least squares and MLE in Gaussian model
I am new to Machine Learning, and am trying to learn it on my own. Recently I was reading through some lecture notes and had a basic question.
Slide 13 says that "Least Square Estimate is same as Maximum Likelihood Estimate under a Gaussian model".…

Andy
- 1,583
- 3
- 21
- 19
32
votes
5 answers
When should I apply feature scaling for my data
I had a discussion with a colleague and we started to wonder, when should one apply feature normalization / scaling to the data? Let's say we have a set of features with some of the features having a very broad range of values and some features…

jjepsuomi
- 5,207
- 11
- 34
- 47
32
votes
2 answers
Generating data with a given sample covariance matrix
Given a covariance matrix $\boldsymbol \Sigma_s$, how can one generate data whose sample covariance matrix is $\hat{\boldsymbol \Sigma} = \boldsymbol \Sigma_s$?
More generally: we are often interested in generating data from a density…

Kees Mulder
- 1,414
- 1
- 10
- 10
32
votes
2 answers
Convolutional neural networks: Aren't the central neurons over-represented in the output?
[This question was also posed at Stack Overflow]
The question in short
I'm studying convolutional neural networks, and I believe that these networks do not treat every input neuron (pixel/parameter) equivalently. Imagine we have a deep network…

Koen
- 421
- 3
- 4
32
votes
2 answers
Produce a list of variable names in a for loop, then assign values to them
I wonder if there is a simple way to produce a list of variables using a for loop and assign a value to each.
for(i in 1:3)
{
noquote(paste("a",i,sep=""))=i
}
In the above code, I try to create a1, a2, a3 and assign them the values 1, 2, 3. However,…

Han Lin Shang
- 321
- 1
- 4
- 3
32
votes
2 answers
Does it make sense to combine PCA and LDA?
Assume I have a dataset for a supervised statistical classification task, e.g., via a Bayes' classifier. This dataset consists of 20 features and I want to boil it down to 2 features via dimensionality reduction techniques such as Principal…
user39663
31
votes
2 answers
Is the exact value of a 'p-value' meaningless?
I had a discussion with a statistician back in 2009 where he stated that the exact value of a p-value is irrelevant: the only thing that is important is whether it is significant or not. I.e. one result cannot be more significant than another; your…

Mark Ramotowski
- 519
- 6
- 17
31
votes
3 answers
How can stochastic gradient descent avoid the problem of a local minimum?
I know that stochastic gradient descent has random behavior, but I don't know why.
Is there any explanation about this?

SunshineAtNoon
- 503
- 1
- 5
- 9
31
votes
3 answers
What is the proper name for a "river plot" visualisation
In a famous plot, Charles Minard visualised the losses of the French Army in the Russian campaign of Napoleon:
(another nice example is this xkcd plot)
Is there a canonical name for this type of visualisation? I'm actually looking for an R package…

January
- 6,999
- 1
- 32
- 55
31
votes
2 answers
Choosing optimal alpha in elastic net logistic regression
I'm performing an elastic-net logistic regression on a health care dataset using the glmnet package in R by selecting lambda values over a grid of $\alpha$ from 0 to 1. My abbreviated code is below:
alphalist <- seq(0,1,by=0.1)
elasticnet <-…

RobertF
- 4,380
- 6
- 29
- 46
31
votes
2 answers
Why is the Expectation Maximization algorithm guaranteed to converge to a local optimum?
I have read a couple of explanations of the EM algorithm (e.g. from Bishop's Pattern Recognition and Machine Learning and from Rogers and Girolami's A First Course in Machine Learning). The derivation of EM is ok, I understand it. I also understand why the…

michal
- 1,138
- 3
- 11
- 14
31
votes
5 answers
How to test and avoid multicollinearity in mixed linear model?
I am currently running some mixed effect linear models.
I am using the package "lme4" in R.
My models take the form:
model <- lmer(response ~ predictor1 + predictor2 + (1 | random effect))
Before running my models, I checked for possible…

mjburns
- 1,077
- 3
- 12
- 16
31
votes
3 answers
Why use Lasso estimates over OLS estimates on the Lasso-identified subset of variables?
For Lasso regression $$L(\beta)=(X\beta-y)'(X\beta-y)+\lambda\|\beta\|_1,$$ suppose the best solution (minimum testing error for example) selects $k$ features, so that…

yliueagle
- 755
- 2
- 6
- 10