Most Popular
1500 questions
47
votes
1 answer
Is regression with L1 regularization the same as Lasso, and with L2 regularization the same as ridge regression? And how to write "Lasso"?
I'm a software engineer learning machine learning, particularly through Andrew Ng's machine learning courses. While studying linear regression with regularization, I've found terms that are confusing:
Regression with L1 regularization or L2…

stackoverflowuser2010
47
votes
6 answers
Why zero correlation does not necessarily imply independence
If two variables have 0 correlation, why are they not necessarily independent? Are zero-correlated variables independent under special circumstances? If possible, I am looking for an intuitive explanation, not a highly technical one.

Victor
47
votes
8 answers
What are the cons of Bayesian analysis?
What are some practical objections to the use of Bayesian statistical methods in any context? No, I don't mean the usual carping about choice of prior. I'll be delighted if this gets no answers.
user6666
47
votes
5 answers
Lift measure in data mining
I searched many websites to find out what exactly lift does. The results I found were all about using it in applications, not about the measure itself.
I know about the support and confidence functions. From Wikipedia, in data mining, lift is a measure of the…

Nickool
47
votes
2 answers
Hierarchical clustering with mixed type data - what distance/similarity to use?
In my dataset we have both continuous and naturally discrete variables. I want to know whether we can do hierarchical clustering using both types of variables. And if so, what distance measure is appropriate?

Beta
47
votes
6 answers
Importance of local response normalization in CNN
I've found that ImageNet and other large CNNs make use of local response normalization layers. However, I cannot find much information about them. How important are they and when should they be used?
From…

pir
47
votes
4 answers
Bayesian equivalent of two sample t-test?
I'm not looking for a plug-and-play method like BEST in R, but rather a mathematical explanation of some Bayesian methods I can use to test the difference between the means of two samples.

John
47
votes
5 answers
Optimized implementations of the Random Forest algorithm
I have noticed that there are a few implementations of random forest such as ALGLIB, Waffles and some R packages like randomForest. Can anybody tell me whether these libraries are highly optimized? Are they basically equivalent to the random…

Henry B.
46
votes
5 answers
Using R online - without installing it
Is it possible to use R in a web interface without installing it?
I have only one small script which I'd like to run, but I just want to give it a shot without a long installation procedure.
Thank you.

vonjd
46
votes
1 answer
Negative values for AIC in General Mixed Model
I'm trying to select the best model by the AIC in the General Mixed Model test. The best model is the model with the lowest AIC, but all my AICs are negative!
So is the biggest negative AIC the lowest value?
Or is the smallest negative AIC the…

Josephine van Nieuwenhuizen
46
votes
2 answers
Confidence interval around binomial estimate of 0 or 1
What is the best technique to calculate a confidence interval for a binomial experiment if your estimate is $p=0$ (or similarly $p=1$) and the sample size is relatively small, for example $n=25$?

Kasper
46
votes
3 answers
What is the relationship between the mean squared error and the residual sum of squares function?
Looking at the Wikipedia definitions of:
Mean Squared Error (MSE)
Residual Sum of Squares (RSS)
It looks to me that
$$\text{MSE} = \frac{1}{N} \text{RSS} = \frac{1}{N} \sum (f_i -y_i)^2$$
where $N$ is the number of samples and $f_i$ is our…

Josh
46
votes
5 answers
Understanding regressions - the role of the model
How can a regression model be of any use if you don't know the function you are trying to get the parameters for?
I saw a piece of research that said that mothers who breastfed their children were less likely to suffer from diabetes in later life. The…

Jonathan Andrews
46
votes
2 answers
How to interpret p-value of Kolmogorov-Smirnov test (python)?
I have two samples and I want to test (using Python) whether they are drawn from the same distribution. To do that I use the statistical function ks_2samp from scipy.stats. It returns 2 values and I find it difficult to interpret them.
Help please!

meri
46
votes
10 answers
Why do people use p-values instead of computing probability of the model given data?
Roughly speaking, a p-value gives a probability of the observed outcome of an experiment given the hypothesis (model). Having this probability (p-value), we want to judge our hypothesis (how likely it is). But wouldn't it be more natural to calculate…

Roman