Highest Voted Questions - Statistical Analysis Stack Exchange

52

votes

3 answers

Regularization methods for logistic regression

Regularization using methods such as Ridge, Lasso, ElasticNet is quite common for linear regression. I wanted to know the following: Are these methods applicable for logistic regression? If so, are there any differences in the way they need to be…

regression logistic regularization

asked Aug 08 '16 at 10:29

Tapan Khopkar

796
2
7
9

52

votes

3 answers

Understanding Naive Bayes

From StatSoft, Inc. (2013), Electronic Statistics Textbook, "Naive Bayes Classifier": To demonstrate the concept of Naïve Bayes Classification, consider the example displayed in the illustration above. As indicated, the objects can be…

machine-learning naive-bayes

asked Jan 27 '12 at 17:29

G Gr

981
2
8
15

52

votes

1 answer

Understanding "almost all local minimum have very similar function value to the global optimum"

In a recent blog post by Rong Ge, it was said that: It is believed that for many problems including learning deep nets, almost all local minimum have very similar function value to the global optimum, and hence finding a local minimum is good…

machine-learning neural-networks optimization deep-learning

asked Mar 23 '16 at 17:02

John Donn

621
1
6
8

52

votes

1 answer

Why do my p-values differ between logistic regression output, chi-squared test, and the confidence interval for the OR?

I have built a logistic regression where the outcome variable is being cured after receiving treatment (Cure vs. No Cure). All patients in this study received treatment. I am interested in seeing if having diabetes is associated with this outcome.…

r hypothesis-testing logistic generalized-linear-model odds-ratio

asked Apr 02 '15 at 19:25

SniperBro2000

720
1
6
8

51

votes

4 answers

Kullback–Leibler vs Kolmogorov-Smirnov distance

I can see that there are a lot of formal differences between Kullback–Leibler vs Kolmogorov-Smirnov distance measures. However, both are used to measure the distance between distributions. Is there a typical situation where one should be used…

distributions distance-functions kolmogorov-smirnov-test kullback-leibler

asked Apr 07 '11 at 11:39

Greg

613
1
5
7

51

votes

2 answers

What is quasi-binomial distribution (in the context of GLM)?

I'm hoping someone can provide an intuitive overview of what the quasibinomial distribution is and what it does. I'm particularly interested in these points: How quasibinomial differs from the binomial distribution. When the response variable is a…

r generalized-linear-model binomial-distribution overdispersion quasi-likelihood

asked Mar 28 '14 at 16:56

luciano

12,197
30
87
119

51

votes

3 answers

Online vs offline learning?

What is the difference between offline and online learning? Is it just a matter of learning over the entire dataset (offline) vs. learning incrementally (one instance at a time)? What are examples of algorithms used in both?

machine-learning online-algorithms

asked Jul 28 '10 at 13:32

griffin

785
2
7
8

51

votes

3 answers

Which has the heavier tail, lognormal or gamma?

(This is based on a question that just came to me via email; I've added some context from a previous brief conversation with the same person.) Last year I was told that the gamma distribution is heavier tailed than the lognormal, and I've since been…

distributions gamma-distribution lognormal-distribution heavy-tailed

asked Feb 13 '14 at 06:01

Glen_b

257,508
32
553
939

51

votes

5 answers

Relationship between $R^2$ and correlation coefficient

Let's say I have two 1-dimensional arrays, $a_1$ and $a_2$. Each contains 100 data points. $a_1$ is the actual data, and $a_2$ is the model prediction. In this case, the $R^2$ value would be: $$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}}…

correlation r-squared

asked Jan 25 '14 at 21:01

Shawn Wang

1,245
3
12
12

51

votes

4 answers

What is difference-in-differences?

Difference in differences has long been popular as a non-experimental tool, especially in economics. Can somebody please provide a clear and non-technical answer to the following questions about difference-in-differences. What is a…

regression econometrics difference-in-difference

asked Jul 23 '10 at 16:57

Graham Cookson

7,543
6
41
35

51

votes

8 answers

Statistical tests when sample size is 1

I'm a high school math teacher who is a bit stumped. A Biology student came to me with his experiment wanting to know what kind of statistical analysis he can do with his data (yes, he should have decided that BEFORE the experiment, but I wasn't…

hypothesis-testing estimation experiment-design

asked Apr 28 '20 at 02:56

Brent Parker

621
3
5

51

votes

3 answers

What are the values p, d, q, in ARIMA?

In the arima function in R, what does order(1, 0, 12) mean? What are the values that can be assigned to p, d, q, and what is the process to find those values?

r time-series arima

asked Dec 03 '12 at 13:29

kalyani

589
1
5
4

51

votes

9 answers

Does anyone know any good open source software for visualizing data from a database?

Recently I came across Tableau and tried to visualize data from a database and a CSV file. The user interface enables the user to visualize time and spatial data and create plots in an instant. Such a tool is really useful as it enables one to observe the…

data-visualization software interactive-visualization

asked Nov 22 '12 at 16:28

niko

1,261
3
15
18

51

votes

11 answers

Famous easy to understand examples of a confounding variable invalidating a study

Are there any well-known statistical studies that were originally published and thought to be valid, but later had to be thrown out due to a confounding variable that wasn't taken into account? I'm looking for something easy to understand that…

experiment-design confounding observational-study paradox

asked Oct 23 '19 at 23:25

NathanLite

581
5
5

51

votes

3 answers

Consider the sum of $n$ uniform distributions on $[0,1]$, or $Z_n$. Why does the cusp in the PDF of $Z_n$ disappear for $n \geq 3$?

I've been wondering about this one for a while; I find it a little weird how abruptly it happens. Basically, why do we need just three uniforms for $Z_n$ to smooth out like it does? And why does the smoothing-out happen so relatively…

normal-distribution mathematical-statistics uniform-distribution central-limit-theorem

asked Oct 30 '12 at 00:09

tetragrammaton

1,336
2
12
13
