Highest Voted Questions - Statistical Analysis Stack Exchange

40

votes

2 answers

When do Poisson and negative binomial regressions fit the same coefficients?

I’ve noticed that in R, Poisson and negative binomial (NB) regressions always seem to fit the same coefficients for categorical, but not continuous, predictors. For example, here's a regression with a categorical…

regression negative-binomial-distribution poisson-regression

asked Sep 30 '13 at 21:39

half-pass

3,594
7
23
34

40

votes

4 answers

Recall and precision in classification

I have read some definitions of recall and precision, but always in the context of information retrieval. I was wondering if someone could explain this a bit more in a classification context and maybe illustrate it with some examples. Say for…

machine-learning metric

asked Jun 26 '13 at 09:22

Olivier_s_j

1,055
2
11
25

40

votes

11 answers

Open Source statistical textbooks?

There have been a few questions about statistical textbooks, such as the question Free statistical textbooks. However, I am looking for textbooks that are Open Source, for example, having a Creative Commons license. The reason is that in course…

references open-source

asked Jul 25 '10 at 14:53

Egon Willighagen

176
1
3
7

40

votes

2 answers

How to find a good fit for semi-sinusoidal model in R?

I want to assume that the sea surface temperature of the Baltic Sea is the same year after year, and then describe that with a function / linear model. The idea I had was to just input year as a decimal number (or num_months/12) and get out what the…

r regression time-series lm

asked May 31 '13 at 06:17

GaRyu

503
1
5
6

40

votes

4 answers

Polynomial regression using scikit-learn

I am trying to use scikit-learn for polynomial regression. From what I read, polynomial regression is a special case of linear regression. I was hoping that maybe one of scikit's generalized linear models can be parameterised to fit higher order…

regression machine-learning large-data polynomial scikit-learn

asked May 11 '13 at 20:00

Mihai Damian

503
1
4
6

40

votes

3 answers

What is the Wine/Water Paradox in Bayesian statistics, and what is its resolution?

I have just heard about the Wine/Water Paradox in Bayesian statistics, but didn't understand it very well (see Mikkelson 2004 for an introduction). Can you explain in simple terms what the paradox is (and why it is a paradox), why it matters for…

bayesian paradox

asked Mar 18 '21 at 22:14

user314217

40

votes

2 answers

Purpose of the link function in generalized linear model

What is the purpose of the link function as a component of the generalized linear model? Why do we need it? Wikipedia states: "It can be convenient to match the domain of the link function to the range of the distribution function's mean." What's the…

regression generalized-linear-model link-function irls

asked Jan 26 '13 at 17:03

Chris

1,169
3
12
16

40

votes

6 answers

Least-angle regression vs. lasso

Least-angle regression and the lasso tend to produce very similar regularization paths (identical except when a coefficient crosses zero.) They both can be efficiently fit by virtually identical algorithms. Is there ever any practical reason to…

regression lasso

asked Nov 18 '10 at 07:28

NPE

5,351
5
33
44

40

votes

5 answers

How to derive the least squares estimator for multiple linear regression?

In the simple linear regression case $y=\beta_0+\beta_1x$, you can derive the least squares estimator $\hat\beta_1=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{\sum(x_i-\bar x)^2}$ such that you don't have to know $\hat\beta_0$ to estimate…

regression multiple-regression generalized-linear-model linear-model

asked Dec 18 '12 at 05:32

Saber CN

739
2
7
11

40

votes

3 answers

How to determine the quality of a multiclass classifier

Given a dataset with instances $x_i$ together with $N$ classes where every instance $x_i$ belongs exactly to one class $y_i$ a multiclass classifier After the training and testing I basically have a table with the true class $y_i$ and the…

machine-learning classification multi-class

asked Nov 23 '12 at 12:46

Gerenuk

1,833
3
14
20

40

votes

3 answers

Does statistical independence mean lack of causation?

Two random variables A and B are statistically independent. That means that in the DAG of the process: $(A {\perp\!\!\!\perp} B)$ and of course $P(A|B)=P(A)$. But does that also mean that there's no front-door from B to A? Because then we should get…

independence causality bayesian-network dag

asked Jul 15 '18 at 14:39

user1834069

593
4
9

40

votes

4 answers

L1 regression estimates median whereas L2 regression estimates mean?

So I was asked a question on which central measures L1 (i.e., lasso) and L2 (i.e., ridge regression) estimated. The answer is L1=median and L2=mean. Is there any type of intuitive reasoning to this? Or does it have to be determined algebraically? If…

lasso regularization loss-functions ridge-regression

asked Aug 19 '12 at 06:16

Bstat

791
1
7
5

40

votes

4 answers

When should I use a variational autoencoder as opposed to an autoencoder?

I understand the basic structure of variational autoencoder and normal (deterministic) autoencoder and the math behind them, but when and why would I prefer one type of autoencoder to the other? All I can think about is the prior distribution of…

deep-learning autoencoders variational-bayes

asked Jan 21 '18 at 22:58

DiveIntoML

1,583
1
11
21

40

votes

3 answers

What is the rationale of the Matérn covariance function?

The Matérn covariance function is commonly used as a kernel function in Gaussian processes. It is defined as $$ {\displaystyle C_{\nu }(d)=\sigma ^{2}{\frac {2^{1-\nu }}{\Gamma (\nu )}}{\Bigg (}{\sqrt {2\nu }}{\frac {d}{\rho }}{\Bigg )}^{\nu…

spatial gaussian-process kernel-trick

asked Jan 11 '18 at 01:28

Recuerdos de la Alhambra

477
1
4
7

40

votes

1 answer

PCA objective function: what is the connection between maximizing variance and minimizing error?

The PCA algorithm can be formulated in terms of the correlation matrix (assume the data $X$ has already been normalized and we are only considering projection onto the first PC). The objective function can be written as: $$ \max_w (Xw)^T(Xw)\; \:…

pca optimization

asked Jul 12 '12 at 15:09

Cam.Davidson.Pilon

11,476
5
47
75
