Most Popular

1500 questions
35
votes
2 answers

Why is lambda "within one standard error from the minimum" a recommended value for lambda in an elastic net regression?

I understand what role lambda plays in an elastic-net regression, and I can understand why one would select lambda.min, the value of lambda that minimizes cross-validated error. My question is: where in the statistics literature is it recommended to…
35
votes
2 answers

How to use both binary and continuous variables together in clustering?

I need to use binary variables (values 0 & 1) in k-means. But k-means only works with continuous variables. I know some people still use these binary variables in k-means ignoring the fact that k-means is only designed for continuous variables. This…
GeorgeOfTheRF
  • 5,063
  • 14
  • 42
  • 51
35
votes
2 answers

Is there a boxplot variant for Poisson distributed data?

I'd like to know if there is a boxplot variant adapted to Poisson distributed data (or possibly other distributions)? With a Gaussian distribution, whiskers placed at L = Q1 - 1.5 IQR and U = Q3 + 1.5 IQR, the boxplot has the property that there…
caas
  • 535
  • 1
  • 4
  • 7
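The Gaussian whisker rule quoted in the excerpt above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `fraction_outside_fences` is made up for illustration. It computes how much probability mass of a normal distribution falls outside L = Q1 - 1.5 IQR and U = Q3 + 1.5 IQR, which comes out at roughly 0.7%.

```python
from statistics import NormalDist


def fraction_outside_fences(dist=NormalDist(), k=1.5):
    """Probability mass outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1 = dist.inv_cdf(0.25)
    q3 = dist.inv_cdf(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Mass below the lower fence plus mass above the upper fence.
    return dist.cdf(lower) + (1.0 - dist.cdf(upper))


outside = fraction_outside_fences()
print(f"Fraction of a Gaussian outside the whiskers: {outside:.2%}")
```

For a skewed, discrete distribution such as the Poisson, the same fences would not give the same coverage, which is presumably what motivates the question.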
35
votes
2 answers

Should we address multiple comparisons adjustments when using confidence intervals?

Suppose we have a multiple comparisons scenario such as post hoc inference on pairwise statistics, or like a multiple regression, where we are making a total of $m$ comparisons. Suppose also, that we would like to support inference in these…
Alexis
  • 26,219
  • 5
  • 78
  • 131
35
votes
13 answers

What statistical blogs would you recommend?

What statistical research blogs would you recommend, and why?
csgillespie
  • 11,849
  • 9
  • 56
  • 85
35
votes
8 answers

In Naive Bayes, why bother with Laplace smoothing when we have unknown words in the test set?

I was reading over Naive Bayes Classification today. I read, under the heading of Parameter Estimation with add 1 smoothing: Let $c$ refer to a class (such as Positive or Negative), and let $w$ refer to a token or word. The maximum likelihood…
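As a rough illustration of the add-1 smoothing the excerpt refers to, here is a minimal sketch assuming the usual estimate P(w | c) = (count(w, c) + 1) / (count(c) + |V|), where |V| is the vocabulary size. The function name and the toy data are hypothetical. Note that it only covers words that are in the vocabulary but unseen in a class, not words missing from the vocabulary altogether.

```python
from collections import Counter


def laplace_word_probs(tokens_in_class, vocabulary):
    """Add-one smoothed estimates of P(w | c) for every w in the vocabulary."""
    counts = Counter(tokens_in_class)
    total = sum(counts[w] for w in vocabulary)
    denom = total + len(vocabulary)
    return {w: (counts[w] + 1) / denom for w in vocabulary}


# Hypothetical toy data: "bad" and "plot" never occur in the positive class.
vocab = ["good", "bad", "movie", "plot"]
positive_tokens = ["good", "good", "movie"]
probs = laplace_word_probs(positive_tokens, vocab)
print(probs)
```

Without the +1, the unseen words would get probability zero and wipe out the whole product for any document containing them.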
35
votes
3 answers

Is it possible to change a hypothesis to match observed data (aka fishing expedition) and avoid an increase in Type I errors?

It is well known that researchers should spend time observing and exploring existing data and research before forming a hypothesis and then collecting data to test that hypothesis (referring to null-hypothesis significance testing). Many basic…
post-hoc
  • 677
  • 1
  • 6
  • 14
34
votes
4 answers

Checking if two Poisson samples have the same mean

This is an elementary question, but I wasn't able to find the answer. I have two measurements: n1 events in time t1 and n2 events in time t2, both produced (say) by Poisson processes with possibly-different lambda values. This is actually from a…
Charles
  • 1,068
  • 1
  • 7
  • 14
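One standard way to compare two Poisson counts is a conditional test: if both rates are equal, then given the total n1 + n2, the count n1 is binomial with success probability t1 / (t1 + t2). The sketch below is one possible implementation of that idea, not necessarily what the answers recommend; the function name and the "sum of outcomes no more likely than the observed one" definition of the two-sided p-value are assumptions.

```python
from math import comb


def poisson_rate_test(n1, t1, n2, t2):
    """Exact two-sided p-value for H0: both Poisson rates are equal.

    Conditions on the total count, so n1 ~ Binomial(n1 + n2, t1 / (t1 + t2)).
    """
    n = n1 + n2
    p = t1 / (t1 + t2)
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[n1]
    # Sum the probabilities of all outcomes at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


print(poisson_rate_test(5, 1.0, 5, 1.0))   # equal counts over equal times
print(poisson_rate_test(20, 1.0, 0, 1.0))  # very unequal counts
```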
34
votes
3 answers

How can I interpret a confusion matrix?

I am using a confusion matrix to check the performance of my classifier. I am using scikit-learn and I am a little bit confused. How can I interpret the result from the following? from sklearn.metrics import confusion_matrix >>> y_true = [2, 0, 2, 2, 0, 1] >>> y_pred…
user3378649
  • 1,107
  • 4
  • 13
  • 22
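To make the layout concrete, here is a small pure-Python sketch that builds the matrix by hand, following scikit-learn's convention that rows are true labels and columns are predicted labels. The `y_true` list is the one from the excerpt; the `y_pred` list is a hypothetical stand-in, since the excerpt is cut off before its values.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, labels):
    """Matrix m where m[i][j] counts items with true label i predicted as j."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]


y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]  # hypothetical predictions, not from the question
matrix = confusion_counts(y_true, y_pred, labels=[0, 1, 2])
for row in matrix:
    print(row)
# [2, 0, 0]
# [0, 0, 1]
# [1, 0, 2]
```

Reading the last row: of the three items whose true class is 2, one was predicted as class 0 and two were predicted correctly. The diagonal holds the correct predictions.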
34
votes
6 answers

Difference between Bayes network, neural network, decision tree and Petri nets

What is the difference between a neural network, a Bayesian network, a decision tree and Petri nets, given that they are all graphical models and visually depict cause-effect relationships?
Ria George
  • 1,375
  • 2
  • 14
  • 31
34
votes
2 answers

How to derive the standard error of linear regression coefficient

For this univariate linear regression model $$y_i = \beta_0 + \beta_1x_i+\epsilon_i$$ given data set $D=\{(x_1,y_1),...,(x_n,y_n)\}$, the coefficient estimates are $$\hat\beta_1=\frac{\sum_ix_iy_i-n\bar x\bar y}{\sum_ix_i^2-n\bar x^2}$$…
avocado
  • 3,045
  • 5
  • 32
  • 45
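For orientation, the quantity being asked about can be computed directly. The sketch below assumes the textbook result for simple linear regression with homoscedastic errors, se(beta1_hat) = sqrt(s^2 / sum((x_i - x_bar)^2)) with s^2 = RSS / (n - 2), and writes the slope in its centered form. The function name and the toy data are made up for illustration.

```python
from math import sqrt


def simple_ols(x, y):
    """Return (intercept, slope, standard error of the slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    # Residual variance estimate with n - 2 degrees of freedom.
    rss = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    return beta0, beta1, sqrt(sigma2_hat / s_xx)


beta0, beta1, se_beta1 = simple_ols([1, 2, 3], [1, 2, 4])
print(beta0, beta1, se_beta1)
```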
34
votes
3 answers

Does a sample version of the one-sided Chebyshev inequality exist?

I am interested in the following one-sided Cantelli's version of the Chebyshev inequality: $$ \mathbb P(X - \mathbb E (X) \geq t) \leq \frac{\mathrm{Var}(X)}{\mathrm{Var}(X) + t^2} \,. $$ Basically, if you know the population mean and variance, you…
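As a quick sanity check of the population version of the inequality above, the following sketch compares the exact upper tail of a fair six-sided die with the Cantelli bound, using exact fractions. The die is just a convenient example and the helper names are made up. It does not address the sample version the question asks about.

```python
from fractions import Fraction


def cantelli_bound(variance, t):
    """Upper bound on P(X - E[X] >= t) for t > 0."""
    return variance / (variance + t * t)


outcomes = range(1, 7)  # fair six-sided die
mean = Fraction(sum(outcomes), 6)
variance = sum((x - mean) ** 2 for x in outcomes) / 6


def upper_tail(t):
    """Exact P(X - E[X] >= t) for the die."""
    return Fraction(sum(1 for x in outcomes if x - mean >= t), 6)


for t in (Fraction(1, 2), Fraction(3, 2), Fraction(5, 2)):
    print(t, upper_tail(t), cantelli_bound(variance, t))
```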
34
votes
2 answers

Am I creating bias by using the same random seed over and over?

In almost all of the analysis work that I've ever done I use: set.seed(42) It's an homage to Hitchhiker's Guide to the Galaxy. But I'm wondering if I'm creating bias by using the same seed over and over.
Brandon Bertelsen
  • 6,672
  • 9
  • 35
  • 46
34
votes
5 answers

Data "exploration" vs data "snooping"/"torturing"?

Many times I have come across informal warnings against "data snooping" (here's one amusing example), and I think I have an intuitive idea of roughly what that means, and why it may be a problem. On the other hand, "exploratory data analysis" seems…
kjo
  • 1,817
  • 1
  • 16
  • 24
34
votes
5 answers

How to sample from a discrete distribution?

Assume I have a distribution governing the possible outcome from a single random variable X. This is something like [0.1, 0.4, 0.2, 0.3] for X being a value of either 1, 2, 3, 4. Is it possible to sample from this distribution, i.e. generate pseudo…
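One common approach is inverse-CDF sampling: draw a uniform number and find the first outcome whose cumulative probability exceeds it. The sketch below is a minimal version for the distribution in the excerpt; the helper name `make_sampler` is made up, and the fixed seed is only there to make the run repeatable.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def make_sampler(values, probs, rng=None):
    """Return a function that draws one value according to probs."""
    rng = rng or random.Random()
    cumulative = list(accumulate(probs))

    def draw():
        # Scale by the total so probs need not sum to exactly 1.0.
        u = rng.random() * cumulative[-1]
        idx = bisect_right(cumulative, u)
        # Guard against floating-point rounding at the top end.
        return values[min(idx, len(values) - 1)]

    return draw


draw = make_sampler([1, 2, 3, 4], [0.1, 0.4, 0.2, 0.3], random.Random(0))
samples = [draw() for _ in range(20000)]
freqs = {v: samples.count(v) / len(samples) for v in (1, 2, 3, 4)}
print(freqs)
```

In Python the standard library already covers this case with `random.choices(values, weights=probs, k=n)`; the hand-written version only shows what happens underneath.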