Most Popular

1500 questions
61
votes
6 answers

Is the "hybrid" between Fisher and Neyman-Pearson approaches to statistical testing really an "incoherent mishmash"?

There exists a certain school of thought according to which the most widespread approach to statistical testing is a "hybrid" between two approaches: that of Fisher and that of Neyman-Pearson; these two approaches, the claim goes, are "incompatible"…
60
votes
3 answers

Testing equality of coefficients from two different regressions

This seems to be a basic issue, but I just realized that I actually don't know how to test equality of coefficients from two different regressions. Can anyone shed some light on this? More formally, suppose I ran the following two regressions:…
coffeinjunky
60
votes
5 answers

Is it important to scale data before clustering?

I found this tutorial, which suggests that you should run the scale function on features before clustering (I believe that it converts data to z-scores). I'm wondering whether that is necessary. I'm asking mostly because there's a nice elbow point…
Jeremy
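
As a rough illustration of what the excerpt above describes, here is a minimal Python sketch of column-wise z-scoring. It assumes the tutorial's scale function behaves like R's default `scale()` (center each column on its mean, divide by the sample standard deviation); the helper name `zscore_columns` and the toy data are hypothetical, not taken from the question.

```python
from statistics import mean, stdev


def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1.

    Assumption: this mirrors R's scale() defaults (center=TRUE, scale=TRUE),
    which divide by the n-1 standard deviation.
    """
    cols = list(zip(*rows))            # transpose rows -> columns
    mus = [mean(c) for c in cols]      # per-column means
    sds = [stdev(c) for c in cols]     # per-column sample standard deviations
    return [
        [(x - mu) / sd for x, mu, sd in zip(row, mus, sds)]
        for row in rows
    ]


# Hypothetical toy data: two features on very different scales.
data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = zscore_columns(data)
```

After scaling, both features contribute on a comparable scale to Euclidean distances, which is the usual motivation for standardizing before distance-based clustering.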
60
votes
5 answers

How to calculate pseudo-$R^2$ from R's logistic regression?

Christopher Manning's writeup on logistic regression in R shows a logistic regression in R as follows: ced.logr <- glm(ced.del ~ cat + follows + factor(class), family=binomial) Some output: > summary(ced.logr) Call: glm(formula = ced.del ~ cat +…
dfrankow
60
votes
5 answers

Is every covariance matrix positive definite?

I guess the answer should be yes, but I still feel something is not right. There should be some general results in the literature; could anyone help me?
Jingjings
60
votes
46 answers

Most famous statisticians

What are the most important statisticians, and what is it that made them famous? (Reply just one scientist per answer please.)
mariana soffer
60
votes
3 answers

Linear model with log-transformed response vs. generalized linear model with log link

In this paper titled "CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA" the authors write: In a generalized linear model, the mean is transformed, by the link function, instead of transforming the response itself. The two methods …
60
votes
1 answer

Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?

I'm trying to predict a binary outcome using 50 continuous explanatory variables (the range of most of the variables is $-\infty$ to $\infty$). My data set has almost 24,000 rows. When I run glm in R, I get: Warning messages: 1: glm.fit: algorithm…
Dcook
60
votes
5 answers

How does one interpret SVM feature weights?

I am trying to interpret the variable weights given by fitting a linear SVM. (I'm using scikit-learn): from sklearn import svm svm = svm.SVC(kernel='linear') svm.fit(features, labels) svm.coef_ I cannot find anything in the documentation that…
Austin Richardson
60
votes
4 answers

How to generate correlated random numbers (given means, variances and degree of correlation)?

I'm sorry if this seems a bit too basic, but I guess I'm just looking to confirm understanding here. I get the sense I'd have to do this in two steps, and I've started trying to grok correlation matrices, but it's just starting to seem really…
60
votes
7 answers

Why is the regularization term *added* to the cost function (instead of multiplied etc.)?

Whenever regularization is used, it is often added to the cost function, as in the following: $$ J(\theta)=\frac 1 2(y-\theta X^T)(y-\theta X^T)^T+\alpha\|\theta\|_2^2 $$ This makes intuitive sense to me since minimize the cost…
grenmester
60
votes
5 answers

Is adjusting p-values in a multiple regression for multiple comparisons a good idea?

Let's assume you are a social science researcher/econometrician trying to find relevant predictors of demand for a service. You have 2 outcome/dependent variables describing the demand (using the service yes/no, and the number of occasions). You have…
60
votes
5 answers

What are the advantages of the Wasserstein metric compared to Kullback-Leibler divergence?

What is the practical difference between the Wasserstein metric and the Kullback-Leibler divergence? The Wasserstein metric is also referred to as the earth mover's distance. From Wikipedia: Wasserstein (or Vaserstein) metric is a distance function defined between…
60
votes
2 answers

What is the relationship between a chi squared test and test of equal proportions?

Suppose that I have three populations with four mutually exclusive characteristics. I take random samples from each population and construct a crosstab or frequency table for the characteristics that I am measuring. Am I correct in saying…
hgcrpd
60
votes
5 answers

Backpropagation with Softmax / Cross Entropy

I'm trying to understand how backpropagation works for a softmax/cross-entropy output layer. The cross entropy error function is $$E(t,o)=-\sum_j t_j \log o_j$$ with $t$ and $o$ as the target and output at neuron $j$, respectively. The sum is over…
micha
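
For the softmax/cross-entropy excerpt above, the following Python sketch may help fix ideas. It assumes the outputs $o_j$ come from a softmax over pre-activations (called `z` here, a name not used in the question) and that the targets $t_j$ sum to 1; under those assumptions the gradient of $E$ with respect to $z_j$ reduces to $o_j - t_j$, which the sketch checks against a finite-difference estimate. All function names are illustrative.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of pre-activations."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(t, o):
    """E(t, o) = -sum_j t_j * log(o_j); terms with t_j == 0 contribute nothing."""
    return -sum(tj * math.log(oj) for tj, oj in zip(t, o) if tj > 0)


def grad_wrt_logits(t, z):
    """Analytic gradient dE/dz_j = o_j - t_j (assumes sum(t) == 1)."""
    o = softmax(z)
    return [oj - tj for oj, tj in zip(o, t)]


def numerical_grad(t, z, eps=1e-6):
    """Central finite-difference estimate of dE/dz, used only as a check."""
    grads = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        e_plus = cross_entropy(t, softmax(z_plus))
        e_minus = cross_entropy(t, softmax(z_minus))
        grads.append((e_plus - e_minus) / (2 * eps))
    return grads


# Hypothetical example: three classes, one-hot target on the second class.
t = [0.0, 1.0, 0.0]
z = [0.5, -1.0, 2.0]
analytic = grad_wrt_logits(t, z)
numeric = numerical_grad(t, z)
```

The two gradients should agree to within finite-difference error, which is a quick sanity check before wiring the expression into a backpropagation routine.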