Most Popular
1500 questions
61
votes
6 answers
Is the "hybrid" between Fisher and Neyman-Pearson approaches to statistical testing really an "incoherent mishmash"?
There exists a certain school of thought according to which the most widespread approach to statistical testing is a "hybrid" between two approaches: that of Fisher and that of Neyman-Pearson; these two approaches, the claim goes, are "incompatible"…

amoeba
- 93,463
- 28
- 275
- 317
60
votes
3 answers
Testing equality of coefficients from two different regressions
This seems to be a basic issue, but I just realized that I actually don't know how to test equality of coefficients from two different regressions. Can anyone shed some light on this?
More formally, suppose I ran the following two regressions:…

coffeinjunky
- 1,646
- 1
- 16
- 22
60
votes
5 answers
Is it important to scale data before clustering?
I found this tutorial, which suggests that you should run the scale function on features before clustering (I believe that it converts data to z-scores).
I'm wondering whether that is necessary. I'm asking mostly because there's a nice elbow point…

Jeremy
- 1,259
- 3
- 12
- 17
60
votes
5 answers
How to calculate pseudo-$R^2$ from R's logistic regression?
Christopher Manning's writeup on logistic regression in R fits a model as follows:
ced.logr <- glm(ced.del ~ cat + follows + factor(class),
family=binomial)
Some output:
> summary(ced.logr)
Call:
glm(formula = ced.del ~ cat +…

dfrankow
- 2,816
- 6
- 30
- 39
60
votes
5 answers
Is every covariance matrix positive definite?
I guess the answer should be yes, but I still feel something is not right. There should be some general results in the literature; could anyone help me?

Jingjings
- 1,173
- 1
- 9
- 13
60
votes
46 answers
Most famous statisticians
What are the most important statisticians, and what is it that made them famous?
(Reply just one scientist per answer please.)

mariana soffer
- 1,091
- 2
- 15
- 18
60
votes
3 answers
Linear model with log-transformed response vs. generalized linear model with log link
In this paper titled "CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA" the authors write:
In a generalized linear model, the mean is transformed, by the link
function, instead of transforming the response itself. The two methods
…

miura
- 3,364
- 3
- 21
- 27
60
votes
1 answer
Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?
I'm trying to predict a binary outcome using 50 continuous explanatory variables (the range of most of the variables is $-\infty$ to $\infty$). My data set has almost 24,000 rows. When I run glm in R, I get:
Warning messages:
1: glm.fit: algorithm…

Dcook
- 733
- 1
- 7
- 8
60
votes
5 answers
How does one interpret SVM feature weights?
I am trying to interpret the variable weights given by fitting a linear SVM.
(I'm using scikit-learn):
from sklearn import svm
clf = svm.SVC(kernel='linear')  # named clf so the imported svm module is not rebound
clf.fit(features, labels)
clf.coef_
I cannot find anything in the documentation that…

Austin Richardson
- 928
- 1
- 8
- 10
60
votes
4 answers
How to generate correlated random numbers (given means, variances and degree of correlation)?
I'm sorry if this seems a bit too basic, but I guess I'm just looking to confirm understanding here. I get the sense I'd have to do this in two steps, and I've started trying to grok correlation matrices, but it's just starting to seem really…

Joseph Weissman
- 703
- 1
- 6
- 7
60
votes
7 answers
Why is the regularization term *added* to the cost function (instead of multiplied etc.)?
When regularization is used, the penalty term is usually added to the cost function, as in the following example.
$$
J(\theta)=\frac 1 2(y-\theta X^T)(y-\theta X^T)^T+\alpha\|\theta\|_2^2
$$
This makes intuitive sense to me since minimize the cost…

grenmester
- 725
- 1
- 6
- 5
60
votes
5 answers
Is adjusting p-values in a multiple regression for multiple comparisons a good idea?
Let's assume you are a social science researcher/econometrician trying to find relevant predictors of demand for a service. You have 2 outcome/dependent variables describing the demand (using the service yes/no, and the number of occasions). You have…

Mikael M
- 703
- 1
- 6
- 6
60
votes
5 answers
What are the advantages of the Wasserstein metric compared to the Kullback-Leibler divergence?
What is the practical difference between the Wasserstein metric and the Kullback-Leibler divergence? The Wasserstein metric is also referred to as the Earth mover's distance.
From Wikipedia:
Wasserstein (or Vaserstein) metric is a distance function defined between…

Thomas Fauskanger
- 703
- 1
- 6
- 5
60
votes
2 answers
What is the relationship between a chi squared test and test of equal proportions?
Suppose that I have three populations with four mutually exclusive characteristics. I take random samples from each population and construct a crosstab or frequency table for the characteristics that I am measuring. Am I correct in saying…

hgcrpd
- 1,307
- 2
- 11
- 13
60
votes
5 answers
Backpropagation with Softmax / Cross Entropy
I'm trying to understand how backpropagation works for a softmax/cross-entropy output layer.
The cross entropy error function is
$$E(t,o)=-\sum_j t_j \log o_j$$
with $t$ and $o$ as the target and output at neuron $j$, respectively. The sum is over…

micha
- 703
- 1
- 6
- 5