Most Popular

1500 questions
37
votes
3 answers

Why is a likelihood-ratio test distributed chi-squared?

Why is the test statistic of a likelihood ratio test distributed chi-squared? $2(\ln \text{ L}_{\rm alt\ model} - \ln \text{ L}_{\rm null\ model} ) \sim \chi^{2}_{df_{\rm alt}-df_{\rm null}}$
Dr. Beeblebrox
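One case where the statistic in this question can be checked by hand is a normal sample with known variance, testing a fixed mean against a free one. There the statistic equals the squared z-score, so it is exactly chi-squared with one degree of freedom; the general statement (Wilks' theorem) holds only asymptotically. The sketch below is an illustration under those assumptions, with made-up data and hypothetical function names, not part of the original question.

```python
import math

def normal_loglik(xs, mu, sigma):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations (sigma known)."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))

def lr_statistic(xs, mu0, sigma):
    """2 * (lnL_alt - lnL_null) for H0: mu = mu0 against a free mean."""
    mu_hat = sum(xs) / len(xs)  # MLE of the mean under the alternative
    return 2 * (normal_loglik(xs, mu_hat, sigma) - normal_loglik(xs, mu0, sigma))

xs = [0.3, -1.2, 0.8, 1.5, -0.4]  # made-up data for illustration
mu0, sigma = 0.0, 1.0

stat = lr_statistic(xs, mu0, sigma)
z = math.sqrt(len(xs)) * (sum(xs) / len(xs) - mu0) / sigma

# The two numbers agree up to floating-point rounding: the LR statistic is z^2.
print(stat, z ** 2)
```

Here the alternative has one free parameter and the null has none, which matches the degrees-of-freedom difference in the formula above.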
37
votes
6 answers

Is there a name for the opposite of the gambler's fallacy?

The gambler's fallacy is a fallacy because of the assumed probability and the independence of the events. However, if, after flipping a coin 100 times and obtaining heads each time, I still believe the probability of obtaining tails to be 0.5, am I…
Igor F.
37
votes
3 answers

Explanation of finite population correction factor?

I understand that when sampling from a finite population and our sample size is more than 5% of the population, we need to make a correction on the sample's mean and standard error using this formula: $FPC=\sqrt{\frac{N-n}{N-1}}$ Where…
Sara
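The correction factor quoted in this question can be sketched in a few lines. The function names `fpc` and `corrected_standard_error` are hypothetical, and the sketch assumes the usual convention that the factor multiplies the standard error of the mean, with `N` the population size, `n` the sample size, and `s` the sample standard deviation.

```python
import math

def fpc(N, n):
    """Finite population correction sqrt((N - n) / (N - 1))."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return math.sqrt((N - n) / (N - 1))

def corrected_standard_error(s, n, N):
    """Standard error of the mean, s / sqrt(n), scaled by the FPC."""
    return s / math.sqrt(n) * fpc(N, n)

# A sample of 100 from a population of 1000 (10%, above the 5% rule of thumb).
print(fpc(1000, 100))
print(corrected_standard_error(s=12.0, n=100, N=1000))
```

The factor is 1 when `n` is 1 and falls to 0 when the whole population is sampled, which is why it only matters once the sample is a noticeable fraction of the population.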
37
votes
4 answers

Intuitive explanation of Kolmogorov Smirnov Test

What is the cleanest, easiest way to explain the concept of the Kolmogorov-Smirnov test to someone? What does it intuitively mean? It's a concept that I have difficulty articulating, especially when explaining it to someone. Can someone please explain it…
37
votes
5 answers

What is a good use of the 'comment' function in R?

I just discovered the comment function in R. Example: x <- matrix(1:12, 3,4) comment(x) <- c("This is my very important data from experiment #0234", "Jun 5, 1998") x comment(x) This is the first time I came by this function and was…
Tal Galili
37
votes
2 answers

Dealing with singular fit in mixed models

Let's say we have a model mod <- Y ~ X*Condition + (X*Condition|subject) # Y = logit variable # X = continuous variable # Condition = values A and B, dummy coded; the design is repeated # so all participants go through both…
User33268
37
votes
7 answers

What is the minimum recommended number of groups for a random effects factor?

I'm using a mixed model in R (lme4) to analyze some repeated measures data. I have a response variable (fiber content of feces) and 3 fixed effects (body mass, etc.). My study only has 6 participants, with 16 repeated measures for each one (though…
Chris
37
votes
3 answers

Difference between generalized linear models & generalized linear mixed models

I am wondering what the differences are between mixed and unmixed GLMs. For instance, in SPSS the drop down menu allows users to fit either: analyze-> generalized linear models-> generalized linear models & analyze-> mixed models-> generalized…
37
votes
5 answers

Timing functions in R

I would like to measure the time that it takes to repeat the running of a function. Are replicate() and using for-loops equivalent? For example: system.time(replicate(1000, f())); system.time(for(i in 1:1000){f()}); Which is the preferred…
Tim
37
votes
3 answers

Linearity of PCA

PCA is considered a linear procedure, however: $$\mathrm{PCA}(X)\neq \mathrm{PCA}(X_1)+\mathrm{PCA}(X_2)+\ldots+\mathrm{PCA}(X_n),$$ where $X=X_1+X_2+\ldots+X_n$. This is to say that the eigenvectors obtained by the PCAs on the data matrices $X_i$…
AlphaOmega
37
votes
10 answers

What are the most useful sources of economics data?

When doing research in economics, one frequently needs to verify theoretical conclusions on real data. What are reliable data sources to use and cite? I am mainly interested in sources that provide various statistical data such as GDP, population,…
Karel Petranek
37
votes
3 answers

How to estimate shrinkage parameter in Lasso or ridge regression with >50K variables?

I want to use Lasso or ridge regression for a model with more than 50,000 variables. I want to do so using a software package in R. How can I estimate the shrinkage parameter ($\lambda$)? Edits: Here is the point I got up to: set.seed(123) Y <- runif…
John
37
votes
4 answers

How does one measure the non-uniformity of a distribution?

I'm trying to come up with a metric for measuring non-uniformity of a distribution for an experiment I'm running. I have a random variable that should be uniformly distributed in most cases, and I'd like to be able to identify (and possibly measure…
JJC
37
votes
4 answers

Why does logistic regression become unstable when classes are well-separated?

Why is it that logistic regression becomes unstable when classes are well-separated? What does "well-separated classes" mean? I would really appreciate it if someone could explain with an example.
Jane Dow
37
votes
2 answers

Quantile regression: Loss function

I am trying to understand quantile regression, but one thing that troubles me is the choice of the loss function: $\rho_\tau(u) = u(\tau-1_{\{u<0\}})$. I know that the minimum of the expectation of $\rho_\tau(y-u)$ is equal to the…
CDO
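The loss $\rho_\tau$ written in this question can be tried out numerically. The sketch below is only an illustration with made-up data and hypothetical function names: it evaluates the average loss at each candidate value and picks the smallest, which for $\tau = 0.5$ on this sample lands on the median.

```python
def pinball_loss(u, tau):
    """rho_tau(u) = u * (tau - 1{u < 0}), the quantile (check) loss."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def mean_loss(ys, q, tau):
    """Average check loss of the residuals y - q over the sample."""
    return sum(pinball_loss(y - q, tau) for y in ys) / len(ys)

ys = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up sample with one large outlier

# Search over the observed values for the candidate with the smallest loss.
best = min(ys, key=lambda q: mean_loss(ys, q, 0.5))
print(best)  # the sample median, 3.0, unaffected by the outlier
```

Residuals below zero are weighted by $1 - \tau$ and residuals above zero by $\tau$, so changing `tau` shifts the minimizer toward lower or higher values of the sample.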