Most Popular

1500 questions
32
votes
4 answers

Why is the exponential family so important in statistics?

Why is the exponential family so important in statistics? I was recently reading about the exponential family within statistics. As far as I understand, the exponential family refers to any probability distribution function that can be written in…
stats_noob
32
votes
3 answers

How can a statistician who has the data for a non-normal distribution guess better than one who only has the mean?

Let's say we have a game with two players. Both of them know that five samples are drawn from some distribution (not normal). Neither of them knows the parameters of the distribution used to generate the data. The goal of the game is to estimate the…
ryu576
32
votes
3 answers

Student t as mixture of gaussian

Using the Student t-distribution with $k > 0$ degrees of freedom, location parameter $l$ and scale parameter $s$ having density $$\frac{\Gamma \left(\frac{k+1}{2}\right)}{\Gamma\left(\frac{k}{2}\right)\sqrt{k \pi s^2}} \left\{ 1 + k^{-1}\left(…
32
votes
5 answers

What can cause PCA to worsen results of a classifier?

I have a classifier that I'm doing cross-validation on, along with a hundred or so features that I'm doing forward selection on to find optimal combinations of features. I also compare this against running the same experiments with PCA, where I take…
Dolan Antenucci
32
votes
5 answers

How can you account for COVID-19 in your models?

How are you dealing with the coronavirus "event" in your machine learning models? Let's say you used to predict the number of sales each month. The virus affected your results last year and it will keep affecting them for at least a couple of months. So your…
32
votes
7 answers

Absence of evidence is not evidence of absence: What does Bayesian probability have to say about it?

A famous aphorism by cosmologist Martin Rees(*) goes "absence of evidence is not evidence of absence". On the other hand, quoting Wikipedia: In carefully designed scientific experiments, even null results can be evidence of absence. For instance, a…
rasmodius
32
votes
1 answer

Best factor extraction methods in factor analysis

SPSS offers several methods of factor extraction: principal components (which isn't factor analysis at all); unweighted least squares; generalized least squares; maximum likelihood; principal axis; alpha factoring; image factoring. Ignoring the first…
Placidia
32
votes
13 answers

If R were reprogrammed from scratch today, what changes would be most useful to the statistics community?

Many people in the statistics community and other academic fields use R as their primary language for data analysis and statistical computing. It is a wonderful and versatile language that has become extremely popular across both academic and…
Ben
32
votes
6 answers

Under which assumptions can a regression be interpreted causally?

First, don't panic. Yes, there are many similar questions on this site. But I believe none gives a conclusive answer to the question below. Please bear with me. Consider a data generation process $\text{D}_X(x_1, ... , x_n|\theta)$, where…
luchonacho
32
votes
3 answers

Why do transformers use layer norm instead of batch norm?

Both batch norm and layer norm are common normalization techniques for neural network training. I am wondering why transformers primarily use layer norm.
SantoshGupta7
32
votes
3 answers

How to calculate goodness of fit in glm (R)

I have the following result from running the glm function. How can I interpret the following values: null deviance, residual deviance, and AIC? Do they have something to do with the goodness of fit? Can I calculate some goodness of fit measure from these…
learner
32
votes
3 answers

What is the difference between EM and Gradient Ascent?

What is the difference between the algorithms EM (Expectation Maximization) and Gradient Ascent (or descent)? Is there any condition under which they are equivalent?
Aslan986
32
votes
2 answers

StackExchange fires a moderator, and now in response hundreds of moderators resign: is the increase in resignations statistically significant?

I am doing a study on StackExchange. The management of StackExchange has demodded (for unclear reasons) a moderator, and now the network is on fire. Currently, many moderators are resigning or suspending their activities because they are dissatisfied. I wish…
32
votes
3 answers

Why does finding small effects in large studies indicate publication bias?

Several methodological papers (e.g. Egger et al 1997a, 1997b) discuss publication bias as revealed by meta-analyses, using funnel plots such as the one below. The 1997b paper goes on to say that "if publication bias is present, it is expected…
z8080
32
votes
4 answers

Are there default functions for discrete uniform distributions in R?

Most standard distributions in R have a family of commands: pdf/pmf, cdf/cmf, quantile, and random deviates (for example, dnorm, pnorm, qnorm, rnorm). I know it's easy enough to make use of some standard commands to reproduce these functions for the…
Joseph Hsieh