Most Popular

1500 questions
156 votes, 34 answers

The Sleeping Beauty Paradox

The situation: Some researchers would like to put you to sleep. Depending on the secret toss of a fair coin, they will briefly awaken you either once (Heads) or twice (Tails). After each waking, they will put you back to sleep with a drug that…
whuber
156 votes, 1 answer

How to reverse PCA and reconstruct original variables from several principal components?

Principal component analysis (PCA) can be used for dimensionality reduction. After such dimensionality reduction is performed, how can one approximately reconstruct the original variables/features from a small number of principal…
amoeba
152 votes, 6 answers

Why are p-values uniformly distributed under the null hypothesis?

Recently, I have found in a paper by Klammer, et al. a statement that p-values should be uniformly distributed. I believe the authors, but cannot understand why it is so. Klammer, A. A., Park, C. Y., and Stafford Noble, W. (2009) Statistical…
golobor
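
A quick simulation can illustrate the claim in the question above. This is only a minimal sketch and is not taken from the Klammer et al. paper: it assumes a two-sided z-test of a true null hypothesis (mean 0, known standard deviation 1), and the function name `simulate_p_values`, the sample size, and the number of repetitions are hypothetical choices.

```python
import random
from statistics import NormalDist

def simulate_p_values(n_tests=20000, n_obs=30, seed=0):
    """Two-sided z-test p-values when the null (mean 0, sd 1) is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        # z statistic for H0: mean = 0, with the sd assumed known and equal to 1
        z = (sum(sample) / n_obs) * n_obs ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values

p_values = simulate_p_values()

# If p-values are uniform on [0, 1], about a fraction alpha falls below alpha.
for alpha in (0.05, 0.25, 0.50):
    share = sum(p < alpha for p in p_values) / len(p_values)
    print(f"share of p-values below {alpha:.2f}: {share:.3f}")
```

Under these assumptions, each printed share should land close to its threshold, which is what a uniform distribution on [0, 1] implies.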
151 votes, 25 answers

R vs SAS, why is SAS preferred by private companies?

I learned R but it seems that companies are much more interested in SAS experience. What are the advantages of SAS over R?
Benoit_Plante
150 votes, 9 answers

Objective function, cost function, loss function: are they the same thing?

In machine learning, people talk about the objective function, the cost function, and the loss function. Are they just different names for the same thing? When should each be used? If they do not always refer to the same thing, what are the differences?
Bin
149 votes, 1 answer

Crossed vs nested random effects: how do they differ and how are they specified correctly in lme4?

Here is how I have understood nested vs. crossed random effects: Nested random effects occur when a lower level factor appears only within a particular level of an upper level factor. For example, pupils within classes at a fixed point in time.…
147 votes, 15 answers

Amazon interview question—probability of 2nd interview

I got this question during an interview with Amazon: 50% of all people who receive a first interview receive a second interview; 95% of your friends that got a second interview felt they had a good first interview; 75% of your friends that DID NOT…
Rick
147 votes, 3 answers

When is R squared negative?

My understanding is that $R^2$ cannot be negative as it is the square of R. However, I ran a simple linear regression in SPSS with a single independent variable and a dependent variable. My SPSS output gives me a negative value for $R^2$. If I was to…
Anne
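
One way a negative value can arise is sketched below. This is a hedged illustration, not a reconstruction of the SPSS run in the question: it assumes $R^2$ is computed as $1 - SS_{res}/SS_{tot}$ and that the model is a regression forced through the origin (no intercept). The data and the helper name `r_squared` are made up for the example.

```python
def r_squared(ys, predictions):
    """R^2 computed as 1 - SS_res / SS_tot (an assumed definition)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up data with a large intercept and a negative trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 9.0, 8.0, 7.0, 6.0]

# Least-squares slope for a line forced through the origin (no intercept).
slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
predictions = [slope * x for x in xs]

r2_no_intercept = r_squared(ys, predictions)
print(f"R^2 without intercept: {r2_no_intercept:.2f}")
```

Under this definition, $R^2$ drops below zero whenever the fitted values predict worse than the plain mean of the response, which is the case here.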
146 votes, 9 answers

Does causation imply correlation?

Correlation does not imply causation, as there could be many explanations for the correlation. But does causation imply correlation? Intuitively, I would think that the presence of causation means there is necessarily some correlation. But my…
Matthew
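
A small numeric case may help frame the question above. It is a sketch under made-up assumptions: `ys` is fully determined by `xs` through $y = x^2$, the `xs` values are symmetric around zero, and "correlation" is read as Pearson's linear correlation. The helper `pearson_r` is written out here only for illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# y depends on x deterministically, but not linearly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"Pearson r between x and x^2: {r:.3f}")
```

In this constructed case the linear correlation vanishes even though one variable completely determines the other.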
146 votes, 6 answers

Correlations with unordered categorical variables

I have a dataframe with many observations and many variables. Some of them are categorical (unordered) and the others are numerical. I'm looking for associations between these variables. I've been able to compute correlation for numerical variables…
Clément F
145 votes, 5 answers

How to choose between Pearson and Spearman correlation?

How do I know when to choose between Spearman's $\rho$ and Pearson's $r$? My variable includes satisfaction and the scores were interpreted using the sum of the scores. However, these scores could also be ranked.
user3636
145 votes, 3 answers

Help me understand Bayesian prior and posterior distributions

In a group of students, there are 2 out of 18 that are left-handed. Find the posterior distribution of left-handed students in the population assuming uninformative prior. Summarize the results. According to the literature 5-20% of people are…
Bob
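
The arithmetic behind the exercise above can be sketched as follows. This assumes the "uninformative prior" is read as a uniform Beta(1, 1) prior on the proportion (other choices, such as a Jeffreys prior, would give slightly different numbers) and a binomial likelihood. The function name `beta_binomial_posterior` and the Monte Carlo interval are illustrative choices, not part of the original question.

```python
import random

def beta_binomial_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + (trials - successes)

# 2 left-handed students out of 18, uniform Beta(1, 1) prior (an assumption).
post_a, post_b = beta_binomial_posterior(successes=2, trials=18)
posterior_mean = post_a / (post_a + post_b)

# Approximate 95% equal-tailed credible interval from posterior draws.
rng = random.Random(0)
draws = sorted(rng.betavariate(post_a, post_b) for _ in range(100_000))
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]

print(f"posterior: Beta({post_a:.0f}, {post_b:.0f}), mean {posterior_mean:.3f}")
print(f"approximate 95% credible interval: ({lower:.3f}, {upper:.3f})")
```

With these assumptions the posterior is Beta(3, 17), whose mean is 3/20 = 0.15.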
143 votes, 6 answers

Pearson's or Spearman's correlation with non-normal data

I get this question frequently enough in my statistics consulting work that I thought I'd post it here. I have an answer, which is posted below, but I was keen to hear what others have to say. Question: If you have two variables that are not…
Jeromy Anglim
141 votes, 8 answers

Is Facebook coming to an end?

Recently, this paper has received a lot of attention (e.g. from WSJ). Basically, the authors conclude that Facebook will lose 80% of its members by 2017. They base their claims on an extrapolation of the SIR model, a compartmental model frequently…
141 votes, 8 answers

Why L1 norm for sparse models

I am reading books about linear regression. There are some sentences about the L1 and L2 norm. I know the formulas, but I don't understand why the L1 norm enforces sparsity in models. Can someone give a simple explanation?
Yongwei Xing