Most Popular
1500 questions
61
votes
4 answers
How are regression, the t-test, and the ANOVA all versions of the general linear model?
How are they all versions of the same basic statistical method?

Amahabirsingh
61
votes
8 answers
Does it ever make sense to treat categorical data as continuous?
In answering this question on discrete and continuous data I glibly asserted that it rarely makes sense to treat categorical data as continuous.
On the face of it that seems self-evident, but intuition is often a poor guide for statistics, or at…

walkytalky
61
votes
6 answers
A chart of daily cases of COVID-19 in a Russian region looks suspiciously level to me - is this so from the statistics viewpoint?
Below is a daily chart of newly-detected COVID infections in Krasnodar Krai, a region of Russia, from April 29 to May 19. The population of the region is 5.5 million people.
I read about it and wondered - does this (relatively smooth dynamics of new…

CopperKettle
61
votes
11 answers
Examples of Bayesian and frequentist approach giving different answers
Note: I am aware of philosophical differences between Bayesian and frequentist statistics.
For example "what is the probability that the coin on the table is heads" doesn't make sense in frequentist statistics, since it has either already landed…

user541686
61
votes
10 answers
What does "Scientists rise up against statistical significance" mean? (Comment in Nature)
The title of the Comment in Nature, "Scientists rise up against statistical significance", begins with:
Valentin Amrhein, Sander Greenland, Blake McShane and more than 800 signatories call for an end to hyped claims and the dismissal of possibly…

uhoh
61
votes
11 answers
Brain teaser: How to generate 7 integers with equal probability using a biased coin that has a pr(head) = p?
This is a question I found on Glassdoor: How does one generate 7 integers with equal probability using a coin that has a $\mathbb{Pr}(\text{Head}) = p\in(0,1)$?
Basically, you have a coin that may or may not be fair, and this is the only…

Amazonian
61
votes
1 answer
Can someone explain the concept of 'exchangeability'?
I see the concept of 'exchangeability' being used in different contexts (e.g., Bayesian models) but I have never understood the term very well.
What does this concept mean?
Under what circumstances is this concept invoked and why?

sxv
61
votes
4 answers
Why is expectation the same as the arithmetic mean?
Today I came across a new topic called the Mathematical Expectation. The book I am following says that expectation is the arithmetic mean of a random variable coming from any probability distribution. But it defines expectation as the sum of product of…

pranphy
61
votes
4 answers
Standard error for the mean of a sample of binomial random variables
Suppose I'm running an experiment that can have 2 outcomes, and I'm assuming that the underlying "true" distribution of the 2 outcomes is a binomial distribution with parameters $n$ and $p$: ${\rm Binomial}(n, p)$.
I can compute the standard error,…

Frank
61
votes
5 answers
How exactly does a "random effects model" in econometrics relate to mixed models outside of econometrics?
I used to think that "random effects model" in econometrics corresponds to a "mixed model with random intercept" outside of econometrics, but now I am not sure. Does it?
Econometrics uses terms like "fixed effects" and "random effects" somewhat…

amoeba
61
votes
2 answers
Are mean normalization and feature scaling needed for k-means clustering?
What are the best (recommended) pre-processing steps before performing k-means?

pedrosaurio
61
votes
6 answers
What is the difference between estimation and prediction?
For example, I have historical loss data and I am calculating extreme quantiles (Value-at-Risk or Probable Maximum Loss). Are the results obtained for estimating the loss or for predicting it? Where can one draw the line? I am confused.

melon
61
votes
4 answers
Difference between Random Forest and Extremely Randomized Trees
I understood that Random Forest and Extremely Randomized Trees differ in the sense that the splits of the trees in the Random Forest are deterministic whereas they are random in the case of Extremely Randomized Trees (to be more accurate, the…

RUser4512
61
votes
17 answers
Machine learning cookbook / reference card / cheatsheet?
I find resources like the Probability and Statistics Cookbook and The R Reference Card for Data Mining incredibly useful. They obviously serve well as references but also help me to organize my thoughts on a subject and get the lay of the land.
Q:…

lowndrul
61
votes
7 answers
Period detection of a generic time series
This post is the continuation of another post related to a generic method for outlier detection in time series.
Basically, at this point I'm interested in a robust way to discover the periodicity/seasonality of a generic time series affected by a…

gianluca