Advice on statistical analysis that is often useful in practice (but is not always guaranteed to work).
Questions tagged [rule-of-thumb]
58 questions
111 votes, 11 answers
Calculating optimal number of bins in a histogram
I'm interested in finding the best method I can for determining how many bins I should use in a histogram. My data should range from 30 to 350 objects at most, and in particular I'm trying to apply thresholding (like Otsu's method) where…

Tony Stark
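As a rough illustration for the histogram question above, the sketch below computes a bin count from two widely used rules of thumb, Sturges' rule and the Freedman-Diaconis rule. Neither rule is named in the excerpt; both are assumptions chosen for illustration, and the function names are hypothetical. It is a starting point under those assumptions, not a recommendation for thresholding.

```python
import math
import statistics


def sturges_bins(n):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n observations."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n)) + 1


def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3).

    Quartiles come from statistics.quantiles with its default
    ('exclusive') method; other quartile definitions can shift the result.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    if iqr == 0:
        # Degenerate spread: fall back to Sturges' rule (an assumption).
        return sturges_bins(n)
    width = 2 * iqr / n ** (1 / 3)
    return max(1, math.ceil((max(data) - min(data)) / width))


if __name__ == "__main__":
    for n in (30, 350):
        print(n, "observations ->", sturges_bins(n), "bins (Sturges)")
```

For the 30 to 350 observations mentioned in the question, Sturges' rule gives between 6 and 10 bins. The Freedman-Diaconis count depends on the spread of the data, so it has to be computed on the actual sample.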
88 votes, 24 answers
Rules of thumb for "modern" statistics
I like G van Belle's book on Statistical Rules of Thumb, and to a lesser extent Common Errors in Statistics (and How to Avoid Them) from Phillip I Good and James W. Hardin. They address common pitfalls when interpreting results from experimental and…

chl
79 votes, 7 answers
Rules of thumb for minimum sample size for multiple regression
Within the context of a research proposal in the social sciences, I was asked the following question:
I have always gone by 100 + m (where m is the number of predictors) when determining minimum sample size for multiple regression. Is…

Jeromy Anglim
46 votes, 4 answers
What references should be cited to support using 30 as a large enough sample size?
I have read/heard many times that a sample size of at least 30 units is considered a "large sample" (the normality assumption for means usually holds approximately due to the CLT, ...). Therefore, in my experiments, I usually generate samples of 30…

Lan
22 votes, 2 answers
Good online resource with tips on graphing association between two numeric variables under various conditions
Context:
Over time I've acquired a set of heuristics on how to effectively plot the association between two numeric variables. I imagine most people who work with data would have a similar set of rules.
Examples of such rules might be:
If one…

Jeromy Anglim
15 votes, 2 answers
"When to use boxplot and when barplot" rules (of thumb?)
Both the box-and-whisker plot and the bar chart are appropriate graphics for ANOVA according to The R Book (Crawley, 2013), but which is more appropriate? I suppose it depends on the situation... can anybody help me?

Ladislav Naďo
15 votes, 2 answers
What is the logic behind "rule of thumb" for meaningful differences in AIC?
I've been struggling to find meaningful guidelines for comparing models based on differences in AIC. I keep coming back to the rule of thumb offered by Burnham & Anderson 2004, pp. 270-272:
Some simple rules of thumb are often useful in assessing…

Tripartio
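To make the AIC comparison in the question above concrete, here is a minimal sketch that computes each model's AIC difference from the best model and the corresponding Akaike weights. The function names and the example AIC values are hypothetical, and the sketch does not encode any particular threshold from the truncated quotation.

```python
import math


def aic_deltas(aic_values):
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_values)
    return [a - best for a in aic_values]


def akaike_weights(aic_values):
    """Akaike weights: w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2)."""
    rel_likelihoods = [math.exp(-d / 2.0) for d in aic_deltas(aic_values)]
    total = sum(rel_likelihoods)
    return [r / total for r in rel_likelihoods]


if __name__ == "__main__":
    # Made-up AIC values for three candidate models.
    aics = [100.0, 102.0, 110.0]
    for aic, delta, w in zip(aics, aic_deltas(aics), akaike_weights(aics)):
        print(f"AIC={aic:.1f}  delta={delta:.1f}  weight={w:.3f}")
```

Only the differences matter here: adding a constant to every AIC value leaves both the differences and the weights unchanged.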
15 votes, 3 answers
$L_1$ or $L_{0.5}$ metrics for clustering?
Does anyone use the $L_1$ or $L_{0.5}$ metrics for clustering, rather than $L_2$?
Aggarwal et al., On the surprising behavior of distance metrics in high dimensional space, said (in 2001) that $L_1$ is consistently more preferable than the Euclidean…

denis
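For readers unfamiliar with the fractional case in the question above, the sketch below evaluates the Minkowski-style distance $(\sum_i |x_i - y_i|^p)^{1/p}$ for $p = 0.5$, $1$ and $2$. The function name is hypothetical. Note that for $p < 1$ the triangle inequality can fail, so $L_{0.5}$ is not a metric in the strict sense; the example points are chosen to show this.

```python
def minkowski(x, y, p):
    """Minkowski-style distance (sum |x_i - y_i|^p)^(1/p) for p > 0."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


if __name__ == "__main__":
    a, b = (0.0, 0.0), (3.0, 4.0)
    for p in (0.5, 1, 2):
        print(f"p={p}: distance={minkowski(a, b, p):.4f}")

    # Triangle inequality check for p = 0.5 on three corner points.
    x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
    direct = minkowski(x, z, 0.5)
    via_y = minkowski(x, y, 0.5) + minkowski(y, z, 0.5)
    print("direct:", direct, "via y:", via_y)
```

In the last check the direct distance exceeds the two-step path, which is why clustering code that assumes a true metric may behave unexpectedly with $p < 1$.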
12 votes, 1 answer
Relation between learning rate and number of hidden layers?
Is there any rule of thumb relating the depth of a neural network to the learning rate? I have been noticing that the deeper the network is, the lower the learning rate must be.
If that's correct, why is that?

user_1177868
11 votes, 1 answer
Sample size required to determine which of a set of advertisements has the highest click-through rate
I am a software designer by trade and I am working on a project for a client, and I would like to make sure that my analysis is statistically sound.
Consider the following: We have n advertisements (n < 10), and we simply want to know which ad…

Jonathan
11 votes, 4 answers
MANOVA and correlations between dependent variables: how strong is too strong?
The dependent variables in a MANOVA should not be "too strongly correlated". But how strong a correlation is too strong? It would be interesting to get people's opinions on this issue. For instance, would you proceed with MANOVA in the following…

Freya Harrison
11 votes, 3 answers
Revisiting the Rule of Three
The rule of three is a method for calculating a 95% confidence interval when estimating $p$ from a set of $n$ IID Bernoulli trials with no successes.
My understanding from its derivation is that the confidence interval it produces,…

Thoth
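The rule of three in the question above can be checked numerically. With zero successes in $n$ IID Bernoulli trials, the exact one-sided 95% upper bound solves $(1 - p)^n = 0.05$, giving $p = 1 - 0.05^{1/n} \approx -\ln(0.05)/n \approx 3/n$. The sketch below compares the two; the function names are hypothetical.

```python
import math


def rule_of_three_upper(n):
    """Approximate 95% upper bound on p after n trials with no successes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 3.0 / n


def exact_upper(n, alpha=0.05):
    """Exact bound: the p for which P(no successes in n trials) = alpha."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - alpha ** (1.0 / n)


if __name__ == "__main__":
    print("-ln(0.05) =", -math.log(0.05))
    for n in (10, 30, 100, 1000):
        print(n, rule_of_three_upper(n), exact_upper(n))
```

The approximation is slightly conservative (3/n is a little larger than the exact bound) and the gap shrinks as $n$ grows.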
10 votes, 1 answer
Histogram with uniform vs non-uniform bins
This question describes the basic difference between a uniform and a nonuniform histogram. And this question discusses the rule of thumb for picking the number of bins of a uniform histogram that optimizes (in some sense) the degree to which the…

Alan Turing
10 votes, 1 answer
How does one formalize a prior probability distribution? Are there rules of thumb or tips one should use?
While I like to think I have a good grasp of the concept of prior information in Bayesian statistical analysis and decision making, I often have trouble wrapping my head around its application. I have in mind a couple of situations that exemplify my…

Phil
8 votes, 1 answer
Basic easy rules for statistics
In a binomial experiment, if we observe $x=0$ positive individuals among $n$ individuals, then the proportion of positive individuals is significantly lower than $3/n$, with a type 1 error less than and very close to $5\%$. This fact, sometimes called…

Stéphane Laurent