Most Popular (1500 questions)

50 votes · 3 answers

Suppression effect in regression: definition and visual explanation/depiction

What is a suppressor variable in multiple regression, and what might be the ways to display the suppression effect visually (its mechanics or its evidence in results)? I'd like to invite everybody who has a thought to share.
ttnphns
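A minimal numerical sketch of classical suppression (a hypothetical construction, not taken from the question or its answers): here $x_2$ is uncorrelated with $y$, yet adding it to the model sharply increases $R^2$ because it soaks up the noise component of $x_1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
signal = rng.normal(size=n)   # the part of x1 that predicts y
noise = rng.normal(size=n)    # the part of x1 unrelated to y

x1 = signal + noise           # noisy predictor
x2 = noise                    # suppressor: uncorrelated with y, correlated with x1
y = signal

def r2(X, y):
    """R-squared of an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(round(np.corrcoef(x2, y)[0, 1], 3))          # ~0: x2 alone predicts nothing
print(round(r2(x1.reshape(-1, 1), y), 2))          # ~0.5 with x1 alone
print(round(r2(np.column_stack([x1, x2]), y), 2))  # ~1.0 once the suppressor is added
```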
50 votes · 3 answers

Why does a correlation matrix need to be positive semi-definite, and what does it mean to be or not to be positive semi-definite?

I have been researching the meaning of the positive semi-definite property of correlation and covariance matrices. I am looking for any information on: the definition of positive semi-definiteness; its important properties and practical implications; the…
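A quick way to see the property in practice is to check eigenvalues: a symmetric matrix is positive semi-definite exactly when none of them is negative. The matrices below are made-up examples; the second has pairwise "correlations" that cannot coexist.

```python
import numpy as np

def is_psd(mat, tol=1e-10):
    """A symmetric matrix is PSD iff all its eigenvalues are >= 0."""
    return bool(np.all(np.linalg.eigvalsh(mat) >= -tol))

ok = np.array([[1.0, 0.6, 0.3],
               [0.6, 1.0, 0.5],
               [0.3, 0.5, 1.0]])

# Impossible pairwise values: if A~B and A~C are both strongly positive,
# then B and C cannot be strongly negative.
bad = np.array([[1.0,  0.9,  0.9],
                [0.9,  1.0, -0.9],
                [0.9, -0.9,  1.0]])

print(is_psd(ok))   # True
print(is_psd(bad))  # False: not a valid correlation matrix
```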
50 votes · 3 answers

How can I calculate $\int^{\infty}_{-\infty}\Phi\left(\frac{w-a}{b}\right)\phi(w)\,\mathrm dw$

Suppose $\phi(\cdot)$ and $\Phi(\cdot)$ are density function and distribution function of the standard normal distribution. How can one calculate the integral: $$\int^{\infty}_{-\infty}\Phi\left(\frac{w-a}{b}\right)\phi(w)\,\mathrm dw$$
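For readers wanting to sanity-check a candidate answer numerically: the standard closed form for this integral (for $b>0$) is $\Phi\left(-a/\sqrt{1+b^2}\right)$, which follows from writing $\Phi((w-a)/b) = P(X \le (w-a)/b)$ for an independent $X \sim N(0,1)$, so the integral equals $P(bX - W \le -a)$ with $bX - W \sim N(0, 1+b^2)$. A quadrature check:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def lhs(a, b):
    # numerical integral of Phi((w-a)/b) * phi(w) over the real line
    f = lambda w: stats.norm.cdf((w - a) / b) * stats.norm.pdf(w)
    val, _ = quad(f, -np.inf, np.inf)
    return val

def rhs(a, b):
    # closed form for b > 0, derived from bX - W ~ N(0, 1 + b^2)
    return stats.norm.cdf(-a / np.sqrt(1 + b**2))

for a, b in [(0.0, 1.0), (1.5, 2.0), (-0.7, 0.5)]:
    print(round(lhs(a, b), 6), round(rhs(a, b), 6))
```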
50 votes · 4 answers

Normality of dependent variable = normality of residuals?

This issue seems to rear its ugly head all the time, and I'm trying to decapitate it for my own understanding of statistics (and sanity!). The assumptions of general linear models (t-test, ANOVA, regression etc.) include the "assumption of…
DeanP
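A small simulation illustrates the distinction the question is after (a toy construction, not from any answer): a strong group effect makes $y$ bimodal, yet the residuals around the fitted line are perfectly well-behaved.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5_000

# A strong group effect with normal errors: y itself is bimodal (non-normal),
# but the residuals are normal -- the assumption is about the errors, not y.
group = rng.integers(0, 2, size=n)   # 0/1 predictor
y = 10.0 * group + rng.normal(size=n)

X = np.column_stack([np.ones(n), group])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print(round(stats.kurtosis(y), 2))      # strongly negative: y is bimodal, far from normal
print(round(stats.kurtosis(resid), 2))  # near 0: residuals are consistent with normality
```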
50 votes · 2 answers

Random forest assumptions

I am kind of new to random forests, so I am still struggling with some basic concepts. In linear regression, we assume independent observations, constant variance… What are the basic assumptions/hypotheses we make when we use a random forest? …
user1848018
50 votes · 2 answers

Regression: Transforming Variables

When transforming variables, do you have to use the same transformation for all of them? For example, can I pick and choose differently transformed variables, as in: let $x_1,x_2,x_3,x_4$ be age, length of employment, length of residence, and income. Y =…
Brandon Bertelsen
50 votes · 6 answers

Debunking wrong CLT statement

The central limit theorem (CLT) gives some nice properties about converging to a normal distribution. Prior to studying statistics formally, I was under the extremely wrong impression that the CLT said that data approached normality. I now find…
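A one-line simulation makes the distinction concrete (illustrative only): the raw data never become normal, no matter how much you collect; the sampling distribution of the mean does.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

raw = rng.exponential(size=100_000)                      # skewed data; more data won't "fix" this
means = rng.exponential(size=(20_000, 50)).mean(axis=1)  # means of samples of size 50

print(round(stats.skew(raw), 2))    # ~2: the exponential stays skewed
print(round(stats.skew(means), 2))  # small: the distribution of sample means approaches normal
```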
50 votes · 15 answers

A smaller dataset is better: Is this statement false in statistics? How to refute it properly?

Dr. Raoult, who promotes hydroxychloroquine, has made a really intriguing statement about statistics in the biomedical field: "It's counterintuitive, but the smaller the sample size of a clinical test, the more significant its results are." The…
Stephane Rolland
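One way to refute the claim is by direct simulation (a sketch with made-up effect size and sample sizes): when a real effect is present, small samples detect it less often, not more.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
effect, trials = 0.5, 2_000

def power(n):
    """Fraction of two-sample t-tests reaching p < 0.05 for a true 0.5-SD effect."""
    hits = 0
    for _ in range(trials):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(effect, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            hits += 1
    return hits / trials

print(power(10))   # low power: small samples usually miss the effect
print(power(200))  # near-certain detection with larger samples
```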
50 votes · 6 answers

Motivation for Kolmogorov distance between distributions

There are many ways to measure how similar two probability distributions are. Among the methods which are popular (in different circles) are: the Kolmogorov distance, the sup-distance between the distribution functions; the Kantorovich-Rubinstein…
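For concreteness, the Kolmogorov distance is easy to evaluate on a grid (an illustrative helper, not from the question itself):

```python
import numpy as np
from scipy import stats

def kolmogorov_distance(F, G, grid):
    """sup over the grid of |F(x) - G(x)| for two CDFs."""
    return float(np.max(np.abs(F(grid) - G(grid))))

grid = np.linspace(-10, 10, 100_001)
d = kolmogorov_distance(stats.norm(0, 1).cdf, stats.norm(1, 1).cdf, grid)
print(round(d, 4))  # sup-distance between N(0,1) and N(1,1), attained at x = 0.5
```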
50 votes · 5 answers

What is the difference between the forward-backward and Viterbi algorithms?

I want to know the differences between the forward-backward algorithm and the Viterbi algorithm for inference in hidden Markov models (HMMs).
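A compact sketch on a toy HMM (all numbers hypothetical) shows the structural difference: the forward pass sums over all state paths (giving likelihoods and, with the backward pass, posterior marginals), while Viterbi maximizes over paths to find the single most probable state sequence.

```python
import numpy as np

# A toy 2-state HMM (hypothetical numbers, for illustration only)
A = np.array([[0.7, 0.3],   # transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],   # emission probabilities: P(obs | state)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 0]

def forward(obs):
    """Forward pass: alpha[t, i] = P(obs[:t+1], state_t = i). Sums over paths."""
    alpha = np.zeros((len(obs), 2))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def viterbi(obs):
    """Most likely single state sequence: same recursion with max instead of sum."""
    delta = pi * B[:, obs[0]]
    back = []
    for t in range(1, len(obs)):
        scores = delta[:, None] * A        # scores[i, j]: arrive in j from i
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

alpha = forward(obs)
print(alpha[-1].sum())  # P(observations): forward-backward marginalizes over all paths
print(viterbi(obs))     # the single most probable path: what Viterbi returns
```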
50 votes · 7 answers

Why would someone use a Bayesian approach with a 'noninformative' improper prior instead of the classical approach?

If the interest is merely estimating the parameters of a model (pointwise and/or interval estimation) and the prior information is not reliable, weak, (I know this is a bit vague but I am trying to establish a scenario where the choice of a prior…
user10525
50 votes · 3 answers

Why do we only see $L_1$ and $L_2$ regularization but not other norms?

I am just curious why we usually see only $L_1$ and $L_2$ norm regularization. Are there proofs of why these are better?
user10024395
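One concrete way to see why $L_1$ is special (an illustrative sketch, not a proof): its proximal operator is soft-thresholding, which produces exact zeros, whereas the $L_2$ proximal operator only shrinks coefficients toward zero.

```python
import numpy as np

def prox_l1(w, lam):
    """Soft-thresholding: the proximal operator of lam * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def prox_l2_squared(w, lam):
    """Proximal operator of (lam/2) * ||w||_2^2: pure shrinkage, no exact zeros."""
    return w / (1.0 + lam)

w = np.array([3.0, 0.4, -0.2, -2.5])
print(prox_l1(w, 0.5))          # small entries become exactly 0 -> sparsity
print(prox_l2_squared(w, 0.5))  # everything shrinks, nothing becomes exactly 0
```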
50 votes · 3 answers

What is the root cause of the class imbalance problem?

I've been thinking a lot about the "class imbalance problem" in machine/statistical learning lately, and am drawing ever deeper into a feeling that I just don't understand what is going on. First let me define (or attempt to define) my terms: The…
50 votes · 1 answer

How does centering the data get rid of the intercept in regression and PCA?

I keep reading about instances where we center the data (e.g., with regularization or PCA) in order to remove the intercept (as mentioned in this question). I know it's simple, but I'm having a hard time intuitively understanding this. Could someone…
Alec
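A tiny demonstration (hypothetical data): because the OLS line always passes through $(\bar x, \bar y)$, fitting on centered data yields the same slope with an intercept of exactly zero, so the intercept column can be dropped.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(5.0, 2.0, size=1_000)
y = 3.0 + 2.0 * x + rng.normal(size=1_000)

# Fit with an explicit intercept on the raw data...
X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]

# ...then fit on centered data with NO intercept column.
xc, yc = x - x.mean(), y - y.mean()
b1c = (xc @ yc) / (xc @ xc)

print(round(b0, 3), round(b1, 3))  # intercept and slope on raw data
print(round(b1c, 3))               # same slope; the implied intercept is now 0
```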
50 votes · 4 answers

When is a biased estimator preferable to an unbiased one?

It is often obvious why one prefers an unbiased estimator. But are there any circumstances under which we might actually prefer a biased estimator over an unbiased one?
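A classic concrete case (a simulation sketch, not taken from any answer): for normal data, dividing the centered sum of squares by $n+1$ instead of $n-1$ biases the variance estimate downward, yet achieves lower mean squared error.

```python
import numpy as np

rng = np.random.default_rng(5)
n, trials, true_var = 10, 200_000, 1.0

samples = rng.normal(0.0, 1.0, size=(trials, n))
ss = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

unbiased = ss / (n - 1)  # correct on average, but higher variance
biased = ss / (n + 1)    # shrunk: biased downward, lower mean squared error

print(round(unbiased.mean(), 3), round(((unbiased - true_var) ** 2).mean(), 3))
print(round(biased.mean(), 3), round(((biased - true_var) ** 2).mean(), 3))
```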