Highest Voted Questions - Statistical Analysis Stack Exchange

39

votes

5 answers

The meaning of "positive dependency" as a condition to use the usual method for FDR control

Benjamini and Hochberg developed the first (and still most widely used, I think) method for controlling the false discovery rate (FDR). I want to start with a bunch of P values, each for a different comparison, and decide which ones are low enough…

multiple-comparisons non-independent false-discovery-rate

asked Aug 13 '14 at 18:39

Harvey Motulsky

14,903
11
51
98
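
For readers new to the procedure this question refers to, here is a minimal sketch of the Benjamini-Hochberg step-up rule in Python. The function name `benjamini_hochberg`, the level `q`, and the example P values are illustrative assumptions, not taken from the question. The sketch only applies the rule; it does not check whether the P values are independent or positively dependent, which is the condition the question asks about.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff,
    # even if its own p-value missed its own threshold.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P values, one per comparison.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
```

With these made-up values only the two smallest P values fall under their rank-based thresholds, so only those two comparisons are flagged.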

39

votes

4 answers

What is the meaning of the "." (dot) in R?

I'm just reading the book "R in a Nutshell". And it seems as if I skipped the part where the "." as in "sample.formula" was explained. > sample.formula <- as.formula(y~x1+x2) Is sample an object with a field formula as in other languages? And if…

r

asked May 12 '11 at 14:11

Fabian

1,341
4
12
13

38

votes

3 answers

Application of machine learning methods in StackExchange websites

I have a Machine Learning course this semester and the professor asked us to find a real-world problem and solve it by one of the machine learning methods introduced in the class, such as: Decision Trees, Artificial Neural Networks, Support Vector…

machine-learning

asked Apr 22 '11 at 22:27

Isaac

973
1
9
20

38

votes

2 answers

Why is logistic regression a linear model?

I want to know why logistic regression is called a linear model. It uses a sigmoid function, which is not linear. So why is logistic regression a linear model?

regression logistic terminology

asked Mar 03 '14 at 17:52

user34790

6,049
6
42
64
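
One common way to see the point behind this question is that the sigmoid is applied to a linear combination of the inputs, so the log-odds of the predicted probability is linear in the features. The Python sketch below checks this numerically; the weights, bias, and input are made-up values and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Predicted probability: sigmoid of a linear combination of the inputs."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def log_odds(p):
    """Logit, the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and a single input point.
weights, bias = [2.0, -1.0], 0.5
x = [1.5, 0.5]

p = predict_proba(x, weights, bias)
linear_part = bias + sum(w * xi for w, xi in zip(weights, x))

# The probability is nonlinear in x, but its log-odds equals the linear part.
print(p, log_odds(p), linear_part)
```

The decision boundary `p = 0.5` corresponds to `linear_part = 0`, which is a hyperplane in the feature space.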

38

votes

3 answers

Is standardisation before Lasso really necessary?

I have read three main reasons for standardising variables before something such as Lasso regression: 1) Interpretability of coefficients. 2) Ability to rank the coefficient importance by the relative magnitude of post-shrinkage coefficient…

normalization lasso standardization regularization

asked Feb 13 '14 at 08:29

Jase

1,904
3
20
33

38

votes

3 answers

Distribution of scalar products of two random unit vectors in $D$ dimensions

If $\mathbf{x}$ and $\mathbf{y}$ are two independent random unit vectors in $\mathbb{R}^D$ (uniformly distributed on a unit sphere), what is the distribution of their scalar product (dot product) $\mathbf x \cdot \mathbf y$? I guess as $D$ grows the…

mathematical-statistics linear-algebra beta-distribution

asked Feb 08 '14 at 22:33

amoeba

93,463
28
275
317
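
A quick simulation can give a feel for the question above. The Python sketch below draws uniform unit vectors by normalising Gaussian vectors and compares the sample variance of the dot product with $1/D$, which follows from symmetry: fixing $\mathbf y$ as the first basis vector makes the dot product equal to $x_1$, and $\sum_i x_i^2 = 1$ gives $E[x_1^2] = 1/D$. The function names, the seed, the dimension, and the sample size are arbitrary choices for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Uniform point on the unit sphere: normalise a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sample_dot_products(dim, n_samples, seed=0):
    """Dot products of independent pairs of random unit vectors."""
    rng = random.Random(seed)
    dots = []
    for _ in range(n_samples):
        x = random_unit_vector(dim, rng)
        y = random_unit_vector(dim, rng)
        dots.append(sum(a * b for a, b in zip(x, y)))
    return dots


dim = 10
dots = sample_dot_products(dim, n_samples=10000)
mean = sum(dots) / len(dots)
var = sum((d - mean) ** 2 for d in dots) / len(dots)

# The sample variance should be close to 1 / dim, and the mean close to 0.
print(mean, var, 1 / dim)
```

Rerunning with larger `dim` shows the dot products concentrating around zero, in line with the variance shrinking as $1/D$.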

38

votes

8 answers

Simple examples of uncorrelated but not independent $X$ and $Y$

Any hard-working student is a counterexample to "all students are lazy". What are some simple counterexamples to "if random variables $X$ and $Y$ are uncorrelated then they are independent"?

correlation random-variable independence

asked Feb 04 '14 at 06:58

Clare Brown

51
1
2
3
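
One standard counterexample of the kind this question asks for is $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$. The Python sketch below verifies it with exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

# X takes the values -1, 0, 1 with probability 1/3 each, and Y = X^2.
support = [-1, 0, 1]
prob = Fraction(1, 3)
pairs = [(x, x * x) for x in support]


def expect(g):
    """Exact expectation of g(X, Y) over the three equally likely outcomes."""
    return sum(prob * g(x, y) for x, y in pairs)


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = sum(prob for x, y in pairs if x == 0 and y == 0)
p_x0 = sum(prob for x, y in pairs if x == 0)
p_y0 = sum(prob for x, y in pairs if y == 0)

print("covariance:", cov)
print("P(X=0, Y=0):", p_joint, "vs product:", p_x0 * p_y0)
```

The covariance is exactly zero, yet the joint probability (1/3) differs from the product of the marginals (1/9), so $X$ and $Y$ are uncorrelated but dependent.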

38

votes

4 answers

Difference between longitudinal design and time series

What is/are the difference(s) between a longitudinal design and a time series?

time-series panel-data

asked Feb 11 '11 at 22:51

DrWho

799
4
12
23

38

votes

4 answers

(Why) do overfitted models tend to have large coefficients?

I imagine that the larger a coefficient on a variable is, the more ability the model has to "swing" in that dimension, providing an increased opportunity to fit noise. Although I think I've got a reasonable sense of the relationship between the…

regression variance linear-model bias regularization

asked Jul 13 '13 at 01:30

David Marx

6,647
1
25
43

38

votes

8 answers

Is it possible to prove a null hypothesis?

As the question states: is it possible to prove the null hypothesis? From my (limited) understanding of hypothesis testing, the answer is no, but I can't come up with a rigorous explanation for it. Does the question have a definitive answer?

hypothesis-testing proof equivalence

asked Jan 13 '11 at 16:46

Pulkit Sinha

491
1
4
6

38

votes

2 answers

Variance of a function of one random variable

Let's say we have a random variable $X$ with known variance and mean. The question is: what is the variance of $f(X)$ for some given function $f$? The only general method that I'm aware of is the delta method, but it gives only an approximation. Now I'm…

variance random-variable delta-method

asked Dec 28 '10 at 14:13

Tomek Tarczynski

3,854
7
29
37
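
To illustrate the first-order delta method mentioned in this question, $\operatorname{Var} f(X) \approx f'(\mu)^2 \sigma^2$, the Python sketch below compares it with a case where the exact answer is known: $f(x) = x^2$ with $X$ normal, for which $\operatorname{Var}(X^2) = 4\mu^2\sigma^2 + 2\sigma^4$. The normality assumption, the function names, and the numbers are illustrative choices, not part of the question.

```python
def delta_method_variance(f_prime, mean, variance):
    """First-order delta method: Var f(X) is approximately f'(mean)^2 * Var X."""
    return f_prime(mean) ** 2 * variance


def exact_variance_of_square_normal(mean, variance):
    """Exact Var(X^2) when X is normal with the given mean and variance."""
    return 4 * mean ** 2 * variance + 2 * variance ** 2


# Hypothetical mean and variance of X.
mean, variance = 10.0, 0.25

approx = delta_method_variance(lambda x: 2 * x, mean, variance)
exact = exact_variance_of_square_normal(mean, variance)

# The approximation misses the 2 * variance^2 term, which is small
# when the standard deviation is small relative to the mean.
print(approx, exact)
```

The gap between the two values grows as the variance increases relative to the mean, which is why the delta method is only an approximation.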

38

votes

4 answers

How do you interpret RMSLE (Root Mean Squared Logarithmic Error)?

I've been doing a machine learning competition where they use RMSLE (Root Mean Squared Logarithmic Error) to evaluate the performance predicting the sale price of a category of equipment. The problem is I'm not sure how to interpret the success of…

regression machine-learning interpretation measurement-error mathematical-statistics

asked Apr 20 '13 at 04:39

Opus

381
1
3
3
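
As a rough aid to interpretation, the Python sketch below assumes the commonly used definition of RMSLE, $\sqrt{\tfrac{1}{n}\sum_i (\log(1+p_i) - \log(1+a_i))^2}$; the competition in the question may define it slightly differently. The function name and the example prices are made up.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error, using log(1 + value) for each term."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical values chosen so that (1 + predicted) is twice (1 + actual).
actual = [9.0, 99.0]
predicted = [19.0, 199.0]

# Each log difference is log(2), so the RMSLE is log(2) regardless of scale.
print(rmsle(predicted, actual), math.log(2))
```

Because each term is the log of a ratio, the score reflects relative rather than absolute error: being off by a factor of two costs the same for a cheap item as for an expensive one.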

38

votes

6 answers

Why does "explaining away" make intuitive sense?

I recently learned about a principle of probabilistic reasoning called "explaining away," and I am trying to grasp an intuition for it. Let me set up a scenario. Let $A$ be the event that an earthquake is occurring. Let event $B$ be the event that…

probability intuition

asked Apr 01 '13 at 07:05

David Faux

853
2
11
16

38

votes

9 answers

Is overfitting "better" than underfitting?

I've understood the main concepts behind overfitting and underfitting, even though some reasons as to why they occur might not be as clear to me. But what I am wondering is: isn't overfitting "better" than underfitting? If we compare how well the…

machine-learning neural-networks overfitting bias-variance-tradeoff

asked Apr 28 '21 at 11:03

LeLuc

621
2
10

38

votes

2 answers

Which statistical model is being used in the Pfizer study design for vaccine efficacy?

I know there's a similar question here: How to calculate 95% CI of vaccine with 90% efficacy? but it doesn't have an answer at the moment. Also, my question is different: the other question asks how to compute VE, using functions from an R package.…

bayesian statistical-power epidemiology clinical-trials relative-risk

asked Nov 17 '20 at 10:14

DeltaIV

15,894
4
62
104
