Highest Voted Questions - Statistical Analysis Stack Exchange

67

votes

3 answers

Is standardization needed before fitting logistic regression?

My question is: do we need to standardize the data set so that all variables have the same scale, within [0,1], before fitting logistic regression? The formula is: $$\frac{x_i-\min(x_i)}{\max(x_i)-\min(x_i)}$$ My data set has 2 variables,…

regression logistic standardization

asked Jan 23 '13 at 16:33

user1946504

1,247
3
14
17
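
As an illustration of the rescaling formula quoted in the question above, here is a minimal Python sketch of min-max scaling for a single variable. The function name `min_max_scale` and the sample values are hypothetical, and returning zeros for a constant column is an assumption made only to avoid division by zero; this is not taken from the question or its answers.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable carries no scale information,
        # so map every entry to 0.0 instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data for one variable.
scaled = min_max_scale([2.0, 4.0, 10.0])
print(scaled)
```

The smallest observation maps to 0 and the largest to 1; every other value lands in between, in proportion to its distance from the minimum.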

67

votes

8 answers

Regression with multiple dependent variables?

Is it possible to have a (multiple) regression equation with two or more dependent variables? Sure, you could run two separate regression equations, one for each DV, but that doesn't seem like it would capture any relationship between the two DVs.

regression

asked Nov 14 '10 at 02:50

Jeff

3,525
5
27
38

67

votes

9 answers

Taleb and the Black Swan

Taleb's book "The Black Swan" was a New York Times best seller when it came out several years ago. The book is now in its second edition. After meeting with statisticians at a JSM (an annual statistical conference), Taleb toned down his criticism…

extreme-value rare-events

asked Sep 09 '12 at 12:54

Michael R. Chernick

39,640
28
74
143

67

votes

5 answers

How small a quantity should be added to x to avoid taking the log of zero?

I have analysed my data as they are. Now I want to look at my analyses after taking the log of all variables. Many variables contain many zeros. Therefore I add a small quantity to avoid taking the log of zero. So far I've added 10^-10, without any…

data-transformation chemometrics faq

asked Jun 19 '12 at 09:47

miura

3,364
3
21
27

67

votes

6 answers

Criticism of Pearl's theory of causality

In the year 2000, Judea Pearl published Causality. What controversies surround this work? What are its major criticisms?

causality

asked Apr 13 '12 at 23:08

Neil G

13,633
3
41
84

67

votes

7 answers

Where did the frequentist-Bayesian debate go?

The world of statistics was divided between frequentists and Bayesians. These days it seems everyone does a bit of both. How can this be? If the different approaches are suitable for different problems, why did the founding fathers of statistics did…

bayesian frequentist history philosophical

asked Jan 03 '12 at 20:08

JohnRos

5,336
26
56

67

votes

3 answers

Variables are often adjusted (e.g. standardised) before making a model - when is this a good idea, and when is it a bad one?

In what circumstances would you want to, or not want to, scale or standardize a variable prior to model fitting? And what are the advantages and disadvantages of scaling a variable?

modeling predictive-models feature-selection mathematical-statistics standardization

asked Dec 01 '11 at 16:29

Andrew

5,478
5
21
21

67

votes

3 answers

References containing arguments against null hypothesis significance testing?

In the last few years I've read a number of papers arguing against the use of null hypothesis significance testing in science, but didn't think to keep a persistent list. A colleague recently asked me for such a list, so I thought I'd ask everyone…

hypothesis-testing statistical-significance references p-value

asked May 08 '11 at 16:09

Mike Lawrence

12,691
8
40
65

66

votes

4 answers

Why is logistic regression a linear classifier?

Since we are using the logistic function to transform a linear combination of the input into a non-linear output, how can logistic regression be considered a linear classifier? Linear regression is just like a neural network without the hidden…

logistic classification neural-networks

asked Apr 12 '14 at 19:34

Jack Twain

7,781
14
48
74
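
One way to see the point raised in the question above is to compare two decision rules: thresholding the predicted probability at 0.5, and checking the sign of the linear score. The sketch below is only an illustration under assumed values; the weights `w`, intercept `b`, and the test points are hypothetical and not taken from the question.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(w, b, x):
    """Linear combination w . x + b of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_by_probability(w, b, x):
    """Classify as positive when the logistic output is at least 0.5."""
    return sigmoid(linear_score(w, b, x)) >= 0.5


def predict_by_linear_score(w, b, x):
    """Classify as positive when the linear score is non-negative."""
    return linear_score(w, b, x) >= 0.0


# Hypothetical parameters and points, chosen only for illustration.
w, b = [2.0, -1.0], -1.0
points = [[0.0, 0.0], [1.0, 1.0], [2.0, 1.0], [0.0, -3.0], [-1.0, 2.0]]

agreement = [
    predict_by_probability(w, b, x) == predict_by_linear_score(w, b, x)
    for x in points
]
print(agreement)
```

Because the logistic function is monotone and equals 0.5 at a score of zero, both rules pick out the same side of the hyperplane w . x + b = 0, which is the sense in which the classifier is called linear even though the predicted probability is a non-linear function of the inputs.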

66

votes

4 answers

What is a contrast matrix?

What exactly is a contrast matrix (a term pertaining to an analysis with categorical predictors), and how exactly is a contrast matrix specified? I.e., what are the columns, what are the rows, what are the constraints on that matrix, and what does number in…

regression categorical-data definition contrasts categorical-encoding

asked Dec 02 '13 at 21:19

Tomas

5,735
11
52
93

66

votes

9 answers

How to visualize what ANOVA does?

What way (or ways) is there to visually explain what ANOVA is? Any references, links, or R packages would be welcome.

data-visualization anova teaching

asked Dec 08 '10 at 21:45

Tal Galili

19,935
32
133
195

66

votes

1 answer

Why is the square root transformation recommended for count data?

It is often recommended to take the square root when you have count data. (For some examples on CV, see @HarveyMotulsky's answer here, or @whuber's answer here.) On the other hand, when fitting a generalized linear model with a response variable…

generalized-linear-model data-transformation poisson-distribution count-data variance-stabilizing

asked Dec 22 '12 at 03:11

gung - Reinstate Monica

132,789
81
357
650

66

votes

1 answer

Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?

TL;DR See title. Motivation I am hoping for a canonical answer along the lines of "(1) No, (2) Not applicable, because (1)", which we can use to close many wrong questions about unbalanced datasets and oversampling. I would be quite as happy to be…

unbalanced-classes oversampling

asked Jul 16 '18 at 21:22

Stephan Kolassa

95,027
13
197
357

66

votes

8 answers

Is the R language reliable for the field of economics?

I am a graduate student in economics who recently converted to R from other very well-known statistical packages (I was using SPSS mainly). My little problem at the moment is that I am the only R user in my class. My classmates use Stata and Gauss…

r software econometrics

asked Apr 03 '12 at 23:40

SavedByJESUS

1,123
3
10
14

66

votes

1 answer

40,000 neuroscience papers might be wrong

I saw this article in the Economist about a seemingly devastating paper [1] casting doubt on "something like 40,000 published [fMRI] studies." The error, they say, is because of "erroneous statistical assumptions." I read the paper and see it's…

hypothesis-testing multiple-comparisons spatial neuroimaging neuroscience

asked Jul 25 '16 at 17:09

R Greg Stacey

2,202
2
15
30
