Highest Voted Questions - Statistical Analysis Stack Exchange

101

votes

18 answers

Including the interaction but not the main effects in a model

Is it ever valid to include a two-way interaction in a model without including the main effects? If your hypothesis is only about the interaction, do you still need to include the main effects?

regression modeling interaction regression-coefficients

asked May 20 '11 at 01:19

Glen

6,320
4
37
59

100

votes

5 answers

Why is ANOVA taught / used as if it is a different research methodology compared to linear regression?

ANOVA is equivalent to linear regression with the use of suitable dummy variables. The conclusions remain the same irrespective of whether you use ANOVA or linear regression. In light of their equivalence, is there any reason why ANOVA is used…

regression anova

asked Jul 23 '10 at 15:17

user28

100

votes

9 answers

Is there an intuitive explanation why multicollinearity is a problem in linear regression?

The wiki discusses the problems that arise when multicollinearity is an issue in linear regression. The basic problem is that multicollinearity results in unstable parameter estimates, which makes it very difficult to assess the effect of independent…

regression multicollinearity intuition faq

asked Aug 02 '10 at 22:42

user28

99

votes

1 answer

Interpreting plot.lm()

I had a question about interpreting the graphs generated by plot(lm) in R. I was wondering if you guys could tell me how to interpret the scale-location and leverage-residual plots? Any comments would be appreciated. Assume basic knowledge of…

r regression data-visualization residuals outliers

asked May 04 '13 at 21:34

Guest

991
2
7
3

99

votes

5 answers

What is the relation between k-means clustering and PCA?

It is common practice to apply PCA (principal component analysis) before a clustering algorithm (such as k-means). It is believed that this improves the clustering results in practice (noise reduction). However, I am interested in a comparative and…

clustering pca k-means

asked Nov 23 '15 at 22:42

mic

3,848
3
23
38

98

votes

8 answers

What is the benefit of breaking up a continuous predictor variable?

I'm wondering what the value is in taking a continuous predictor variable and breaking it up (e.g., into quintiles), before using it in a model. It seems to me that by binning the variable we lose information. Is this just so we can model…

regression continuous-data regression-strategies binning faq

asked Aug 31 '13 at 05:32

Tom

1,511
1
12
17

98

votes

5 answers

Mean absolute error OR root mean squared error?

Why use Root Mean Squared Error (RMSE) instead of Mean Absolute Error (MAE)? Hi, I've been investigating the error generated in a calculation. I initially calculated the error as a Root Mean Normalised Squared Error. Looking a little closer, I…

least-squares mean rms mae

asked Jan 22 '13 at 17:11

user1665220

1,105
1
8
6

98

votes

13 answers

What is the best way to identify outliers in multivariate data?

Suppose I have a large set of multivariate data with at least three variables. How can I find the outliers? Pairwise scatterplots won't work as it is possible for an outlier to exist in 3 dimensions that is not an outlier in any of the 2 dimensional…

multivariate-analysis outliers

asked Jul 20 '10 at 05:02

Rob Hyndman

51,928
23
126
178

98

votes

9 answers

Understanding "variance" intuitively

What is the cleanest, easiest way to explain the concept of variance to someone? What does it intuitively mean? If one is to explain this to their child, how would one go about it? It's a concept that I have difficulty articulating, especially when…

distributions variance standard-deviation inference intuition

asked Oct 25 '11 at 21:28

PhD

13,429
19
45
47

98

votes

30 answers

Is there a way to remember the definitions of Type I and Type II Errors?

I'm not a statistician by education, I'm a software engineer. Yet statistics comes up a lot. In fact, questions specifically about Type I and Type II error are coming up a lot in the course of my studying for the Certified Software Development…

terminology type-i-and-ii-errors

asked Aug 12 '10 at 19:55

Thomas Owens

1,091
1
10
19

98

votes

8 answers

Generate a random variable with a defined correlation to an existing variable(s)

For a simulation study I have to generate random variables that show a predefined (population) correlation to an existing variable $Y$. I looked into the R packages copula and CDVine which can produce random multivariate distributions with a given…

r correlation random-variable random-generation

asked Aug 31 '11 at 09:18

Felix S

4,432
4
26
34

98

votes

3 answers

Can someone explain Gibbs sampling in very simple words?

I'm doing some reading on topic modeling (with Latent Dirichlet Allocation), which makes use of Gibbs sampling. As a newbie in statistics (well, I know things like binomials, multinomials, priors, etc.), I find it difficult to grasp how Gibbs sampling…

modeling sampling conditional-probability gibbs

asked May 01 '11 at 19:37

Thea

983
1
7
4

97

votes

1 answer

Correlation between a nominal (IV) and a continuous (DV) variable

I have a nominal variable (different topics of conversation, coded as topic0=0, etc.) and a number of scale variables (DV) such as the length of a conversation. How can I derive correlations between the nominal and scale variables?

correlation continuous-data categorical-data

asked Oct 13 '14 at 06:02

Paul Miller

971
2
7
3

96

votes

2 answers

Solving for regression parameters in closed-form vs gradient descent

In Andrew Ng's machine learning course, he introduces linear regression and logistic regression, and shows how to fit the model parameters using gradient descent and Newton's method. I know gradient descent can be useful in some applications of…

regression machine-learning logistic gradient-descent

asked Feb 20 '12 at 01:47

Jeff

3,525
5
27
38

95

votes

4 answers

How to choose nlme or lme4 R library for mixed effects models?

I have fit a few mixed effects models (particularly longitudinal models) using lme4 in R but would like to really master the models and the code that goes with them. However, before diving in with both feet (and buying some books) I want to be sure…

r mixed-model lme4-nlme

asked Dec 10 '10 at 09:31

Chris Beeley

5,465
5
36
40
