Highest Voted Questions - Statistical Analysis Stack Exchange

37

votes

2 answers

Interpretation of plot (glm.model)

Can anyone tell me how to interpret the 'residuals vs fitted', 'normal q-q', 'scale-location', and 'residuals vs leverage' plots? I am fitting a binomial GLM, saving it and then plotting it.

r logistic data-visualization generalized-linear-model qq-plot

asked Oct 26 '14 at 17:38

Summer

371
1
4
4

37

votes

5 answers

How to visualize/understand what a neural network is doing?

Neural networks are often treated as "black boxes" due to their complex structure. This is not ideal, as it is often beneficial to have an intuitive grasp of how a model is working internally. What are methods of visualizing how a trained neural…

data-visualization neural-networks

asked Jun 09 '11 at 17:19

rm999

748
5
10

37

votes

2 answers

Probability inequalities

I am looking for some probability inequalities for sums of unbounded random variables. I would really appreciate it if anyone can provide me some thoughts. My problem is to find an exponential upper bound over the probability that the sum of…

probability mathematical-statistics probability-inequalities moment-generating-function

asked May 26 '11 at 19:27

Farzad

575
3
7

37

votes

5 answers

Examples of PCA where PCs with low variance are "useful"

Normally in principal component analysis (PCA) the first few PCs are used and the low variance PCs are dropped, as they do not explain much of the variation in the data. However, are there examples where the low variation PCs are useful (i.e. have…

pca

asked Jun 07 '14 at 00:01

Michael

373
3
4

37

votes

6 answers

Assumptions of linear models and what to do if the residuals are not normally distributed

I am a little bit confused about what the assumptions of linear regression are. So far I checked whether: (1) all of the explanatory variables correlated linearly with the response variable (this was the case); (2) there was any collinearity among the…

linear-model residuals assumptions normality-assumption

asked May 27 '14 at 16:23

Stefan

705
2
8
9

36

votes

5 answers

Free data set for very high dimensional classification

What are the freely available data sets for classification with more than 1000 features (or sample points if they contain curves)? There is already a community wiki about free data sets: "Locating freely available data samples". But here, it would be…

machine-learning classification dataset large-data

asked Jul 29 '10 at 12:02

robin girard

6,335
6
46
60

36

votes

2 answers

When is logistic regression solved in closed form?

Take $x \in \{0,1\}^d$ and $y \in \{0,1\}$ and suppose we model the task of predicting y given x using logistic regression. When can logistic regression coefficients be written in closed form? One example is when we use a saturated model. That is,…

logistic generalized-linear-model

asked Jul 28 '10 at 21:59

Yaroslav Bulatov

5,167
2
24
38

36

votes

2 answers

Relative importance of a set of predictors in a random forests classification in R

I'd like to determine the relative importance of sets of variables toward a randomForest classification model in R. The importance function provides the MeanDecreaseGini metric for each individual predictor--is it as simple as summing this across…

r machine-learning classification random-forest

asked Apr 03 '14 at 00:17

Max Ghenis

780
1
9
17

36

votes

1 answer

What does the anova() command do with a lmer model object?

Hopefully this is a question that someone here can answer for me on the nature of decomposing sums of squares from a mixed-effects model fit with lmer (from the lme4 R package). First off I should say that I am aware of the controversy with using…

r anova mixed-model lme4-nlme

asked Oct 04 '13 at 14:05

Martyn

506
1
4
7

36

votes

1 answer

Error metrics for cross-validating Poisson models

I'm cross validating a model that's trying to predict a count. If this was a binary classification problem, I'd calculate out-of-fold AUC, and if this was a regression problem I'd calculate out-of-fold RMSE or MAE. For a Poisson model, what error…

cross-validation poisson-distribution count-data deviance scoring-rules

asked Oct 02 '13 at 18:56

Zach

22,308
18
114
158
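The excerpt above is cut off before naming a metric, but the entry's tags mention deviance and scoring rules. As a hedged illustration only, one candidate out-of-fold metric for count predictions is the mean Poisson deviance. The function name `mean_poisson_deviance` and its arguments are hypothetical, introduced here for the sketch; this is not taken from the question or its answers.

```python
import math


def mean_poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance between observed counts and predicted means.

    Hypothetical helper for illustration. Each term is
    2 * (y * log(y / mu) - (y - mu)), with y * log(y / mu) taken as 0 when y == 0.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("predicted means must be strictly positive")
        if y < 0:
            raise ValueError("observed counts must be non-negative")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


# Usage on one held-out fold (made-up numbers, for illustration only).
fold_counts = [0, 2, 5]
fold_predictions = [0.5, 2.0, 4.0]
fold_score = mean_poisson_deviance(fold_counts, fold_predictions)
```

Lower values indicate a better fit, and a prediction equal to the observed count contributes zero. Averaging the per-fold scores would give a cross-validated figure, in the same way one would average out-of-fold RMSE.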

36

votes

3 answers

Which variance inflation factor should I be using: $\text{GVIF}$ or $\text{GVIF}^{1/(2\cdot\text{df})}$?

I'm trying to interpret variance inflation factors using the vif function in the R package car. The function prints both a generalised $\text{VIF}$ and also $\text{GVIF}^{1/(2\cdot\text{df})}$. According to the help file, this latter value To…

r multicollinearity variance-inflation-factor

asked Sep 22 '13 at 04:57

jay

1,045
1
12
23

36

votes

5 answers

Neural network with skip-layer connections

I am interested in regression with neural networks. Neural networks with zero hidden nodes + skip-layer connections are linear models. What about the same neural nets but with hidden nodes? I am wondering what would be the role of the skip-layer…

regression machine-learning neural-networks deep-learning

asked Apr 23 '13 at 12:42

Ben

521
1
5
5

36

votes

3 answers

Is it possible to find the combined standard deviation?

Suppose I have 2 sets. Set A: number of items $n = 10$, $\mu = 2.4$, $\sigma = 0.8$. Set B: number of items $n = 5$, $\mu = 2$, $\sigma = 1.2$. I can find the combined mean ($\mu$) easily, but how am I supposed to find the combined standard deviation?

standard-deviation

asked Apr 13 '13 at 09:04

kype

495
1
4
5
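For the combined standard deviation question above, a minimal sketch of pooling group summaries follows. It adds the within-group and between-group sums of squares and divides by the total count. The function name `combined_mean_sd` and the `ddof` switch are assumptions made for this illustration; the question does not say whether its $\sigma$ values are population or sample standard deviations, so both conventions are supported.

```python
import math


def combined_mean_sd(groups, ddof=0):
    """Combine (n, mean, sd) summaries into an overall mean and SD.

    Hypothetical helper for illustration.
    ddof=0 treats every sd as a population SD (divide by n);
    ddof=1 treats every sd as a sample SD (divide by n - 1).
    """
    groups = list(groups)
    total_n = sum(n for n, _, _ in groups)
    overall_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Within-group sum of squares plus the between-group contribution.
    sum_sq = sum(
        (n - ddof) * sd ** 2 + n * (mean - overall_mean) ** 2
        for n, mean, sd in groups
    )
    return overall_mean, math.sqrt(sum_sq / (total_n - ddof))


# Set A and Set B from the question, treated here as population SDs.
mean_ab, sd_ab = combined_mean_sd([(10, 2.4, 0.8), (5, 2.0, 1.2)])
```

Under the population convention this gives a combined mean of about 2.267 and a combined standard deviation of about 0.971. With `ddof=1` the combined standard deviation comes out at about 0.928.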

36

votes

6 answers

Backpropagation vs Genetic Algorithm for Neural Network training

I've read a few papers discussing the pros and cons of each method, some arguing that GA doesn't give any improvement in finding the optimal solution while others show that it is more effective. It seems GA is generally preferred in the literature (although…

neural-networks genetic-algorithms backpropagation

asked Apr 11 '13 at 23:42

sashkello

2,198
1
20
26

36

votes

1 answer

Multiple comparisons on a mixed effects model

I am trying to analyse some data using a mixed effect model. The data I collected represent the weight of some young animals of different genotype over time. I am using the approach proposed…

r anova mixed-model multiple-comparisons repeated-measures

asked Dec 08 '10 at 11:22

nico

4,246
3
28
42
