Highest Voted Questions - Statistical Analysis Stack Exchange

38

votes

2 answers

Why use stratified cross validation? Why does this not damage variance related benefit?

I've been told that it is beneficial to use stratified cross-validation, especially when response classes are unbalanced. If one purpose of cross-validation is to help account for the randomness of our original training data sample, surely making each…

cross-validation resampling stratification

asked Oct 02 '14 at 16:45

James Owers

627
1
5
11
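To make the term in the question above concrete, here is a minimal sketch of how stratified fold assignment can work. It is an illustration under stated assumptions, not any particular library's implementation: the function name `stratified_folds` and the toy labels are hypothetical, and indices are dealt round-robin within each class without shuffling.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, class by class.

    Hypothetical helper for illustration: indices of each class are dealt
    round-robin across the folds, so every fold receives roughly the same
    share of each class.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        for position, idx in enumerate(by_class[label]):
            folds[position % k].append(idx)
    return folds


# Toy unbalanced data (an assumption): 8 samples of class 0, 4 of class 1.
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, k=4)

# Number of minority-class samples that landed in each fold.
minority_counts = [sum(labels[i] for i in fold) for fold in folds]
print(folds)
print(minority_counts)
```

With plain random splitting, a fold could end up with no minority-class samples at all; dealing within each class rules that out. A real analysis would normally shuffle within each class first, and fold sizes can differ slightly when class sizes are not multiples of k.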

38

votes

7 answers

Is there a good browser/viewer to see an R dataset (.rda file)?

I want to browse a .rda file (R dataset). I know about the View(datasetname) command. The default R.app that comes for Mac does not have a very good browser for data (it opens a window in X11). I like the RStudio data browser that opens with the…

r

asked Jun 04 '11 at 04:45

Curious2learn

695
2
6
8

38

votes

3 answers

How does R handle missing values in lm?

I'd like to regress a vector B against each of the columns in a matrix A. This is trivial if there are no missing data, but if matrix A contains missing values, then my regression against A is constrained to include only rows where all values are…

r missing-data linear-model

asked May 19 '11 at 21:03

David Quigley

483
1
4
7

38

votes

2 answers

How do I know which method of cross validation is best?

I am trying to figure out which cross validation method is best for my situation. The following data are just an example for working through the issue (in R), but my real X data (xmat) are correlated with each other and correlated to different…

r regression cross-validation linear-model

asked Jun 15 '14 at 15:25

rdorlearn

3,493
6
26
29

38

votes

1 answer

Why does glmer not achieve the maximum likelihood (as verified by applying further generic optimization)?

Numerically deriving the MLEs of a GLMM is difficult and, in practice, I know we should not use brute-force optimization (e.g., using optim in a simple way). But for my own educational purposes, I want to try it to make sure I correctly understand the…

r maximum-likelihood optimization lme4-nlme

asked May 26 '14 at 02:20

quibble

1,167
10
17

37

votes

4 answers

Functions of Independent Random Variables

Is the claim that functions of independent random variables are themselves independent true? I have seen that result often used implicitly in some proofs, for example in the proof of independence between the sample mean and the sample variance of…

probability self-study random-variable independence

asked Apr 23 '14 at 14:39

JohnK

18,298
10
60
103

37

votes

8 answers

Help me calculate how many people will come to my wedding! Can I attribute a percentage to each person and add them?

I am planning my wedding. I wish to estimate how many people will come to my wedding. I have created a list of people and the percentage chance that each will attend. For example: Dad 100%, Mom 100%, Bob 50%, Marc 10%, Jacob 25%, Joseph 30%. I…

probability

asked Apr 13 '14 at 04:55

Behacad

4,916
8
30
48
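The arithmetic asked about in the wedding question above can be sketched as follows. By linearity of expectation, adding the per-guest probabilities gives the expected number of attendees. The spread computed here rests on an added assumption that guests decide independently, which the question does not state and which may well fail for couples or families. Only the six guests visible in the excerpt are used, and the variable names are illustrative.

```python
import math

# Attendance probabilities taken from the six guests listed in the excerpt.
attendance_probability = {
    "Dad": 1.00,
    "Mom": 1.00,
    "Bob": 0.50,
    "Marc": 0.10,
    "Jacob": 0.25,
    "Joseph": 0.30,
}

# Expected head count: the sum of the probabilities.
# This holds whether or not the guests decide independently.
expected_guests = sum(attendance_probability.values())

# ASSUMPTION: independent decisions. Under that assumption the variance of
# the head count is the sum of p * (1 - p) over all guests.
variance = sum(p * (1 - p) for p in attendance_probability.values())
std_dev = math.sqrt(variance)

print(f"expected guests: {expected_guests:.2f}")
print(f"standard deviation (if independent): {std_dev:.2f}")
```

So adding the percentages is a sound way to get the average turnout, while the standard deviation gives a rough sense of how far the actual count could stray from it.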

37

votes

5 answers

How to resolve Simpson's paradox?

Simpson's paradox is a classic puzzle discussed in introductory statistics courses worldwide. However, my course was content to simply note that a problem existed and did not provide a solution. I would like to know how to resolve the paradox. That…

simpsons-paradox

asked Dec 02 '13 at 01:39

Potato

1,025
1
11
12

37

votes

4 answers

What is the difference between McNemar's test and the chi-squared test, and how do you know when to use each?

I have tried reading up on different sources, but I am still not clear which test would be appropriate in my case. There are three different questions I am asking about my dataset: The subjects are tested for infections from X at different…

r chi-squared-test mcnemar-test

asked Nov 18 '13 at 13:37

Anto

693
1
8
13

37

votes

2 answers

Error "system is computationally singular" when running a glm

I'm using the robustbase package to run a glm estimation. However, when I do, I get the following error: Error in solve.default(crossprod(X, DiagB * X)/nobs, EEq) : system is computationally singular: reciprocal condition number =…

r generalized-linear-model robust

asked Nov 13 '13 at 18:11

NK1

543
1
5
6

37

votes

4 answers

Area under curve of ROC vs. overall accuracy

I am a little bit confused about the Area Under Curve (AUC) of ROC and the overall accuracy. Will the AUC be proportional to the overall accuracy? In other words, when we have a larger overall accuracy, will we definitely get a larger AUC? Or are…

classification roc

asked Sep 01 '13 at 10:21

Samo Jerom

1,439
2
19
31
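A small numeric illustration of the AUC-versus-accuracy question above, offered as a sketch rather than a full answer. The labels, scores, and the 0.5 threshold are made-up assumptions. AUC is computed from its rank interpretation: the fraction of positive/negative pairs in which the positive case gets the higher score, with ties counting one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Share of correct predictions when scores >= threshold are called positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical data: two classifiers scoring the same four cases.
labels = [0, 0, 1, 1]
scores_a = [0.10, 0.40, 0.35, 0.80]  # one positive ranked below a negative
scores_b = [0.10, 0.30, 0.35, 0.80]  # every positive ranked above every negative

acc_a, acc_b = accuracy(labels, scores_a), accuracy(labels, scores_b)
auc_a, auc_b = auc(labels, scores_a), auc(labels, scores_b)
print(acc_a, acc_b)
print(auc_a, auc_b)
```

Both classifiers have the same accuracy at the 0.5 threshold, yet their AUCs differ. In this toy case, at least, accuracy does not determine AUC: accuracy depends on one cutoff, while AUC depends on the ranking of all scores.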

37

votes

8 answers

What is Bayes' theorem all about?

What are the main ideas, that is, concepts related to Bayes' theorem? I am not asking for any derivations of complex mathematical notation.

probability bayesian mathematical-statistics

asked Jul 26 '10 at 20:30

user333

6,621
17
44
54

37

votes

2 answers

Model selection and cross-validation: The right way

There are numerous threads on Cross Validated on the topic of model selection and cross-validation. Here are a few: Internal vs external cross-validation and model selection; @DikranMarsupial's top answer to Feature selection and…

cross-validation model-selection

asked Jul 21 '13 at 01:08

Amelio Vazquez-Reina

17,546
26
74
110

37

votes

3 answers

Difference between a SVM and a perceptron

I am a bit confused about the difference between an SVM and a perceptron. Let me try to summarize my understanding here, and please feel free to correct me where I am wrong and fill in what I have missed. The Perceptron does not try to optimize the…

machine-learning svm kernel-trick

asked Jun 07 '13 at 19:15

CuriousMind

2,133
5
24
32

37

votes

3 answers

How do I interpret the 'correlations of fixed effects' in my glmer output?

I have the following output:
Generalized linear mixed model fit by the Laplace approximation
Formula: aph.remain ~ sMFS2 + sAG2 + sSHDI2 + sbare + season + crop + (1|landscape)
AIC BIC logLik deviance
4062 4093 -2022 4044
Random…

mixed-model poisson-distribution lme4-nlme

asked Apr 25 '13 at 15:46

susie

641
2
8
9
