Highest Voted Questions - Statistical Analysis Stack Exchange

40

votes

6 answers

Testing for autocorrelation: Ljung-Box versus Breusch-Godfrey

I am used to seeing the Ljung-Box test used quite frequently for testing autocorrelation in raw data or in model residuals. I had nearly forgotten that there is another test for autocorrelation, namely, the Breusch-Godfrey test. Question: what are the main…

time-series hypothesis-testing autocorrelation

asked Apr 23 '15 at 19:24

Richard Hardy

54,375
10
95
219

40

votes

7 answers

Why shouldn't the denominator of the covariance estimator be n-2 rather than n-1?

The denominator of the (unbiased) variance estimator is $n-1$ as there are $n$ observations and only one parameter is being estimated. $$ \mathbb{V}\left(X\right)=\frac{\sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2}}{n-1} $$ By the same token I…
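As a rough illustration of the estimator quoted above, the following sketch computes the sample variance with the $n-1$ denominator and, for comparison, the usual sample covariance, which also divides by $n-1$. The function names `sample_variance` and `sample_covariance` are hypothetical, chosen only for this example.

```python
from typing import Sequence


def sample_variance(x: Sequence[float]) -> float:
    """Unbiased sample variance: squared deviations from the mean over n - 1."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(x) / n
    return sum((xi - mean_x) ** 2 for xi in x) / (n - 1)


def sample_covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Usual sample covariance: cross-products of deviations over n - 1."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
```

With this definition, `sample_covariance(x, x)` reduces to `sample_variance(x)`, which is one informal way to see why the two estimators share a denominator.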

self-study variance covariance descriptive-statistics unbiased-estimator

asked Mar 19 '15 at 13:13

MYaseen208

2,379
7
32
46

40

votes

3 answers

How is Naive Bayes a Linear Classifier?

I've seen the other thread here but I don't think the answer satisfied the actual question. What I have continually read is that Naive Bayes is a linear classifier (ex: here) (such that it draws a linear decision boundary) using the log odds…

classification naive-bayes

asked Mar 17 '15 at 22:52

Kevin Pei

749
2
9
13

40

votes

5 answers

What is the difference between errors and residuals?

While these two ubiquitous terms are often used synonymously, there sometimes seems to be a distinction. Is there indeed a difference, or are they exactly synonymous?

residuals error terminology

asked Jan 14 '15 at 15:27

Constantin

1,117
1
9
24

40

votes

3 answers

Clojure versus R: advantages and disadvantages for data analysis

I had a plan of learning R in the near future. Reading another question I found out about Clojure. Now I don't know what to do. I think a big advantage of R for me is that some people in Economics use it, including one of my supervisors (though the…

r

asked Jul 19 '10 at 21:26

Vivi

1,241
2
14
20

40

votes

3 answers

What are the measures of accuracy for multilabel data?

Consider a scenario where you are provided with a KnownLabel matrix and a PredictedLabel matrix. I would like to measure the goodness of the PredictedLabel matrix against the KnownLabel matrix. But the challenge here is that the KnownLabel matrix has few…

machine-learning data-mining multilabel

asked Jul 06 '11 at 05:05

Learner

4,007
11
37
39

40

votes

1 answer

Detecting Outliers in Time Series (LS/AO/TC) using tsoutliers package in R. How to represent outliers in equation format?

Comments: Firstly I would like to say a big thank you to the author of the new tsoutliers package which implements Chen and Liu's time series outlier detection which was published in the Journal of the American Statistical Association in 1993 in…

time-series forecasting arima outliers

asked Jun 26 '14 at 15:43

forecaster

7,349
9
43
81

40

votes

3 answers

Is Kolmogorov-Smirnov test valid with discrete distributions?

I'm comparing a sample and checking whether it follows some discrete distribution. However, I'm not entirely sure that Kolmogorov-Smirnov applies. Wikipedia seems to imply it does not. If it does not, how can I test the sample's…

hypothesis-testing discrete-data kolmogorov-smirnov-test

asked Jul 30 '10 at 17:00

Wilhelm

730
1
6
10

40

votes

2 answers

Understanding shape and calculation of confidence bands in linear regression

I am trying to understand the origin of the curved shape of confidence bands associated with an OLS linear regression and how it relates to the confidence intervals of the regression parameters (slope and intercept), for example (using…

regression confidence-interval

asked Jun 05 '14 at 16:18

David

401
1
5
5

40

votes

4 answers

Standard error clustering in R (either manually or in plm)

I am trying to understand standard error "clustering" and how to execute it in R (it is trivial in Stata). In R I have been unsuccessful using either plm or writing my own function. I'll use the diamonds data from the ggplot2 package. I can do fixed…

r panel-data standard-error fixed-effects-model clustered-standard-errors

asked Apr 27 '11 at 02:34

Richard Herron

1,161
2
13
20

39

votes

3 answers

Guideline to select the hyperparameters in Deep Learning

I'm looking for a paper that gives guidelines on how to choose the hyperparameters of a deep architecture, like stacked auto-encoders or deep belief networks. There are a lot of hyperparameters and I'm very confused about how to choose…

machine-learning deep-learning deep-belief-networks hyperparameter

asked Apr 28 '14 at 12:48

Jack Twain

7,781
14
48
74

39

votes

2 answers

How does Factor Analysis explain the covariance while PCA explains the variance?

Here is a quote from Bishop's "Pattern Recognition and Machine Learning" book, section 12.2.4 "Factor analysis": According to the highlighted part, factor analysis captures the covariance between variables in the matrix $W$. I wonder HOW? Here is…

pca factor-analysis geometry

asked Apr 24 '14 at 14:15

avocado

3,045
5
32
45

39

votes

3 answers

What are correct values for precision and recall when the denominators equal 0?

Precision is defined as: p = true positives / (true positives + false positives) What is the value of precision if (true positives + false positives) = 0? Is it just undefined? Same question for recall: r = true positives / (true positives +…
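To make the edge case concrete, here is a minimal sketch that computes both quantities and returns `None` when a denominator is zero. Returning `None` is only one possible convention, assumed here for illustration; it is not offered as the answer to the question. Because the excerpt is truncated, the recall formula uses the standard definition, true positives / (true positives + false negatives). The function names are hypothetical.

```python
from typing import Optional


def precision(tp: int, fp: int) -> Optional[float]:
    """tp / (tp + fp); None (assumed convention) when nothing was predicted positive."""
    denom = tp + fp
    return tp / denom if denom else None


def recall(tp: int, fn: int) -> Optional[float]:
    """tp / (tp + fn); None (assumed convention) when there are no actual positives."""
    denom = tp + fn
    return tp / denom if denom else None
```

A caller can then decide explicitly how to treat the undefined case, for example by skipping it when averaging over classes or by substituting a fixed value.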

precision-recall

asked Mar 08 '11 at 16:31

Raffi Khatchadourian

641
1
5
10

39

votes

4 answers

Justification of one-tailed hypothesis testing

I understand two-tailed hypothesis testing. You have $H_0 : \theta = \theta_0$ (vs. $H_1 = \neg H_0 : \theta \ne \theta_0$). The $p$-value is the probability that $\theta$ generates data at least as extreme as what was observed. I don't understand…

hypothesis-testing

asked Mar 03 '11 at 19:35

Yang

2,981
3
20
18

39

votes

9 answers

Why use vector error correction model?

I am confused about the Vector Error Correction Model (VECM). Technical background: VECM offers a possibility to apply Vector Autoregressive Model (VAR) to integrated multivariate time series. In the textbooks they name some problems in applying a…

time-series forecasting vector-autoregression cointegration vector-error-correction-model

asked Nov 27 '13 at 02:00

DatamineR

1,477
3
18
25
