Highest Voted Questions - Statistical Analysis Stack Exchange

60

votes

7 answers

Is chi-squared always a one-sided test?

A published article (pdf) contains these 2 sentences: Moreover, misreporting may be caused by the application of incorrect rules or by a lack of knowledge of the statistical test. For example, the total df in an ANOVA may be taken to be the error…

hypothesis-testing chi-squared-test

asked Feb 06 '12 at 14:48

Joel W.

3,096
3
31
45

60

votes

2 answers

What is the variance of the weighted mixture of two gaussians?

Say I have two normal distributions A and B with means $\mu_A$ and $\mu_B$ and variances $\sigma_A^2$ and $\sigma_B^2$. I want to take a weighted mixture of these two distributions using weights $p$ and $q$ where $0\le p \le 1$ and $q = 1-p$. I know…

normal-distribution mixture-distribution

asked Oct 06 '11 at 16:22

JoFrhwld

2,247
3
20
22
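
A small numerical check for the mixture question above may be useful. This is only a sketch: it takes the two variances as `var_a` and `var_b`, uses the standard law-of-total-variance identity $\mathrm{Var} = p\,\sigma_A^2 + q\,\sigma_B^2 + p\,q\,(\mu_A - \mu_B)^2$, and compares it with a simulation. The function names and the example numbers are illustrative and do not come from the question.

```python
import random


def mixture_variance(mu_a, var_a, mu_b, var_b, p):
    """Closed-form variance of the mixture p*N(mu_a, var_a) + (1-p)*N(mu_b, var_b)."""
    q = 1.0 - p
    # Weighted within-component variance plus the between-component spread.
    return p * var_a + q * var_b + p * q * (mu_a - mu_b) ** 2


def simulate_mixture_variance(mu_a, var_a, mu_b, var_b, p, n=200_000, seed=0):
    """Monte Carlo estimate of the same quantity (population-style variance)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Pick component A with probability p, otherwise component B.
        if rng.random() < p:
            draws.append(rng.gauss(mu_a, var_a ** 0.5))
        else:
            draws.append(rng.gauss(mu_b, var_b ** 0.5))
    mean = sum(draws) / n
    return sum((x - mean) ** 2 for x in draws) / n


if __name__ == "__main__":
    args = (0.0, 1.0, 4.0, 4.0, 0.25)  # hypothetical parameters
    print("closed form:", mixture_variance(*args))
    print("simulated:  ", simulate_mixture_variance(*args))
```

The between-component term $p\,q\,(\mu_A - \mu_B)^2$ is why the mixture variance is generally larger than the weighted average of the two variances.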

60

votes

13 answers

Does 10 heads in a row increase the chance of the next toss being a tail?

I assume the following is true: assuming a fair coin, getting 10 heads in a row whilst tossing a coin does not increase the chance of the next coin toss being a tail, no matter what amount of probability and/or statistical jargon is tossed around…

probability independence intuition games bernoulli-process

asked Feb 09 '15 at 08:15

user68492

601
1
6
3
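
The independence claim in the coin-toss question above can be illustrated with a short simulation. This is a sketch under the question's own assumption of a fair coin with independent tosses; the function name, the seed, and the number of tosses are arbitrary choices made for illustration.

```python
import random


def tail_rate_after_heads_run(n_tosses=1_000_000, run_length=10, seed=1):
    """Estimate P(tail | the previous `run_length` or more tosses were all heads).

    Returns (observed tail rate, number of tosses that followed such a run).
    """
    rng = random.Random(seed)
    streak = 0      # current number of consecutive heads
    followed = 0    # tosses made right after a run of at least run_length heads
    tails = 0       # how many of those tosses came up tails
    for _ in range(n_tosses):
        is_head = rng.random() < 0.5
        if streak >= run_length:
            followed += 1
            if not is_head:
                tails += 1
        streak = streak + 1 if is_head else 0
    return tails / followed, followed


if __name__ == "__main__":
    rate, count = tail_rate_after_heads_run()
    print(f"tail rate after 10+ heads: {rate:.3f} (from {count} cases)")
```

With a fair simulated coin the observed rate should sit near 0.5, up to sampling noise, which is what independence predicts.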

60

votes

3 answers

What is the objective function of PCA?

Principal component analysis can use matrix decomposition, but that is just a tool to get there. How would you find the principal components without the use of matrix algebra? What is the objective function (goal), and what are the constraints?

pca

asked May 02 '11 at 23:10

Neil McGuigan

9,292
13
54
62

59

votes

5 answers

What are disadvantages of state-space models and Kalman Filter for time-series modelling?

Given all the good properties of state-space models and the KF, I wonder: what are the disadvantages of state-space modelling and of using the Kalman filter (or EKF, UKF or particle filter) for estimation, compared with, let's say, conventional methodologies like ARIMA, VAR or…

time-series arima kalman-filter vector-autoregression

asked Dec 02 '13 at 10:53

Kochede

2,037
1
16
18

59

votes

6 answers

Is random forest a boosting algorithm?

Short definition of boosting: Can a set of weak learners create a single strong learner? A weak learner is defined to be a classifier which is only slightly correlated with the true classification (it can label examples better than random…

machine-learning random-forest boosting bagging

asked Nov 19 '13 at 16:34

Atilla Ozgur

1,251
1
11
17

59

votes

7 answers

How do I test that two continuous variables are independent?

Suppose I have a sample $(X_n,Y_n)$, $n=1,\dots,N$, from the joint distribution of $X$ and $Y$. How do I test the hypothesis that $X$ and $Y$ are independent? No assumption is made on the joint or marginal distribution laws of $X$ and $Y$ (least of all…

hypothesis-testing references independence

asked Oct 23 '13 at 23:54

sds

2,016
1
22
31

59

votes

2 answers

Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?

Computing power considerations aside, are there any reasons to believe that increasing the number of folds in cross-validation leads to better model selection/validation (i.e. that the higher the number of folds the better)? Taking the argument to…

cross-validation bias-variance-tradeoff

asked Jun 12 '13 at 13:24

Amelio Vazquez-Reina

17,546
26
74
110

59

votes

2 answers

What does having "constant variance" in a linear regression model mean?

What does having "constant variance" in the error term mean? As I see it, we have data with one dependent variable and one independent variable. Constant variance is one of the assumptions of linear regression. I am wondering what homoscedasticity…

regression heteroscedasticity

asked Mar 13 '13 at 12:51

Mukul

737
1
6
8

59

votes

2 answers

How can I change the title of a legend in ggplot2?

I have a plot I'm making in ggplot2 to summarize data that are from a 2 x 4 x 3 celled dataset. I have been able to make panels for the 2-leveled variable using facet_grid(. ~ Age) and to set the x and y axes using aes(x=4leveledVariable, y=DV). I…

r data-visualization ggplot2

asked Nov 29 '10 at 20:54

russellpierce

17,079
16
67
98

59

votes

4 answers

Confidence interval for Bernoulli sampling

I have a random sample of Bernoulli random variables $X_1, \dots, X_N$, where the $X_i$ are i.i.d. r.v. and $P(X_i = 1) = p$, and $p$ is an unknown parameter. Obviously, one can find an estimate for $p$: $\hat{p}:=(X_1+\dots+X_N)/N$. My question is how can…

confidence-interval binomial-distribution bernoulli-distribution

asked Nov 20 '10 at 12:05

Oleg
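
For the Bernoulli question above, the estimate $\hat{p}$ is straightforward to compute, and one common way to put an interval around it is the Wilson score interval. The sketch below is an assumption about what might be useful here, not the method the question or its answers settle on; the function names and the sample data are hypothetical.

```python
import math


def p_hat(xs):
    """Sample proportion: (X_1 + ... + X_N) / N for 0/1 observations."""
    return sum(xs) / len(xs)


def wilson_interval(xs, z=1.96):
    """Wilson score interval for p; z=1.96 gives roughly 95% coverage."""
    n = len(xs)
    p = p_hat(xs)
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    sample = [1] * 30 + [0] * 70  # hypothetical data: 30 successes in 100 trials
    lo, hi = wilson_interval(sample)
    print(f"p_hat = {p_hat(sample):.2f}, interval = ({lo:.3f}, {hi:.3f})")
```

Unlike the simple normal-approximation interval $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/N}$, the Wilson interval stays inside $[0, 1]$ and behaves sensibly when $\hat{p}$ is close to 0 or 1.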

59

votes

4 answers

Choosing between LM and GLM for a log-transformed response variable

I'm trying to understand the philosophy behind using a Generalized Linear Model (GLM) vs a Linear Model (LM). I've created an example data set below where: $$\log(y) = x + \varepsilon $$ The example does not have the error $\varepsilon$ as a…

r generalized-linear-model linear-model gamma-distribution link-function

asked Nov 19 '12 at 13:28

Marc in the box

3,532
3
33
47

59

votes

7 answers

Intuitive explanation of the bias-variance tradeoff?

I am looking for an intuitive explanation of the bias-variance tradeoff, both in general and specifically in the context of linear regression.

regression variance bias intuition bias-variance-tradeoff

asked Nov 07 '10 at 10:57

NPE

5,351
5
33
44

59

votes

4 answers

Are all values within a 95% confidence interval equally likely?

I have found discordant information on the question: "If one constructs a 95% confidence interval (CI) of a difference in means or a difference in proportions, are all values within the CI equally likely? Or, is the point estimate the most likely,…

confidence-interval

asked Oct 19 '12 at 18:32

pmgjones

5,543
8
36
36

59

votes

5 answers

Is it true that the percentile bootstrap should never be used?

In the MIT OpenCourseWare notes for 18.05 Introduction to Probability and Statistics, Spring 2014 (currently available here), it states: The bootstrap percentile method is appealing due to its simplicity. However it depends on the bootstrap…

confidence-interval bootstrap

asked Jul 12 '18 at 11:58

Clarinetist

3,761
3
25
70
