Highest Voted Questions - Statistical Analysis Stack Exchange

80

votes

3 answers

What is the intuition behind SVD?

I have read about singular value decomposition (SVD). Almost all textbooks mention that it factorizes a matrix into three matrices with given specifications. But what is the intuition behind splitting the matrix in such a form? PCA and…

pca matrix intuition linear-algebra svd

asked Oct 15 '15 at 17:17

SHASHANK GUPTA

1,139
2
10
17

80

votes

6 answers

Is there any good reason to use PCA instead of EFA? Also, can PCA be a substitute for factor analysis?

In some disciplines, PCA (principal component analysis) is systematically used without any justification, and PCA and EFA (exploratory factor analysis) are considered as synonyms. I therefore recently used PCA to analyse the results of a scale…

pca factor-analysis exploratory-data-analysis

asked Nov 07 '14 at 10:56

Carine

809
2
7
4

80

votes

9 answers

Skills hard to find in machine learners?

It seems that data mining and machine learning have become so popular that now almost every CS student knows about classifiers, clustering, statistical NLP, etc. So it seems that finding data miners is not a hard thing nowadays. My question is: What…

machine-learning data-mining

asked Jun 24 '14 at 07:11

Jack Twain

7,781
14
48
74

79

votes

5 answers

What are good RMSE values?

Suppose I have some dataset and perform some regression on it. I have a separate test dataset. I test the regression on this set and find the RMSE on the test data. How should I conclude that my learning algorithm has done well, I mean what properties…

regression error

asked Apr 16 '13 at 21:03

Shishir Pandey

1,051
2
9
11

79

votes

8 answers

Is there a name for the phenomenon of false positives counterintuitively outstripping true positives?

It seems very counterintuitive to many people that a given diagnostic test with very high accuracy (say 99%) can generate massively more false positives than true positives in some situations, namely where the population of true positives is very…

probability terminology intuition

asked Oct 14 '19 at 11:29

Roger Heathcote

893
1
4
6
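
The effect described in the question above can be checked with a few lines of arithmetic. The following is a minimal sketch, not taken from the question itself: the function name `expected_positives`, the population size, the prevalence of 1 in 10,000, and the reading of "99% accuracy" as 99% sensitivity and 99% specificity are all illustrative assumptions.

```python
def expected_positives(population, prevalence, sensitivity, specificity):
    """Return the expected (true positive, false positive) counts.

    All four arguments are hypothetical inputs chosen for illustration.
    """
    diseased = population * prevalence          # people who have the condition
    healthy = population - diseased             # people who do not
    true_pos = diseased * sensitivity           # correctly flagged
    false_pos = healthy * (1.0 - specificity)   # wrongly flagged
    return true_pos, false_pos


# Assumed scenario: rare condition, test that is "99% accurate" both ways.
tp, fp = expected_positives(
    population=1_000_000,
    prevalence=1e-4,
    sensitivity=0.99,
    specificity=0.99,
)

# Share of positive results that are genuine (positive predictive value).
ppv = tp / (tp + fp)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"positive predictive value: {ppv:.4f}")
```

Under these assumed numbers the false positives outnumber the true positives by roughly a hundred to one, because the healthy group is so much larger than the diseased group.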

79

votes

2 answers

Bayes regression: how is it done in comparison to standard regression?

I got some questions about the Bayesian regression: Given a standard regression as $y = \beta_0 + \beta_1 x + \varepsilon$. If I want to change this into a Bayesian regression, do I need prior distributions both for $\beta_0$ and $\beta_1$ (or…

regression bayesian

asked Dec 20 '16 at 17:35

TinglTanglBob

878
1
8
13

79

votes

9 answers

What algorithm should I use to detect anomalies on time-series?

Background: I work in a Network Operations Center, where we monitor computer systems and their performance. One of the key metrics to monitor is the number of visitors/customers currently connected to our servers. To make it visible we (Ops team) collect…

machine-learning time-series python computational-statistics anomaly-detection

asked May 16 '15 at 21:10

Ilya Khadykin

891
1
7
6

79

votes

1 answer

How to interpret coefficients in a Poisson regression?

How can I interpret the main effects (coefficients for dummy-coded factor) in a Poisson regression? Assume the following example: treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2), …

r generalized-linear-model interpretation poisson-distribution regression-coefficients

asked May 21 '11 at 15:10

user734124

79

votes

7 answers

Rules of thumb for minimum sample size for multiple regression

Within the context of a research proposal in the social sciences, I was asked the following question: I have always gone by 100 + m (where m is the number of predictors) when determining minimum sample size for multiple regression. Is…

regression sample-size statistical-power rule-of-thumb

asked Apr 28 '11 at 06:40

Jeromy Anglim

42,044
23
146
250

79

votes

5 answers

How exactly did statisticians agree to using (n-1) as the unbiased estimator for population variance without simulation?

The formula for computing variance has $(n-1)$ in the denominator: $s^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}$ I've always wondered why. However, reading and watching a few good videos about "why" it is, it seems, $(n-1)$ is a good unbiased…

variance unbiased-estimator proof history

asked May 26 '14 at 00:09

PhD

13,429
19
45
47
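
The question above asks how the $(n-1)$ denominator was justified without simulation; a simulation can still illustrate what "unbiased" means here. The sketch below is only an illustration under stated assumptions: the function name `average_variance_estimates`, the sample size of 5, the number of trials, the seed, and the choice of a standard normal population (true variance 1) are all hypothetical choices, not part of the question.

```python
import random


def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average the n-denominator and (n-1)-denominator variance estimates.

    Samples are drawn from a standard normal population (true variance 1).
    """
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        # Sum of squared deviations from the sample mean.
        ss = sum((x - mean) ** 2 for x in sample)
        biased_total += ss / n          # divides by n
        unbiased_total += ss / (n - 1)  # divides by n - 1
    return biased_total / trials, unbiased_total / trials


biased_avg, unbiased_avg = average_variance_estimates()
print(f"average with n denominator:     {biased_avg:.3f}")
print(f"average with (n-1) denominator: {unbiased_avg:.3f}")
```

With $n = 5$, the estimator that divides by $n$ has expectation $\frac{n-1}{n}\sigma^2 = 0.8$, while the one that divides by $n-1$ has expectation $\sigma^2 = 1$, so the two averages should land near those values.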

78

votes

2 answers

Basic question about Fisher Information matrix and relationship to Hessian and standard errors

OK, this is quite a basic question, but I am a little bit confused. In my thesis I write: The standard errors can be found by calculating the inverse of the square root of the diagonal elements of the (observed) Fisher Information…

maximum-likelihood fisher-information

asked Aug 22 '13 at 15:16

Jen Bohold

1,410
2
13
19

78

votes

6 answers

What are good initial weights in a neural network?

I have just heard that it's a good idea to choose initial weights of a neural network from the range $\left(-\frac{1}{\sqrt{d}}, \frac{1}{\sqrt{d}}\right)$, where $d$ is the number of inputs to a given neuron. It is assumed that the sets are normalized - mean…

neural-networks normalization

asked Jan 12 '13 at 21:26

elmes

907
1
7
10
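
The range mentioned in the question above can be sketched in a few lines. This is a minimal illustration of the rule as the asker states it, not a recommendation or any library's initializer: the function name `init_weights`, the use of a uniform distribution over the range, and the example value `d=100` are assumptions made here.

```python
import math
import random


def init_weights(d, seed=None):
    """Draw d initial weights for one neuron with d inputs.

    Each weight is sampled uniformly between -1/sqrt(d) and 1/sqrt(d);
    the uniform distribution is an assumption, the question gives only the range.
    """
    rng = random.Random(seed)
    bound = 1.0 / math.sqrt(d)
    return [rng.uniform(-bound, bound) for _ in range(d)]


# Hypothetical neuron with 100 inputs, so the bound is 1/sqrt(100) = 0.1.
weights = init_weights(d=100, seed=0)
print(f"bound: {1.0 / math.sqrt(100):.2f}")
print(f"min weight: {min(weights):.4f}, max weight: {max(weights):.4f}")
```

The bound shrinks as the number of inputs grows, which keeps the weighted sum of many normalized inputs from becoming large at the start of training.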

78

votes

3 answers

Diagnostics for logistic regression?

For linear regression, we can check the diagnostic plots (residual plots, Normal QQ plots, etc.) to check if the assumptions of linear regression are violated. For logistic regression, I am having trouble finding resources that explain how to…

regression logistic diagnostic

asked Dec 03 '12 at 23:15

ialm

1,707
2
19
19

78

votes

12 answers

Famous statistical wins and horror stories for teaching purposes

I am designing a one-year program in data analysis with a local community college. The program aims to prepare students to handle basic tasks in data analysis, visualization and summarization, advanced Excel skills and R programming. I would like…

mathematical-statistics data-visualization experiment-design teaching

asked Nov 01 '19 at 13:07

Placidia

13,501
6
33
62

78

votes

1 answer

How does a simple logistic regression model achieve a 92% classification accuracy on MNIST?

Even though all the images in the MNIST dataset are centered, with a similar scale, and face up with no rotations, they show significant handwriting variation, and it puzzles me how a linear model achieves such a high classification accuracy. As far…

logistic image-processing

asked Sep 11 '19 at 22:54

Nitish Agarwal

813
4
6
