Highest Voted Questions - Statistical Analysis Stack Exchange

270

votes

6 answers

What is batch size in neural network?

I'm using the Python Keras package for neural networks. This is the link. Is batch_size equal to the number of test samples? From Wikipedia we have this information: However, in other cases, evaluating the sum-gradient may require expensive evaluations…

neural-networks python terminology keras

asked May 22 '15 at 09:15

user2991243

3,621
4
22
48

269

votes

6 answers

Is $R^2$ useful or dangerous?

I was skimming through some lecture notes by Cosma Shalizi (in particular, section 2.1.1 of the second lecture), and was reminded that you can get very low $R^2$ even when you have a completely linear model. To paraphrase Shalizi's example: suppose…

regression r-squared

asked Jul 20 '11 at 20:32

raegtin

9,090
12
48
53

265

votes

13 answers

Is there any reason to prefer the AIC or BIC over the other?

The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a preference based on the stringency of the criteria,…

modeling aic cross-validation bic model-selection

asked Jul 23 '10 at 20:49

russellpierce

17,079
16
67
98

262

votes

10 answers

How would you explain covariance to someone who understands only the mean?

...assuming that I'm able to augment their knowledge about variance in an intuitive fashion (Understanding "variance" intuitively) or by saying: It's the average distance of the data values from the 'mean' - and since variance is in square units,…

variance covariance intuition teaching

asked Nov 07 '11 at 19:41

PhD

13,429
19
45
47

261

votes

11 answers

How would you explain Markov Chain Monte Carlo (MCMC) to a layperson?

Maybe the concept, why it's used, and an example.

bayesian markov-chain-montecarlo intuition teaching

asked Jul 19 '10 at 23:21

Neil McGuigan

9,292
13
54
62

254

votes

3 answers

How to know that your machine learning problem is hopeless?

Imagine a standard machine-learning scenario: You are confronted with a large multivariate dataset and you have a pretty blurry understanding of it. What you need to do is to make predictions about some variable based on what you have. As…

machine-learning forecasting modeling model-selection forecastability

asked Jul 05 '16 at 08:22

Tim

108,699
20
212
390

252

votes

15 answers

What are the differences between Factor Analysis and Principal Component Analysis?

It seems that a number of the statistical packages that I use wrap these two concepts together. However, I'm wondering if there are different assumptions or data 'formalities' that must be true to use one over the other. A real example would be…

pca factor-analysis

asked Aug 12 '10 at 03:46

Brandon Bertelsen

6,672
9
35
46

247

votes

46 answers

What are common statistical sins?

I'm a grad student in psychology, and as I pursue more and more independent studies in statistics, I am increasingly amazed by the inadequacy of my formal training. Both personal and second-hand experience suggest that the paucity of statistical…

fallacy

asked Nov 15 '10 at 18:46

Mike Lawrence

12,691
8
40
65

242

votes

7 answers

How to choose a predictive model after k-fold cross-validation?

I am wondering how to choose a predictive model after doing K-fold cross-validation. This may be awkwardly phrased, so let me explain in more detail: whenever I run K-fold cross-validation, I use K subsets of the training data, and end up with K…

cross-validation model-selection

asked Mar 15 '13 at 02:21

Berk U.

4,265
5
21
42

231

votes

38 answers

What is the best introductory Bayesian statistics textbook?

Which is the best introductory textbook for Bayesian statistics? One book per answer, please.

bayesian references

asked Jul 19 '10 at 21:18

Shane

11,961
17
71
89

228

votes

8 answers

Algorithms for automatic model selection

I would like to implement an algorithm for automatic model selection. I am thinking of doing stepwise regression but anything will do (it has to be based on linear regressions though). My problem is that I am unable to find a methodology, or an…

feature-selection model-selection aic stepwise-regression faq

asked Jan 09 '12 at 18:22

S4M

2,432
3
13
6

226

votes

4 answers

ROC vs precision-and-recall curves

I understand the formal differences between them; what I want to know is when it is more relevant to use one vs. the other. Do they always provide complementary insight about the performance of a given classification/detection system? When is it…

machine-learning roc precision-recall

asked Feb 14 '11 at 17:10

Amelio Vazquez-Reina

17,546
26
74
110

222

votes

4 answers

When (and why) should you take the log of a distribution (of numbers)?

Say I have some historical data e.g., past stock prices, airline ticket price fluctuations, past financial data of the company... Now someone (or some formula) comes along and says "let's take/use the log of the distribution" and here's where I go…

distributions data-transformation logarithm

asked Nov 23 '11 at 20:41

PhD

13,429
19
45
47

219

votes

13 answers

What is the difference between data mining, statistics, machine learning and AI?

What is the difference between data mining, statistics, machine learning and AI? Would it be accurate to say that they are 4 fields attempting to solve very similar problems but with different approaches? What exactly do they have in common and…

machine-learning data-mining

asked Nov 30 '10 at 11:26

Olivier Lalonde

121
3
3
5

219

votes

9 answers

Why is Newton's method not widely used in machine learning?

This is something that has been bugging me for a while, and I couldn't find any satisfactory answers online, so here goes: After reviewing a set of lectures on convex optimization, Newton's method seems to be a far superior algorithm than gradient…

machine-learning optimization gradient-descent hessian

asked Dec 29 '16 at 01:00

Fei Yang

2,181
3
8
4
