Highest Voted Questions - Statistical Analysis Stack Exchange

53

votes

3 answers

How do I find peaks in a dataset?

If I have a data set that produces a graph such as the following, how would I algorithmically determine the x-values of the peaks shown (in this case three of them):

data-visualization mode

asked Sep 14 '12 at 15:35

nonaxiomatic

531
1
5
4

53

votes

6 answers

What are the main theorems in Machine (Deep) Learning?

Ali Rahimi recently gave a very provocative talk at NIPS 2017 comparing current Machine Learning to Alchemy. One of his claims is that we need to get back to theoretical developments, to have simple theorems proving foundational results. When…

machine-learning deep-learning mathematical-statistics

asked Jan 06 '18 at 15:37

user188529

53

votes

6 answers

What book is recommendable to start learning statistics using R at the same time?

Books to Learn Statistics using R. What exactly is the book I'm looking for? I am looking for a book that teaches you statistics while using R to give you hands-on experience, so that you end up learning R at the same time. I've seen on Amazon…

r references

asked Apr 01 '12 at 05:08

Oeufcoque Penteano

756
1
12
23

53

votes

10 answers

What is the difference between prediction and inference?

I'm reading through "An Introduction to Statistical Learning". In chapter 2, they discuss the reason for estimating a function $f$. 2.1.1 Why Estimate $f$? There are two main reasons we may wish to estimate $f$: prediction and inference. We discuss…

prediction terminology causality

asked Nov 03 '16 at 14:47

user1592380

631
1
6
4

53

votes

2 answers

How to read Cook's distance plots?

Does anyone know how to work out whether points 7, 16 and 29 are influential points or not? I read somewhere that because Cook's distance is lower than 1, they are not. Am I right?

r regression residuals diagnostic cooks-distance

asked Feb 02 '12 at 12:02

Platypezid

1,197
3
13
16

53

votes

16 answers

Most confusing statistical terms

We statisticians use many words in ways that are slightly different from the way everyone else uses them. This causes lots of problems when we teach or explain what we are doing. I'll start a list (and now I'll add some definitions, per…

terminology communication

asked Jan 12 '12 at 12:35

Peter Flom

94,055
35
143
276

53

votes

3 answers

Is there any difference between lm and glm for the gaussian family of glm?

Specifically, I want to know if there is a difference between lm(y ~ x1 + x2) and glm(y ~ x1 + x2, family=gaussian). I think that this particular case of glm is equal to lm. Am I wrong?

r normal-distribution generalized-linear-model lm

asked Nov 10 '15 at 18:37

user3682457

653
1
6
6

53

votes

9 answers

Are all models useless? Is any exact model possible -- or useful?

This question has been festering in my mind for over a month. The February 2015 issue of Amstat News contains an article by Berkeley Professor Mark van der Laan that scolds people for using inexact models. He states that by using models, statistics…

machine-learning maximum-likelihood modeling nonparametric targeted-maximum-likelihood

asked Apr 02 '15 at 00:59

Russ Lenth

15,161
20
53

53

votes

2 answers

Intuition behind why Stein's paradox only applies in dimensions $\ge 3$

Stein's Example shows that the maximum likelihood estimate of $n$ normally distributed variables with means $\mu_1,\ldots,\mu_n$ and variances $1$ is inadmissible (under a square loss function) iff $n\ge 3$. For a neat proof, see the first chapter…

maximum-likelihood unbiased-estimator intuition steins-phenomenon

asked Jul 26 '11 at 08:54

Har

1,494
11
15

53

votes

4 answers

Class imbalance in Supervised Machine Learning

This is a general question, not specific to any method or data set. How do we deal with a class imbalance problem in supervised machine learning where around 90% of the labels in the dataset are 0 and around 10% are 1? How do we optimally…

machine-learning unbalanced-classes supervised-learning

asked Jan 05 '15 at 12:14

NG_21

1,436
3
17
25

53

votes

3 answers

Why do we care so much about normally distributed error terms (and homoskedasticity) in linear regression when we don't have to?

I suppose I get frustrated every time I hear someone say that non-normality of residuals and/or heteroskedasticity violates OLS assumptions. To estimate parameters in an OLS model neither of these assumptions is necessary by the Gauss-Markov…

regression assumptions normality-assumption robust teaching

asked Dec 30 '14 at 22:22

Zachary Blumenfeld

3,826
1
14
21

53

votes

3 answers

Data APIs/feeds available as packages in R

EDIT: The Web Technologies and Services CRAN task view contains a much more comprehensive list of data sources and APIs available in R. You can submit a pull request on GitHub if you wish to add a package to the task view. I'm making a list of the…

r references dataset

asked Jul 05 '11 at 14:31

Zach

22,308
18
114
158

53

votes

6 answers

Why downsample?

Suppose I want to learn a classifier that predicts if an email is spam. And suppose only 1% of emails are spam. The easiest thing to do would be to learn the trivial classifier that says none of the emails are spam. This classifier would give us…

machine-learning classification

asked Nov 02 '14 at 19:25

Jessica

1,781
2
15
17

53

votes

5 answers

Interpreting QQplot - Is there any rule of thumb to decide for non-normality?

I have read enough threads on QQplots here to understand that a QQplot can be more informative than other normality tests. However, I am inexperienced with interpreting QQplots. I googled a lot; I found a lot of graphs of non-normal QQplots, but no…

interpretation normality-assumption qq-plot

asked Aug 07 '14 at 08:41

greymatter0

743
1
6
11

53

votes

4 answers

What is perplexity?

I came across the term perplexity, which refers to the log-averaged inverse probability on unseen data. The Wikipedia article on perplexity does not give an intuitive meaning for it. This perplexity measure was used in the pLSA paper. Can anyone explain…

intuition information-theory measurement perplexity

asked May 04 '11 at 06:04

Learner

4,007
11
37
39
