Most Popular

1500 questions
53
votes
3 answers

How do I find peaks in a dataset?

If I have a data set that produces a graph such as the following, how would I algorithmically determine the x-values of the peaks shown (in this case three of them):
nonaxiomatic
  • 531
  • 1
  • 5
  • 4
53
votes
6 answers

What are the main theorems in Machine (Deep) Learning?

Ali Rahimi recently gave a very provocative talk at NIPS 2017 comparing current Machine Learning to Alchemy. One of his claims is that we need to get back to theoretical developments, to have simple theorems proving foundational results. When…
user188529
53
votes
6 answers

What book is recommendable to start learning statistics using R at the same time?

Books to Learn Statistics using R. What exactly is the book I'm looking for? I am looking for a book that teaches statistics while using R for hands-on experience, so that you end up learning R at the same time. I've seen on Amazon…
Oeufcoque Penteano
  • 756
  • 1
  • 12
  • 23
53
votes
10 answers

What is the difference between prediction and inference?

I'm reading through "An Introduction to Statistical Learning". In chapter 2, they discuss the reason for estimating a function $f$. 2.1.1 Why Estimate $f$? There are two main reasons we may wish to estimate $f$: prediction and inference. We discuss…
user1592380
  • 631
  • 1
  • 6
  • 4
53
votes
2 answers

How to read Cook's distance plots?

Does anyone know how to work out whether points 7, 16 and 29 are influential points or not? I read somewhere that because Cook's distance is lower than 1, they are not. Am I right?
Platypezid
  • 1,197
  • 3
  • 13
  • 16
53
votes
16 answers

Most confusing statistical terms

We statisticians use many words in ways that are slightly different from the way everyone else uses them. This causes lots of problems when we teach or explain what we are doing. I'll start a list (and now I'll add some definitions, per…
Peter Flom
  • 94,055
  • 35
  • 143
  • 276
53
votes
3 answers

Is there any difference between lm and glm for the gaussian family of glm?

Specifically, I want to know if there is a difference between lm(y ~ x1 + x2) and glm(y ~ x1 + x2, family=gaussian). I think that this particular case of glm is equal to lm. Am I wrong?
user3682457
  • 653
  • 1
  • 6
  • 6
53
votes
9 answers

Are all models useless? Is any exact model possible -- or useful?

This question has been festering in my mind for over a month. The February 2015 issue of Amstat News contains an article by Berkeley Professor Mark van der Laan that scolds people for using inexact models. He states that by using models, statistics…
53
votes
2 answers

Intuition behind why Stein's paradox only applies in dimensions $\ge 3$

Stein's Example shows that the maximum likelihood estimate of $n$ normally distributed variables with means $\mu_1,\ldots,\mu_n$ and variances $1$ is inadmissible (under a square loss function) iff $n\ge 3$. For a neat proof, see the first chapter…
53
votes
4 answers

Class imbalance in Supervised Machine Learning

This is a general question, not specific to any method or data set. How do we deal with a class imbalance problem in supervised machine learning where around 90% of the labels in the dataset are 0 and around 10% are 1? How do we optimally…
NG_21
  • 1,436
  • 3
  • 17
  • 25
53
votes
3 answers

Why do we care so much about normally distributed error terms (and homoskedasticity) in linear regression when we don't have to?

I suppose I get frustrated every time I hear someone say that non-normality of residuals and/or heteroskedasticity violates OLS assumptions. To estimate parameters in an OLS model, neither of these assumptions is necessary by the Gauss-Markov…
53
votes
3 answers

Data APIs/feeds available as packages in R

EDIT: The Web Technologies and Services CRAN task view contains a much more comprehensive list of data sources and APIs available in R. You can submit a pull request on GitHub if you wish to add a package to the task view. I'm making a list of the…
Zach
  • 22,308
  • 18
  • 114
  • 158
53
votes
6 answers

Why downsample?

Suppose I want to learn a classifier that predicts if an email is spam. And suppose only 1% of emails are spam. The easiest thing to do would be to learn the trivial classifier that says none of the emails are spam. This classifier would give us…
Jessica
  • 1,781
  • 2
  • 15
  • 17
53
votes
5 answers

Interpreting QQplot - Is there any rule of thumb to decide for non-normality?

I have read enough threads on QQplots here to understand that a QQplot can be more informative than other normality tests. However, I am inexperienced with interpreting QQplots. I googled a lot; I found a lot of graphs of non-normal QQplots, but no…
greymatter0
  • 743
  • 1
  • 6
  • 11
53
votes
4 answers

What is perplexity?

I came across the term perplexity, which refers to the log-averaged inverse probability on unseen data. The Wikipedia article on perplexity does not give an intuitive meaning for it. This perplexity measure was used in the pLSA paper. Can anyone explain…
Learner
  • 4,007
  • 11
  • 37
  • 39
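
For reference, the "log-averaged inverse probability" mentioned in the perplexity excerpt above is usually written as follows. This is the standard textbook form, not a formula taken from the question; the symbols $N$ (number of held-out items), $w_i$ (the $i$-th held-out item), and $p$ (the model's predictive distribution) are notation assumed here for illustration.

$$
\mathrm{PP} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(w_i)\right) = \left(\prod_{i=1}^{N}\frac{1}{p(w_i)}\right)^{1/N}
$$

As a sanity check under these assumptions, a model that assigns probability $1/k$ to every held-out item has $\mathrm{PP} = k$, so perplexity can be read as the effective number of equally likely choices the model is hedging between.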