Most Popular

1500 questions
33
votes
1 answer

Negative binomial regression question - is it a poor model?

I am reading a very interesting article by Sellers and Shmueli on regression models for count data. Near the beginning (p. 944) they cite McCullagh and Nelder (1989) saying that negative binomial regression is unpopular and has a problematic…
Peter Flom
  • 94,055
  • 35
  • 143
  • 276
33
votes
7 answers

How do you convey the beauty of the Central Limit Theorem to a non-statistician?

My father is a math enthusiast, but not interested in statistics much. It would be neat to try to illustrate some of the wonderful bits of statistics, and the CLT is a prime candidate. How would you convey the mathematical beauty and impact of the…
Vince
  • 775
  • 6
  • 7
33
votes
1 answer

Comparing hierarchical clustering dendrograms obtained by different distances & methods

[The initial title "Measurement of similarity for hierarchical clustering trees" was later changed by @ttnphns to better reflect the topic] I am performing a number of hierarchical cluster analyses on a dataframe of patient records (e.g. similar to…
Wouter
  • 2,102
  • 3
  • 17
  • 26
33
votes
1 answer

How to understand SARIMAX intuitively?

I'm trying to understand a paper about electric load forecasting but I'm struggling with the concepts inside, especially the SARIMAX model. This model is used to predict the load and uses many statistical concepts that I do not understand (I'm an…
Clash
  • 733
  • 2
  • 7
  • 9
33
votes
5 answers

Why should the frequency of heads in a coin toss converge to anything at all?

Suppose we have any kind of coin. Why should the relative frequency of getting heads converge to any value at all? One answer is that this is simply what we empirically observe, and I think this is a valid answer. However, my…
Maximal Ideal
  • 433
  • 2
  • 6
33
votes
6 answers

Inclusion of lagged dependent variable in regression

I'm very confused about whether it's legitimate to include a lagged dependent variable in a regression model. Basically I think if this model focuses on the relationship between the change in Y and other independent variables, then adding a lagged…
user22109
  • 351
  • 1
  • 3
  • 3
33
votes
4 answers

How is finding the centroid different from finding the mean?

When performing hierarchical clustering, one can use many metrics to measure the distance between clusters. Two such metrics imply calculation of the centroids and means of data points in the clusters. What is the difference between the mean and the…
John Hoffman
  • 483
  • 1
  • 4
  • 6
33
votes
7 answers

Why is using squared error the standard when absolute error is more relevant to most problems?

I recognize that parts of this topic have been discussed on this forum. Some examples: Is minimizing squared error equivalent to minimizing absolute error? Why squared error is more popular than the latter? Why square the difference instead of…
Ryan Volpi
  • 1,638
  • 8
  • 17
33
votes
3 answers

Why are Gaussian process models called non-parametric?

I am a bit confused. Why are Gaussian processes called non-parametric models? They do assume that the functional values, or a subset of them, have a Gaussian prior with mean 0 and covariance function given as the kernel function. These kernel…
user34790
  • 6,049
  • 6
  • 42
  • 64
33
votes
6 answers

Line graph has too many lines, is there a better solution?

I'm trying to graph the number of actions by users (in this case, "likes") over time. So I have "Number of actions" as my y-axis, my x-axis is time (weeks), and each line represents one user. My problem is that I want to look at this data for a set…
regulatethis
  • 433
  • 1
  • 4
  • 6
33
votes
5 answers

Introduction to causal analysis

What are good books that introduce causal analysis? I'm thinking of an introduction that both explains the principles of causal analysis and shows how different statistical methods could be used to apply these principles.
Jack Tanner
  • 4,552
  • 3
  • 27
  • 39
33
votes
5 answers

Extrapolation v. Interpolation

What is the difference between extrapolation and interpolation, and what is the most precise way of using these terms? For example, I have seen a statement in a paper using interpolation as: "The procedure interpolates the shape of the estimated…
Frank Swanton
  • 543
  • 4
  • 9
33
votes
6 answers

When are confidence intervals useful?

If I understand correctly a confidence interval of a parameter is an interval constructed by a method which yields intervals containing the true value for a specified proportion of samples. So the 'confidence' is about the method rather than the…
33
votes
1 answer

Why is PCA sensitive to outliers?

There are many posts on this SE that discuss robust approaches to principal component analysis (PCA), but I cannot find a single good explanation of why PCA is sensitive to outliers in the first place.
Psi
  • 462
  • 4
  • 9
33
votes
1 answer

Dimensionality reduction (SVD or PCA) on a large, sparse matrix

/edit: Further follow-up: you can now use irlba::prcomp_irlba /edit: following up on my own post. irlba now has "center" and "scale" arguments, which let you use it to calculate principal components, e.g.: pc <- M %*% irlba(M, nv=5, nu=0,…
Zach
  • 22,308
  • 18
  • 114
  • 158