Most Popular
1500 questions
42 votes, 4 answers
Is LSTM (Long Short-Term Memory) dead?
From my own experience, LSTM takes a long time to train and does not improve performance significantly in many real-world tasks.
To make the question more specific, I want to ask when LSTM will work better than other deep NNs (maybe with real-world…

Haitao Du
42 votes, 15 answers
The Monty Hall Problem - where does our intuition fail us?
From Wikipedia:
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another…

Rizwan Kassim
42 votes, 2 answers
Simulation of logistic regression power analysis - designed experiments
This question is in response to an answer given by @Greg Snow to a question I asked about power analysis with logistic regression and SAS Proc GLMPOWER.
If I am designing an experiment and will analyze the results in a factorial…

B_Miner
42 votes, 7 answers
Are there any examples of where the central limit theorem does not hold?
Wikipedia says -
In probability theory, the central limit theorem (CLT) establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a "bell curve")…

Ryan McCauley
42 votes, 5 answers
Time series 'clustering' in R
I have a set of time series data. Each series covers the same period, although the actual dates in each time series may not all 'line up' exactly.
That is to say, if the time series were to be read into a 2D matrix, it would look something like…

morpheous
42 votes, 4 answers
Entropy of an image
What is the most theoretically correct way, in the information-theoretic or physical sense, to compute the entropy of an image? I don't care about computational efficiency right now; I want it to be as theoretically correct as possible.
Let's start with a grayscale image. One…

Davor Josipovic
42 votes, 4 answers
Evaluation measures of goodness or validity of clustering (without having truth labels)
I'm clustering a data set, but I don't have ground-truth labels that would let me evaluate the clustering result (my data are unlabelled), so I cannot use an external evaluation measure. In this case, are there any efficient evaluation measures -…

shn
42 votes, 3 answers
Help me understand the quantile (inverse CDF) function
I am reading about the quantile function, but it is not clear to me. Could you provide a more intuitive explanation than the one provided below?
Since the cdf $F$ is a monotonically increasing function, it has an inverse; let us denote this by…

Inder Gill
42 votes, 5 answers
Confidence interval for median
I have to find a 95% confidence interval for the median and other percentiles. I don't know how to approach this. I mainly use R as a programming tool.

Dominic Comtois
42 votes, 2 answers
What is model identifiability?
I know that with a model that is not identifiable the data can be said to be generated by multiple different assignments to the model parameters. I know that sometimes it's possible to constrain parameters so that all are identifiable, as in the…

Jack Tanner
42 votes, 8 answers
Approximate $e$ using Monte Carlo Simulation
I've been looking at Monte Carlo simulation recently, and have been using it to approximate constants such as $\pi$ (circle inside a rectangle, proportionate area).
However, I'm unable to think of a corresponding method of approximating the value of…

statisticnewbie12345
42 votes, 2 answers
What is elastic net regularization, and how does it solve the drawbacks of Ridge ($L^2$) and Lasso ($L^1$)?
Is elastic net regularization always preferred to Lasso & Ridge since it seems to solve the drawbacks of these methods? What is the intuition and what is the math behind elastic net?

GeorgeOfTheRF
42 votes, 1 answer
Relative variable importance for Boosting
I'm looking for an explanation of how relative variable importance is computed in Gradient Boosted Trees that is not overly general/simplistic like:
The measures are based on the number of times a variable is selected for splitting, weighted by the…

Antoine
42 votes, 3 answers
Are pooling layers added before or after dropout layers?
I'm creating a convolutional neural network (CNN), where I have a convolutional layer followed by a pooling layer and I want to apply dropout to reduce overfitting. I have this feeling that the dropout layer should be applied after the pooling…

pir
42 votes, 10 answers
Are your chances of dying in a plane crash reduced if you fly direct?
I recently had a disagreement with a friend about minimizing the chance of dying in a plane due to a crash. This is a rudimentary statistics question.
He stated that he prefers to fly direct to a destination, as it decreases the probability that he…

Kyle