Most Popular
1500 questions
56
votes
6 answers
What are alternatives of Gradient Descent?
Gradient descent has the problem of getting stuck in local minima. We may need to run gradient descent an exponential number of times in order to find the global minimum.
Can anybody tell me about any alternatives to gradient descent as applied in neural network learning,…

Tropa
- 765
- 1
- 7
- 13
56
votes
13 answers
Mean absolute deviation vs. standard deviation
In the text book "New Comprehensive Mathematics for O Level" by Greer (1983), I see averaged deviation calculated like this:
Sum up absolute differences between single values and the mean. Then get its average. Throughout the chapter the term mean…
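A minimal sketch of the calculation described above, with the population standard deviation alongside for comparison. The sample data and the function name `mean_absolute_deviation` are illustrative assumptions, not taken from the textbook.

```python
from statistics import fmean, pstdev


def mean_absolute_deviation(values):
    """Average of the absolute differences between each value and the mean."""
    center = fmean(values)
    return fmean([abs(v - center) for v in values])


# Illustrative data only (an assumption, not from the source).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mad = mean_absolute_deviation(data)  # average of |x - mean|
sd = pstdev(data)                    # square root of the average of (x - mean)^2

print(f"mean absolute deviation: {mad}")
print(f"population standard deviation: {sd}")
```

For this data the mean is 5, so the absolute deviations sum to 12 and the squared deviations sum to 32. Squaring gives large deviations more weight, so the standard deviation comes out larger than the mean absolute deviation here.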

itsols
- 729
- 1
- 7
- 8
56
votes
5 answers
How to derive the ridge regression solution?
I am having some issues with the derivation of the solution for ridge regression.
I know the regression solution without the regularization term:
$$\beta = (X^TX)^{-1}X^Ty.$$
But after adding the L2 term $\lambda\|\beta\|_2^2$ to the cost function,…
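As a hedged sketch only: assuming the cost function is the residual sum of squares plus the penalty (the excerpt is truncated, so this exact form is an assumption), setting the gradient to zero gives the usual closed form:

$$J(\beta) = \|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2$$
$$\nabla_\beta J = -2X^T(y - X\beta) + 2\lambda\beta = 0$$
$$(X^TX + \lambda I)\beta = X^Ty \quad\Rightarrow\quad \beta = (X^TX + \lambda I)^{-1}X^Ty.$$

With $\lambda = 0$ this reduces to the unregularized solution above.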

user34790
- 6,049
- 6
- 42
- 64
56
votes
13 answers
What are the breakthroughs in Statistics of the past 15 years?
I still remember the Annals of Statistics paper on Boosting by Friedman-Hastie-Tibshirani, and the comments in that same issue by other authors (including Freund and Schapire). At that time, clearly Boosting was viewed as a breakthrough in many…

gappy
- 5,390
- 3
- 28
- 50
56
votes
3 answers
Standard deviation of standard deviation
What is an estimator of the standard deviation of the standard deviation if normality of the data can be assumed?
user88
56
votes
7 answers
Graph for relationship between two ordinal variables
What is an appropriate graph to illustrate the relationship between two ordinal variables?
A few options I can think of:
Scatter plot with added random jitter to stop points hiding each other. Apparently a standard graphic - Minitab calls this an…

Silverfish
- 20,678
- 23
- 92
- 180
56
votes
8 answers
R libraries for deep learning
I was wondering if there are any good R libraries out there for deep learning neural networks? I know there's the nnet, neuralnet, and RSNNS, but none of these seem to implement deep learning methods.
I'm especially interested in unsupervised…

Zach
- 22,308
- 18
- 114
- 158
56
votes
5 answers
What is the difference between GARCH and ARMA?
I am confused. I don't understand the difference between an ARMA and a GARCH process. To me they are the same, no?
Here is the (G)ARCH(p, q) process
$$\sigma_t^2 =
\underbrace{
\underbrace{
\alpha_0
+ \sum_{i=1}^q \alpha_ir_{t-i}^2}
…

John
- 735
- 1
- 6
- 9
56
votes
9 answers
Is it wrong to rephrase "1 in 80 deaths is caused by a car accident" as "1 in 80 people die as a result of a car accident?"
Statement One (S1): "One in 80 deaths is caused by a car accident."
Statement Two (S2): "One in 80 people dies as a result of a car accident."
Now, I personally don't see very much difference at all between these two statements. When writing, I…

faulty_ram_sticks
- 671
- 1
- 5
- 8
56
votes
4 answers
Random forest computing time in R
I am using the party package in R with 10,000 rows and 34 features, and some factor features have more than 300 levels. The computing time is too long. (It has taken 3 hours so far and it hasn't finished yet.)
I want to know what elements have a…

Chenghao Liu
- 721
- 1
- 7
- 6
56
votes
4 answers
What should I do when my neural network doesn't generalize well?
I'm training a neural network and the training loss decreases, but the validation loss doesn't, or it decreases much less than what I would expect, based on references or experiments with very similar architectures and data. How can I fix this?
As…

DeltaIV
- 15,894
- 4
- 62
- 104
56
votes
7 answers
Is there any gold standard for modeling irregularly spaced time series?
In the field of economics (I think) we have ARIMA and GARCH for regularly spaced time series, and Poisson and Hawkes for modeling point processes. So what about attempts at modeling irregularly (unevenly) spaced time series - are there (at least) any…

Qbik
- 1,457
- 2
- 17
- 27
56
votes
5 answers
Regression when the OLS residuals are not normally distributed
There are several threads on this site discussing how to determine if the OLS residuals are asymptotically normally distributed. Another way to evaluate the normality of the residuals with R code is provided in this excellent answer. This is another…

Robert Kubrick
- 4,078
- 8
- 38
- 55
56
votes
4 answers
What is the definition of a "feature map" (aka "activation map") in a convolutional neural network?
Intro Background
Within a convolutional neural network, we usually have a general structure / flow that looks like this:
input image (i.e. a 2D vector x)
(1st Convolutional layer (Conv1) starts here...)
convolve a set of filters (w1) along the…

Atlas7
- 663
- 1
- 6
- 7
56
votes
4 answers
Regression for an outcome (ratio or fraction) between 0 and 1
I am thinking of building a model predicting a ratio $a/b$, where $a \le b$ and $a > 0$ and $b > 0$. So, the ratio would be between $0$ and $1$.
I could use linear regression, although it doesn't naturally limit to 0..1. I have no reason to believe…

dfrankow
- 2,816
- 6
- 30
- 39