Highest Voted Questions - Statistical Analysis Stack Exchange

56

votes

6 answers

What are alternatives of Gradient Descent?

Gradient descent can get stuck in local minima, so we would need to run it exponentially many times to find the global minimum. Can anybody tell me about alternatives to gradient descent as applied in neural network learning,…

machine-learning svm neural-networks

asked May 09 '14 at 07:21

Tropa

765
1
7
13
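
The local-minimum problem raised in this question can be illustrated with a small sketch. The one-dimensional function `f`, the learning rate, the step count, and the list of starting points are all assumptions chosen for illustration; none of them come from the question. Plain gradient descent ends up in different minima depending on where it starts, and restarting from several points and keeping the best result is one simple workaround, not a guarantee of finding the global minimum.

```python
def f(x):
    # Hypothetical non-convex objective with two local minima.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, lr=0.01, steps=2000):
    # Plain fixed-step gradient descent from a given starting point.
    x = start
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


# Restart from several (arbitrarily chosen) starting points.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
minima = [gradient_descent(s) for s in starts]

# Keep the candidate with the lowest objective value.
best = min(minima, key=f)
```

Starting at `2.0` the iterates settle in the shallower minimum on the right, while starting at `-2.0` they reach the deeper one on the left, so `best` comes from one of the left-hand starts.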

56

votes

13 answers

Mean absolute deviation vs. standard deviation

In the textbook "New Comprehensive Mathematics for O Level" by Greer (1983), I see averaged deviation calculated like this: Sum up absolute differences between single values and the mean. Then get its average. Throughout the chapter the term mean…

distributions standard-deviation frequency variability

asked Jan 12 '14 at 09:53

itsols

729
1
7
8
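
The calculation described in this question (sum the absolute differences from the mean, then average them) can be set beside the standard deviation in a short sketch. The data set and the function name `mean_absolute_deviation` are made up for illustration, and the population form of the standard deviation is assumed.

```python
from statistics import mean, pstdev


def mean_absolute_deviation(values):
    # Average of the absolute differences between each value and the mean.
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)


# Hypothetical data, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mad = mean_absolute_deviation(data)  # mean absolute deviation
sd = pstdev(data)                    # population standard deviation
```

For this data the mean is 5, the mean absolute deviation is 1.5, and the population standard deviation is 2. The standard deviation is the larger of the two because squaring gives more weight to the values far from the mean.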

56

votes

5 answers

How to derive the ridge regression solution?

I am having some issues with the derivation of the solution for ridge regression. I know the regression solution without the regularization term: $$\beta = (X^TX)^{-1}X^Ty.$$ But after adding the L2 term $\lambda\|\beta\|_2^2$ to the cost function,…

regression least-squares regularization ridge-regression

asked Sep 04 '13 at 15:49

user34790

6,049
6
42
64
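
A sketch of the derivation this question asks about, assuming the cost function is the residual sum of squares plus the penalty $\lambda\|\beta\|_2^2$ (the excerpt is cut off before stating it in full) and that $\lambda > 0$:

```latex
J(\beta) = \|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2

\nabla_\beta J(\beta) = -2X^T(y - X\beta) + 2\lambda\beta = 0

(X^TX + \lambda I)\beta = X^Ty

\beta = (X^TX + \lambda I)^{-1}X^Ty
```

Setting $\lambda = 0$ recovers the unregularized solution quoted in the question. For $\lambda > 0$ the matrix $X^TX + \lambda I$ is positive definite, so the inverse exists even when $X^TX$ itself is singular.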

56

votes

13 answers

What are the breakthroughs in Statistics of the past 15 years?

I still remember the Annals of Statistics paper on Boosting by Friedman-Hastie-Tibshirani, and the comments in that same issue by other authors (including Freund and Schapire). At that time, clearly Boosting was viewed as a breakthrough in many…

mathematical-statistics history

asked Jan 21 '11 at 02:53

gappy

5,390
3
28
50

56

votes

3 answers

Standard deviation of standard deviation

What is an estimator of the standard deviation of the standard deviation if normality of the data can be assumed?

estimation standard-deviation normality-assumption

asked Jul 26 '10 at 16:10

user88
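
One standard result that bears on this question: for $n$ independent normal observations the sample standard deviation $s$ satisfies $E[s] = c_4(n)\,\sigma$, so its standard deviation is $\sigma\sqrt{1 - c_4(n)^2}$, which is approximately $\sigma/\sqrt{2(n-1)}$. The sketch below evaluates both forms. The function names are hypothetical, and in practice $\sigma$ would be replaced by an estimate such as $s$.

```python
import math


def c4(n):
    # Bias-correction constant: E[s] = c4(n) * sigma for normal samples of size n.
    # lgamma is used instead of gamma to avoid overflow for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def sd_of_sd_exact(sigma, n):
    # Exact standard deviation of s under normality.
    return sigma * math.sqrt(1.0 - c4(n) ** 2)


def sd_of_sd_approx(s, n):
    # Common large-sample approximation.
    return s / math.sqrt(2.0 * (n - 1))
```

For example, `sd_of_sd_exact(1.0, 10)` and `sd_of_sd_approx(1.0, 10)` differ by less than 0.01, and the gap shrinks as `n` grows.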

56

votes

7 answers

Graph for relationship between two ordinal variables

What is an appropriate graph to illustrate the relationship between two ordinal variables? A few options I can think of: Scatter plot with added random jitter to stop points hiding each other. Apparently a standard graphic - Minitab calls this an…

data-visualization categorical-data ordinal-data scatterplot

asked Apr 17 '13 at 00:31

Silverfish

20,678
23
92
180

56

votes

8 answers

R libraries for deep learning

I was wondering if there are any good R libraries out there for deep learning neural networks? I know there's the nnet, neuralnet, and RSNNS, but none of these seem to implement deep learning methods. I'm especially interested in unsupervised…

r neural-networks deep-learning restricted-boltzmann-machine deep-belief-networks

asked Nov 02 '12 at 17:35

Zach

22,308
18
114
158

56

votes

5 answers

What is the difference between GARCH and ARMA?

I am confused. I don't understand the difference between an ARMA and a GARCH process. To me they are the same, no? Here is the (G)ARCH(p, q) process $$\sigma_t^2 = \underbrace{ \underbrace{ \alpha_0 + \sum_{i=1}^q \alpha_ir_{t-i}^2} …

arima garch finance

asked Oct 30 '12 at 14:20

John

735
1
6
9

56

votes

9 answers

Is it wrong to rephrase "1 in 80 deaths is caused by a car accident" as "1 in 80 people die as a result of a car accident?"

Statement One (S1): "One in 80 deaths is caused by a car accident." Statement Two (S2): "One in 80 people dies as a result of a car accident." Now, I personally don't see very much difference at all between these two statements. When writing, I…

interpretation risk

asked Jan 22 '19 at 15:16

faulty_ram_sticks

671
1
5
8

56

votes

4 answers

Random forest computing time in R

I am using the party package in R with 10,000 rows and 34 features, and some factor features have more than 300 levels. The computing time is too long. (It has taken 3 hours so far and it hasn't finished yet.) I want to know what elements have a…

r random-forest

asked Sep 16 '12 at 06:18

Chenghao Liu

721
1
7
6

56

votes

4 answers

What should I do when my neural network doesn't generalize well?

I'm training a neural network and the training loss decreases, but the validation loss doesn't, or it decreases much less than what I would expect, based on references or experiments with very similar architectures and data. How can I fix this? As…

neural-networks overfitting faq

asked Sep 07 '18 at 09:12

DeltaIV

15,894
4
62
104

56

votes

7 answers

Is there any gold standard for modeling irregularly spaced time series?

In the field of economics (I think) we have ARIMA and GARCH for regularly spaced time series and Poisson and Hawkes processes for modeling point processes, so how about attempts at modeling irregularly (unevenly) spaced time series - are there (at least) any…

time-series garch poisson-process point-process unevenly-spaced-time-series

asked Aug 06 '12 at 21:05

Qbik

1,457
2
17
27

56

votes

5 answers

Regression when the OLS residuals are not normally distributed

There are several threads on this site discussing how to determine if the OLS residuals are asymptotically normally distributed. Another way to evaluate the normality of the residuals with R code is provided in this excellent answer. This is another…

regression least-squares residuals assumptions normality-assumption

asked Jun 03 '12 at 13:24

Robert Kubrick

4,078
8
38
55

56

votes

4 answers

What is the definition of a "feature map" (aka "activation map") in a convolutional neural network?

Intro Background Within a convolutional neural network, we usually have a general structure / flow that looks like this: input image (i.e. a 2D vector x) (1st Convolutional layer (Conv1) starts here...) convolve a set of filters (w1) along the…

neural-networks deep-learning conv-neural-network

asked Jul 16 '17 at 14:16

Atlas7

663
1
6
7

56

votes

4 answers

Regression for an outcome (ratio or fraction) between 0 and 1

I am thinking of building a model predicting a ratio $a/b$, where $a \le b$ and $a > 0$ and $b > 0$. So, the ratio would be between $0$ and $1$. I could use linear regression, although it doesn't naturally limit to 0..1. I have no reason to believe…

regression logistic generalized-linear-model beta-distribution beta-regression

asked May 23 '12 at 22:13

dfrankow

2,816
6
30
39
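
One option sometimes used for an outcome restricted to the unit interval is to fit a straight line to the logit of the ratio and back-transform the predictions, which keeps them between 0 and 1. The sketch below is a minimal illustration under the assumption that every observed ratio lies strictly between 0 and 1 (the logit is undefined at the endpoints). The data and the helper names are hypothetical, and this is not necessarily the approach recommended in the answers.

```python
import math


def logit(p):
    # Maps (0, 1) onto the whole real line.
    return math.log(p / (1.0 - p))


def inv_logit(z):
    # Maps the real line back into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    # Ordinary least squares for a single predictor; returns (intercept, slope).
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical predictor values and observed ratios a/b.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ratios = [0.1, 0.2, 0.4, 0.6, 0.8]

# Fit on the logit scale.
intercept, slope = fit_line(xs, [logit(r) for r in ratios])


def predict(x):
    # Back-transform the linear prediction so it stays between 0 and 1.
    return inv_logit(intercept + slope * x)
```

Unlike a straight-line fit on the raw ratios, `predict(10.0)` and `predict(-10.0)` both stay inside the unit interval.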
