Highest Voted Questions - Statistical Analysis Stack Exchange

114

votes

5 answers

What skills are required to perform large scale statistical analyses?

Many statistical jobs ask for experience with large-scale data. What sorts of statistical and computational skills would be needed for working with large data sets? For example, how about building regression models given a data set with…

regression machine-learning multivariate-analysis large-data

asked Mar 02 '11 at 19:05

bit-question

2,637
6
25
26

114

votes

4 answers

Why does the Lasso provide Variable Selection?

I've been reading Elements of Statistical Learning, and I would like to know why the Lasso provides variable selection and ridge regression doesn't. Both methods minimize the residual sum of squares and have a constraint on the possible values of…

regression feature-selection lasso regularization

asked Nov 04 '13 at 14:39

Zhi Zhao

1,352
3
9
9

114

votes

4 answers

What does a "closed-form solution" mean?

I have come across the term "closed-form solution" quite often. What does a closed-form solution mean? How does one determine if a closed-form solution exists for a given problem? Searching online, I found some information, but nothing in the context…

regression machine-learning probability terminology stochastic-processes

asked Sep 23 '13 at 23:31

arjsgh21

2,403
6
15
8

114

votes

16 answers

What misused statistical terms are worth correcting?

Statistics is everywhere; common usage of statistical terms is, however, often unclear. The terms probability and odds are used interchangeably in lay English despite their well-defined and different mathematical expressions. Not separating the term…

terminology

asked Mar 21 '16 at 21:24

Antoni Parellada

23,430
15
100
197

114

votes

2 answers

tanh activation function vs sigmoid activation function

The tanh activation function is: $$\tanh \left( x \right) = 2 \cdot \sigma \left( 2 x \right) - 1$$ where $\sigma(x)$, the sigmoid function, is defined as: $$\sigma(x) = \frac{e^x}{1 + e^x}$$ Questions: Does it really matter between using those…

machine-learning neural-networks optimization

asked Jun 08 '14 at 06:11

satya

1,293
2
9
9
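The identity quoted in the excerpt above can be checked numerically. The following is a minimal sketch, not taken from the question or its answers; the function names `sigmoid` and `tanh_via_sigmoid` are illustrative assumptions, and the sigmoid is written in the same $e^x / (1 + e^x)$ form as the excerpt.

```python
import math


def sigmoid(x: float) -> float:
    # Sigmoid in the form used in the excerpt: e^x / (1 + e^x).
    # (Hypothetical helper; overflows for very large x, fine for a small check.)
    return math.exp(x) / (1.0 + math.exp(x))


def tanh_via_sigmoid(x: float) -> float:
    # tanh(x) = 2 * sigma(2x) - 1, the identity stated in the question.
    return 2.0 * sigmoid(2.0 * x) - 1.0


# Compare against the standard library's tanh at a few sample points.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert math.isclose(tanh_via_sigmoid(x), math.tanh(x), abs_tol=1e-12)
```

The check only confirms that tanh is a rescaled and shifted sigmoid; it says nothing about which activation trains better, which is what the question goes on to ask.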

113

votes

6 answers

What loss function for multi-class, multi-label classification tasks in neural networks?

I'm training a neural network to classify a set of objects into n classes. Each object can belong to multiple classes at the same time (multi-class, multi-label). I read that for multi-class problems it is generally recommended to use softmax and…

neural-networks python loss-functions keras cross-entropy

asked Apr 17 '16 at 14:28

aKzenT

1,231
2
8
5

112

votes

6 answers

Is it possible to train a neural network without backpropagation?

Many neural network books and tutorials spend a lot of time on the backpropagation algorithm, which is essentially a tool to compute the gradient. Let's assume we are building a model with ~10K parameters / weights. Is it possible to run the…

machine-learning neural-networks optimization backpropagation

asked Sep 20 '16 at 01:48

Haitao Du

32,885
17
118
213

111

votes

11 answers

Calculating optimal number of bins in a histogram

I'm interested in finding the best method I can for determining how many bins I should use in a histogram. My data should range from 30 to 350 objects at most, and in particular I'm trying to apply thresholding (like Otsu's method) where…

rule-of-thumb histogram

asked Jul 27 '10 at 15:21

Tony Stark

1,213
2
9
5

111

votes

7 answers

Why use gradient descent for linear regression, when a closed-form math solution is available?

I am taking the Machine Learning courses online and learnt about Gradient Descent for calculating the optimal values in the hypothesis h(x) = B0 + B1X. Why do we need to use Gradient Descent if we can easily find the values with the formula below?…

regression machine-learning gradient-descent

asked May 10 '17 at 16:52

Purus

1,213
2
7
6

111

votes

2 answers

What is an embedding layer in a neural network?

In many neural network libraries, there are 'embedding layers', like in Keras or Lasagne. I am not sure I understand its function, despite reading the documentation. For example, in the Keras documentation it says: Turn positive integers (indexes)…

machine-learning neural-networks python word-embeddings

asked Nov 20 '15 at 16:43

Francesco

1,213
2
9
8

111

votes

5 answers

Using k-fold cross-validation for time-series model selection

Question: I want to be sure of something: is the use of k-fold cross-validation with time series straightforward, or does one need to pay special attention before using it? Background: I'm modeling a time series of 6 years (with semi-markov…

time-series modeling cross-validation

asked Aug 10 '11 at 17:20

Mickaël S

1,258
3
10
6

110

votes

4 answers

How do you calculate precision and recall for multiclass classification using confusion matrix?

I wonder how to compute precision and recall using a confusion matrix for a multi-class classification problem. Specifically, an observation can only be assigned to its most probable class / label. I would like to compute: Precision = TP / (TP+FP)…

machine-learning classification precision-recall multi-class

asked Mar 04 '13 at 15:56

daiyue

1,203
2
9
7
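As a rough illustration of the one-vs-rest computation the excerpt above asks about, here is a minimal sketch. It is not taken from the question's answers. It assumes rows of the confusion matrix are true classes and columns are predicted classes, uses the precision formula quoted in the excerpt, and uses the standard definition recall = TP / (TP + FN), which the truncated excerpt does not show. The function name and the example matrix are hypothetical.

```python
def per_class_precision_recall(cm):
    """Return a list of (precision, recall) pairs, one per class.

    Assumption: cm[i][j] counts observations of true class i
    that were predicted as class j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        # Predicted as k but truly another class: column sum minus the diagonal.
        fp = sum(cm[i][k] for i in range(n)) - tp
        # Truly k but predicted as another class: row sum minus the diagonal.
        fn = sum(cm[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results.append((precision, recall))
    return results


# Made-up 3-class confusion matrix, for illustration only.
example_cm = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]
results = per_class_precision_recall(example_cm)
```

Returning 0.0 when a class is never predicted (or never occurs) is a design choice of this sketch; other conventions, such as leaving the value undefined, are equally defensible.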

109

votes

7 answers

Detecting a given face in a database of facial images

I'm working on a little project involving the faces of twitter users via their profile pictures. A problem I've encountered is that after I filter out all but the images that are clear portrait photos, a small but significant percentage of twitter…

machine-learning clustering image-processing

asked Feb 14 '11 at 22:41

ʞɔıu

1,107
2
8
5

109

votes

4 answers

Softmax vs Sigmoid function in Logistic classifier?

What decides the choice of function (Softmax vs Sigmoid) in a Logistic classifier? Suppose there are 4 output classes. Each of the above functions gives the probabilities of each class being the correct output. So which one to take for a…

machine-learning logistic classification softmax

asked Sep 06 '16 at 15:46

mach

1,545
3
10
12

108

votes

4 answers

Difference between standard error and standard deviation

I'm struggling to understand the difference between the standard error and the standard deviation. How are they different and why do you need to measure the standard error?

mean standard-deviation standard-error intuition

asked Jul 15 '12 at 10:21

louis xie

1,233
3
10
6
