Highest Voted Questions - Statistical Analysis Stack Exchange

32

votes

4 answers

Implementation of CRF in python

Is there a popular implementation of Conditional Random Fields in Python? I can't seem to find any that is widely used and popular!

machine-learning classification python conditional-random-field

asked Sep 28 '12 at 20:19

garak

2,033
4
26
31

32

votes

3 answers

Shouldn't the joint probability of 2 independent events be equal to zero?

If the joint probability is the intersection of 2 events, then shouldn't the joint probability of 2 independent events be zero since they don't intersect at all? I'm confused.

probability joint-distribution

asked Dec 07 '18 at 08:35

gaston

511
4
5
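
A small worked example may help with the joint-probability question above. It is only a sketch on an assumed toy sample space (two fair coin tosses, with event names chosen for illustration). It contrasts independent events, whose joint probability is the product of the marginals, with disjoint events, whose joint probability is zero.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: two fair coin tosses, all outcomes equally likely.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def joint(e1, e2):
    """Probability that both events occur (their intersection)."""
    return prob(lambda o: e1(o) and e2(o))


def first_heads(o):
    return o[0] == "H"


def second_heads(o):
    return o[1] == "H"


def first_tails(o):
    return o[0] == "T"


# Independent events: the intersection is non-empty and P(A and B) = P(A) * P(B).
p_independent = joint(first_heads, second_heads)

# Disjoint (mutually exclusive) events: the intersection is empty, so P = 0.
p_disjoint = joint(first_heads, first_tails)

print(p_independent, prob(first_heads) * prob(second_heads), p_disjoint)
```

Independent events generally do intersect. It is disjoint events that have zero joint probability, and two disjoint events with non-zero probabilities cannot be independent.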

32

votes

1 answer

Comparison between SHAP (Shapley Additive Explanation) and LIME (Local Interpretable Model-Agnostic Explanations)

I am reading up on two popular post hoc model interpretability techniques: LIME and SHAP. I am having trouble understanding the key difference between these two techniques. To quote Scott Lundberg, the brains behind SHAP: SHAP values come with the…

interpretation shapley-value lime

asked Dec 01 '18 at 09:20

user248884

431
1
4
4

32

votes

7 answers

What's the point of time series analysis?

What is the point of time series analysis? There are plenty of other statistical methods, such as regression and machine learning, that have obvious use cases: regression can provide information on the relationship between two variables, while…

time-series arima

asked Sep 19 '18 at 08:11

Dhalsim

361
1
3
3

32

votes

4 answers

Machine learning techniques for parsing strings?

I have a lot of address strings: 1600 Pennsylvania Ave, Washington, DC 20500 USA. I want to parse them into their components: street: 1600 Pennsylvania Ave; city: Washington; province: DC; postcode: 20500; country: USA. But of course the data is dirty:…

machine-learning text-mining

asked Aug 28 '12 at 14:48

Jay Hacker

451
1
5
3

32

votes

5 answers

Strategies for teaching the sampling distribution

The tl;dr version: What successful strategies do you employ to teach the sampling distribution (of a sample mean, for example) at an introductory undergraduate level? The background: In September I'll be teaching an introductory statistics course for…

distributions sampling teaching

asked Aug 23 '12 at 08:59

smillig

2,336
28
31

32

votes

2 answers

Who first used/invented p-values?

I am attempting to write a series of blog posts on p-values and I thought it would be interesting to go back to where it all started - which appears to be Pearson's 1900 paper. If you are familiar with that paper, you'll remember that this covers…

p-value history ronald-fisher

asked Apr 28 '18 at 06:40

Michelle

3,640
1
23
33

32

votes

1 answer

When to choose SARSA vs. Q-learning

SARSA and Q-learning are both reinforcement learning algorithms that work in a similar way. The most striking difference is that SARSA is on-policy while Q-learning is off-policy. The update rules are as follows: Q…

reinforcement-learning

asked Feb 04 '18 at 18:06

hh32

1,279
1
8
19
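
The on-policy/off-policy distinction in the SARSA question above can be sketched in code. The excerpt's own update rules are truncated, so what follows is the standard textbook form of the two tabular updates, not a reconstruction of the elided text. The function names, the dict-of-dicts Q-table, and the `alpha`/`gamma` values are assumptions made for illustration.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.5):
    """On-policy: bootstrap from the action a_next the agent actually takes."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.5):
    """Off-policy: bootstrap from the greedy (max-value) action in s_next."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])


def make_q():
    # Hypothetical two-state Q-table: Q[state][action] -> value.
    return {
        0: {"left": 0.0, "right": 0.0},
        1: {"left": 1.0, "right": 2.0},
    }


q_sarsa = make_q()
# The behaviour policy happened to pick the non-greedy action "left" in state 1.
sarsa_update(q_sarsa, s=0, a="right", r=0.0, s_next=1, a_next="left")

q_qlearn = make_q()
# Q-learning ignores which action is taken next and uses the best one.
q_learning_update(q_qlearn, s=0, a="right", r=0.0, s_next=1)

print(q_sarsa[0]["right"], q_qlearn[0]["right"])
```

The two updates coincide whenever the next action is the greedy one. They differ only on exploratory steps, which is where the choice between the two algorithms matters.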

32

votes

4 answers

How does batch size affect convergence of SGD and why?

I've seen a similar conclusion in many discussions: as the minibatch size gets larger, the convergence of SGD actually gets harder/worse, for example in this paper and this answer. I've also heard of people using tricks like small learning rates or…

machine-learning neural-networks optimization gradient-descent stochastic-gradient-descent

asked Nov 30 '17 at 12:35

dontloo

13,692
7
51
80

32

votes

3 answers

Feature importance with dummy variables

I am trying to understand how I can get the feature importance of a categorical variable that has been broken down into dummy variables. I am using scikit-learn, which doesn't handle categorical variables for you the way R or h2o do. If I break a…

categorical-data random-forest interpretation importance

asked Nov 19 '17 at 18:45

Dan

1,288
2
12
30

32

votes

2 answers

White Noise in Statistics

I often see the term white noise when reading about different statistical models. I must admit, however, that I am not completely sure what it means. It is usually abbreviated as $WN(0,σ^2)$. Does that mean it's normally distributed or…

normal-distribution white-noise

asked Oct 24 '17 at 19:42

user13514

421
4
3

32

votes

2 answers

Why are there no deep reinforcement learning engines for chess, similar to AlphaGo?

Computers have long been able to play chess using a "brute-force" technique, searching to a certain depth and then evaluating the position. The AlphaGo computer, however, only uses an ANN to evaluate the positions (it does not do any…

neural-networks deep-learning reinforcement-learning games

asked Oct 19 '17 at 07:38

lijas

423
1
4
5

32

votes

8 answers

What is the probability that this person is female?

There is a person behind a curtain - I do not know whether the person is female or male. I know the person has long hair, and that 90% of all people with long hair are female. I know the person has a rare blood type AX3, and that 80% of all people…

conditional-probability probability

asked Jun 21 '12 at 02:10

ProbablyWrong

333
3
7

32

votes

3 answers

Why does basic hypothesis testing focus on the mean and not on the median?

In basic undergraduate statistics courses, students are (usually?) taught hypothesis testing for the mean of a population. Why is it that the focus is on the mean and not on the median? My guess is that it is easier to test the mean due to the central…

hypothesis-testing mean inference median

asked Oct 08 '17 at 07:58

nafrtiti

665
1
6
9

32

votes

3 answers

Why are Beta/Dirichlet regressions not considered Generalized Linear Models?

The premise is this quote from the vignette of the R package betareg¹: Furthermore, the model shares some properties (such as linear predictor, link function, dispersion parameter) with generalized linear models (GLMs; McCullagh and Nelder 1989), but…

generalized-linear-model beta-regression dirichlet-regression

asked Sep 22 '17 at 19:13

Firebug

15,262
5
60
127
