Most Popular
1500 questions
32
votes
4 answers
Implementation of CRF in python
Is there a popular implementation of Conditional Random Fields in Python?
I can't seem to find any that is widely used and popular!

garak
- 2,033
- 4
- 26
- 31
32
votes
3 answers
Shouldn't the joint probability of 2 independent events be equal to zero?
If the joint probability is the intersection of 2 events, then shouldn't the joint probability of 2 independent events be zero since they don't intersect at all? I'm confused.
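A small enumeration can make the distinction behind this question concrete. The sketch below assumes two fair coin flips, which are not part of the original question, and compares the joint probability of two independent events with that of two mutually exclusive events. All names in it are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair coin flips, each outcome equally likely.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

def first_heads(outcome):
    return outcome[0] == "H"

def second_heads(outcome):
    return outcome[1] == "H"

def first_tails(outcome):
    return outcome[0] == "T"

# Independent events: the joint probability equals the product of the marginals.
p_joint_independent = prob(lambda o: first_heads(o) and second_heads(o))
p_product = prob(first_heads) * prob(second_heads)

# Mutually exclusive events: they cannot both occur, so the joint probability is zero.
p_joint_exclusive = prob(lambda o: first_heads(o) and first_tails(o))

print(p_joint_independent, p_product, p_joint_exclusive)
```

In this toy setting the independent events do overlap (both flips can land heads), whereas the disjoint pair is the one with zero joint probability.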

gaston
- 511
- 4
- 5
32
votes
1 answer
Comparison between SHAP (Shapley Additive Explanation) and LIME (Local Interpretable Model-Agnostic Explanations)
I am reading up about two popular post hoc model interpretability techniques: LIME and SHAP.
I am having trouble understanding the key difference between these two techniques.
To quote Scott Lundberg, the brains behind SHAP:
SHAP values come with the…

user248884
- 431
- 1
- 4
- 4
32
votes
7 answers
What's the point of time series analysis?
What is the point of time series analysis?
There are plenty of other statistical methods, such as regression and machine learning, that have obvious use cases: regression can provide information on the relationship between two variables, while…

Dhalsim
- 361
- 1
- 3
- 3
32
votes
4 answers
Machine learning techniques for parsing strings?
I have a lot of address strings:
1600 Pennsylvania Ave, Washington, DC 20500 USA
I want to parse them into their components:
street: 1600 Pennsylvania Ave
city: Washington
province: DC
postcode: 20500
country: USA
But of course the data is dirty:…
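For the clean layout shown in the example above, a plain regular expression is a possible baseline before reaching for machine learning. The sketch below is not a learned model, and it only handles the assumed "street, city, province postcode country" pattern. The name `parse_address` and the pattern itself are hypothetical, and dirty inputs simply return `None`.

```python
import re

# Hypothetical pattern for the clean layout only:
# "<street>, <city>, <2-letter province> <5-digit postcode> <country>"
ADDRESS_RE = re.compile(
    r"^(?P<street>[^,]+),\s*"
    r"(?P<city>[^,]+),\s*"
    r"(?P<province>[A-Z]{2})\s+"
    r"(?P<postcode>\d{5})\s+"
    r"(?P<country>.+)$"
)

def parse_address(text):
    """Return a dict of address components, or None if the layout does not match."""
    match = ADDRESS_RE.match(text.strip())
    return match.groupdict() if match else None

parsed = parse_address("1600 Pennsylvania Ave, Washington, DC 20500 USA")
print(parsed)
```

Strings that deviate from this layout are exactly the cases where such a rule-based baseline fails, which is why a learned sequence-labelling approach may be worth considering.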

Jay Hacker
- 451
- 1
- 5
- 3
32
votes
5 answers
Strategies for teaching the sampling distribution
The tl;dr version
What successful strategies do you employ to teach the sampling distribution (of a sample mean, for example) at an introductory undergraduate level?
The background
In September I'll be teaching an introductory statistics course for…

smillig
- 2,336
- 28
- 31
32
votes
2 answers
Who first used/invented p-values?
I am attempting to write a series of blog posts on p-values and I thought it would be interesting to go back to where it all started - which appears to be Pearson's 1900 paper. If you are familiar with that paper, you'll remember that this covers…

Michelle
- 3,640
- 1
- 23
- 33
32
votes
1 answer
When to choose SARSA vs. Q Learning
SARSA and Q Learning are both reinforcement learning algorithms that work in a similar way. The most striking difference is that SARSA is on policy while Q Learning is off policy. The update rules are as follows:
Q…

hh32
- 1,279
- 1
- 8
- 19
32
votes
4 answers
How does batch size affect convergence of SGD and why?
I've seen similar conclusions in many discussions: as the minibatch size gets larger, the convergence of SGD actually gets harder/worse, for example this paper and this answer. Also I've heard of people using tricks like small learning rates or…

dontloo
- 13,692
- 7
- 51
- 80
32
votes
3 answers
Feature importance with dummy variables
I am trying to understand how I can get the feature importance of a categorical variable that has been broken down into dummy variables. I am using scikit-learn, which doesn't handle categorical variables for you the way R or h2o do.
If I break a…

Dan
- 1,288
- 2
- 12
- 30
32
votes
2 answers
White Noise in Statistics
I often see the term white noise when reading about different statistical models. I must admit, however, that I am not completely sure what it means. It is usually abbreviated as $WN(0,\sigma^2)$. Does that mean it's normally distributed or…

user13514
- 421
- 4
- 3
32
votes
2 answers
Why are there no deep reinforcement learning engines for chess, similar to AlphaGo?
Computers have long been able to play chess using a "brute-force" technique, searching to a certain depth and then evaluating the position. The AlphaGo computer, however, only uses an ANN to evaluate the positions (it does not do any…

lijas
- 423
- 1
- 4
- 5
32
votes
8 answers
What is the probability that this person is female?
There is a person behind a curtain - I do not know whether the person is female or male.
I know the person has long hair, and that 90% of all people with long hair are female
I know the person has a rare blood type AX3, and that 80% of all people…

ProbablyWrong
- 333
- 3
- 7
32
votes
3 answers
Why does basic hypothesis testing focus on the mean and not on the median?
In basic under-grad statistics courses, students are (usually?) taught hypothesis testing for the mean of a population.
Why is it that the focus is on the mean and not on the median? My guess is that it is easier to test the mean due to the central…

nafrtiti
- 665
- 1
- 6
- 9
32
votes
3 answers
Why are Beta/Dirichlet regressions not considered Generalized Linear Models?
The premise is this quote from the vignette of the R package betareg.
Furthermore, the model shares some properties (such as linear
predictor, link function, dispersion parameter) with generalized
linear models (GLMs; McCullagh and Nelder 1989), but…

Firebug
- 15,262
- 5
- 60
- 127