Most Popular

1500 questions
32
votes
4 answers

Implementation of CRF in python

Is there a popular implementation of Conditional Random Fields in Python? I can't seem to find any that is widely used and popular!
garak
  • 2,033
  • 4
  • 26
  • 31
32
votes
3 answers

Shouldn't the joint probability of 2 independent events be equal to zero?

If the joint probability is the intersection of 2 events, then shouldn't the joint probability of 2 independent events be zero since they don't intersect at all? I'm confused.
gaston
  • 511
  • 4
  • 5
32
votes
1 answer

Comparison between SHAP (Shapley Additive Explanation) and LIME (Local Interpretable Model-Agnostic Explanations)

I am reading up about two popular post hoc model interpretability techniques: LIME and SHAP. I am having trouble understanding the key difference between these two techniques. To quote Scott Lundberg, the brains behind SHAP: SHAP values come with the…
user248884
  • 431
  • 1
  • 4
  • 4
32
votes
7 answers

What's the point of time series analysis?

What is the point of time series analysis? There are plenty of other statistical methods, such as regression and machine learning, that have obvious use cases: regression can provide information on the relationship between two variables, while…
Dhalsim
  • 361
  • 1
  • 3
  • 3
32
votes
4 answers

Machine learning techniques for parsing strings?

I have a lot of address strings: 1600 Pennsylvania Ave, Washington, DC 20500 USA. I want to parse them into their components: street: 1600 Pennsylvania Ave; city: Washington; province: DC; postcode: 20500; country: USA. But of course the data is dirty:…
Jay Hacker
  • 451
  • 1
  • 5
  • 3
32
votes
5 answers

Strategies for teaching the sampling distribution

The tl;dr version: What successful strategies do you employ to teach the sampling distribution (of a sample mean, for example) at an introductory undergraduate level? The background: In September I'll be teaching an introductory statistics course for…
smillig
  • 2,336
  • 28
  • 31
32
votes
2 answers

Who first used/invented p-values?

I am attempting to write a series of blog posts on p-values and I thought it would be interesting to go back to where it all started - which appears to be Pearson's 1900 paper. If you are familiar with that paper, you'll remember that this covers…
Michelle
  • 3,640
  • 1
  • 23
  • 33
32
votes
1 answer

When to choose SARSA vs. Q Learning

SARSA and Q Learning are both reinforcement learning algorithms that work in a similar way. The most striking difference is that SARSA is on policy while Q Learning is off policy. The update rules are as follows: Q…
hh32
  • 1,279
  • 1
  • 8
  • 19
32
votes
4 answers

How does batch size affect convergence of SGD and why?

I've seen a similar conclusion in many discussions: as the minibatch size gets larger, the convergence of SGD actually gets harder/worse (for example, this paper and this answer). Also, I've heard of people using tricks like small learning rates or…
32
votes
3 answers

Feature importance with dummy variables

I am trying to understand how I can get the feature importance of a categorical variable that has been broken down into dummy variables. I am using scikit-learn which doesn't handle categorical variables for you the way R or h2o do. If I break a…
Dan
  • 1,288
  • 2
  • 12
  • 30
32
votes
2 answers

White Noise in Statistics

I often see the term white noise when reading about different statistical models. I must, however, admit that I am not completely sure what it means. It is usually abbreviated as $WN(0,σ^2)$. Does that mean it's normally distributed or…
user13514
  • 421
  • 4
  • 3
32
votes
2 answers

Why are there no deep reinforcement learning engines for chess, similar to AlphaGo?

Computers have long been able to play chess using a "brute-force" technique, searching to a certain depth and then evaluating the position. The AlphaGo computer, however, only uses an ANN to evaluate the positions (it does not do any…
lijas
  • 423
  • 1
  • 4
  • 5
32
votes
8 answers

What is the probability that this person is female?

There is a person behind a curtain - I do not know whether the person is female or male. I know the person has long hair, and that 90% of all people with long hair are female. I know the person has a rare blood type AX3, and that 80% of all people…
ProbablyWrong
  • 333
  • 3
  • 7
32
votes
3 answers

Why does basic hypothesis testing focus on the mean and not on the median?

In basic undergrad statistics courses, students are (usually?) taught hypothesis testing for the mean of a population. Why is it that the focus is on the mean and not on the median? My guess is that it is easier to test the mean due to the central…
nafrtiti
  • 665
  • 1
  • 6
  • 9
32
votes
3 answers

Why are Beta/Dirichlet regressions not considered Generalized Linear Models?

The premise is this quote from the vignette of the R package betareg1. Furthermore, the model shares some properties (such as linear predictor, link function, dispersion parameter) with generalized linear models (GLMs; McCullagh and Nelder 1989), but…
Firebug
  • 15,262
  • 5
  • 60
  • 127