Most Popular
1500 questions
44
votes
3 answers
Were generative adversarial networks introduced by Jürgen Schmidhuber?
I read on https://en.wikipedia.org/wiki/Generative_adversarial_networks :
[Generative adversarial networks] were introduced by Ian Goodfellow et al. in 2014.
but Jürgen Schmidhuber claims to have performed similar work in that direction earlier…

Franck Dernoncourt
44
votes
3 answers
What are the differences between hidden Markov models and neural networks?
I'm just getting my feet wet in statistics, so I'm sorry if this question does not make sense. I have used Markov models to predict hidden states (unfair casinos, dice rolls, etc.) and neural networks to study users' clicks on a search engine. Both…

Lostsoul
44
votes
3 answers
Gradient Boosting for Linear Regression - why does it not work?
While learning about Gradient Boosting, I haven't heard about any constraints regarding the properties of a "weak classifier" that the method uses to build an ensemble model. However, I could not imagine an application of GB that uses linear…

Matek
44
votes
5 answers
Is machine learning less useful for understanding causality, thus less interesting for social science?
My understanding of the difference between machine learning/other statistical predictive techniques vs. the kind of statistics that social scientists (e.g., economists) use is that economists seem very interested in understanding the effect of a…

d_a_c321
44
votes
8 answers
Is it valid to include a baseline measure as control variable when testing the effect of an independent variable on change scores?
I am attempting to run an OLS regression:
DV: Change in weight over a year (initial weight - end weight)
IV: Whether or not you exercise.
However, it seems reasonable that heavier people will lose more weight per unit of exercise than thinner…

ChrisStata
44
votes
5 answers
How does linear regression use the normal distribution?
In linear regression, each predicted value is assumed to have been picked from a normal distribution of possible values. See below.
But why is each predicted value assumed to have come from a normal distribution? How does linear regression use this…

luciano
44
votes
3 answers
Intuitive explanation for density of transformed variable?
Suppose $X$ is a random variable with pdf $f_X(x)$. Then the random variable $Y=X^2$ has the pdf
$$f_Y(y)=\begin{cases}\frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y})+f_X(-\sqrt{y})\right) & y \ge 0 \\ 0 & y \lt 0\end{cases}$$
I understand the calculus…

lowndrul
44
votes
6 answers
When to use simulations?
So this is a very simple and basic question. However, when I was in school, I paid very little attention to the whole concept of simulations in class and that's left me a little terrified of that process.
Can you explain the simulation process in…

AMathew
44
votes
5 answers
Why does increasing the sample size lower the (sampling) variance?
Big picture:
I'm trying to understand how increasing the sample size increases the power of an experiment.
My lecturer's slides explain this with a picture of two normal distributions, one for the null hypothesis and one for the alternative hypothesis…

user2740
44
votes
6 answers
Is a time series the same as a stochastic process?
A stochastic process is a process that evolves over time, so is it really a fancier way of saying "time series"?

Victor
44
votes
2 answers
Measures of variable importance in random forests
I've been playing around with random forests for regression and am having difficulty working out exactly what the two measures of importance mean, and how they should be interpreted.
The importance() function gives two values for each variable:…

dcl
44
votes
1 answer
What do the numbers in the classification report of sklearn mean?
Below is an example I pulled from sklearn's sklearn.metrics.classification_report documentation.
What I don't understand is why there are f1-score, precision and recall values for each class where I believe class is the predictor label? I…

jxn
44
votes
1 answer
What are posterior predictive checks and what makes them useful?
I understand what the posterior predictive distribution is, and I have been reading about posterior predictive checks, although it isn't yet clear to me what they do.
What exactly is the posterior predictive check?
Why do some authors say that…

Amelio Vazquez-Reina
43
votes
4 answers
What is the difference between finite and infinite variance?
What is the difference between finite and infinite variance? My stats knowledge is rather basic; Wikipedia / Google weren't much help here.

AfterWorkGuinness
43
votes
3 answers
What is meant by 'weak learner'?
Can anyone tell me what is meant by the phrase 'weak learner'? Is it supposed to be a weak hypothesis? I am confused about the relationship between a weak learner and a weak classifier. Are both the same or is there some difference?
In the AdaBoost…

vrushali