Highest Voted Questions - Statistical Analysis Stack Exchange

32

votes

3 answers

How to rigorously define the likelihood?

The likelihood can be defined in several ways, for instance: the function $L$ from $\Theta\times{\cal X}$ which maps $(\theta,x)$ to $L(\theta \mid x)$, i.e. $L:\Theta\times{\cal X} \rightarrow \mathbb{R}$; the random function $L(\cdot \mid…

mathematical-statistics likelihood likelihood-ratio

asked Jun 02 '12 at 11:15

Stéphane Laurent

17,425
5
59
101

32

votes

2 answers

What does kernel size mean?

When people talk about neural networks, what do they mean when they say "kernel size"? Kernels are similarity functions, but what does that say about kernel size?

machine-learning neural-networks

asked Aug 07 '17 at 18:41

quil

433
1
4
6

32

votes

5 answers

What problem does oversampling, undersampling, and SMOTE solve?

In a recent, well-received question, Tim asks when is unbalanced data really a problem in Machine Learning? The premise of the question is that there is a lot of machine learning literature discussing class balance and the problem of imbalanced…

machine-learning classification predictive-models unbalanced-classes

asked Jun 14 '17 at 01:33

Matthew Drury

33,314
2
101
132

32

votes

1 answer

Can degrees of freedom be a non-integer number?

When I use GAM, it gives me a residual DF of $26.6$ (last line in the code). What does that mean? Going beyond the GAM example, in general, can the number of degrees of freedom be a non-integer number? > library(gam) >…

r degrees-of-freedom generalized-additive-model

asked May 21 '17 at 17:00

Haitao Du

32,885
17
118
213

32

votes

5 answers

Modelling longitudinal data where the effect of time varies in functional form between individuals

Context: Imagine you had a longitudinal study which measured a dependent variable (DV) once a week for 20 weeks on 200 participants. Although I'm interested in general, typical DVs that I'm thinking of include job performance following hire or…

repeated-measures random-effects-model latent-class

asked Sep 17 '10 at 07:12

Jeromy Anglim

42,044
23
146
250

32

votes

6 answers

Sample size for logistic regression?

I want to make a logistic model from my survey data. It is a small survey of four residential colonies in which only 154 respondents were interviewed. My dependent variable is "satisfactory transition to work". I found that, of the 154 respondents,…

logistic sample-size assumptions statistical-power unbalanced-classes

asked Apr 07 '12 at 07:38

Braj-Stat

561
2
7
6

32

votes

3 answers

What stop-criteria for agglomerative hierarchical clustering are used in practice?

I have found extensive literature proposing all sorts of criteria (e.g. Glenn et al. 1985 (pdf) and Jung et al. 2002 (pdf)). However, most of these are not that easy to implement (at least from my perspective). I am using scipy.cluster.hierarchy to…

clustering

asked Sep 12 '10 at 19:49

Björn Pollex

1,223
2
15
18

32

votes

8 answers

Should I teach Bayesian or frequentist statistics first?

I am helping my boys, currently in high school, understand statistics, and I am considering beginning with some simple examples without disregarding some glimpses of theory. My goal would be to give them the most intuitive yet instrumentally…

probability hypothesis-testing bayesian frequentist teaching

asked Jan 25 '17 at 09:59

Giuseppe Biondi-Zoccai

2,244
3
19
48

32

votes

1 answer

Benefits of stratified vs random sampling for generating training data in classification

I would like to know if there are any/some advantages of using stratified sampling instead of random sampling, when splitting the original dataset into training and testing set for classification. Also, does stratified sampling introduce more bias…

classification cross-validation random-forest train stratification

asked Dec 07 '16 at 21:24

gc5

877
2
12
23

32

votes

3 answers

Is hour of day a categorical variable?

Is "hour of the day" where the value can be 0, 1, 2, ..., 23 a categorical variable? I would be tempted to say no, since 5, for example, is 'closer' to 4 or 6 than it is to 3 or 7. On the other hand, there is the discontinuity between 23 and 0. So…

categorical-data circular-statistics

asked Nov 14 '16 at 16:54

Paul Reiners

747
2
8
11

32

votes

4 answers

LASSO with interaction terms - is it okay if main effects are shrunk to zero?

LASSO regression shrinks coefficients towards zero, thus effectively providing model selection. I believe that in my data there are meaningful interactions between nominal and continuous covariates. Not necessarily, however, are the 'main effects'…

machine-learning lasso glmnet regularization

asked Nov 07 '16 at 19:41

tomka

5,874
3
30
71

32

votes

1 answer

Derivation of change of variables of a probability density function?

In the book Pattern Recognition and Machine Learning (formula 1.27), it gives $$p_y(y)=p_x(x) \left | \frac{d x}{d y} \right |=p_x(g(y)) | g'(y) |$$ where $x=g(y)$, $p_x(x)$ is the pdf that corresponds to $p_y(y)$ with respect to the change of the…

probability distributions self-study density-function jacobian

asked Oct 11 '16 at 13:10

dontloo

13,692
7
51
80

32

votes

2 answers

PCA in numpy and sklearn produces different results

Am I misunderstanding something? This is my code using sklearn: import numpy as np import matplotlib.pyplot as plt from mpl_toolkits.mplot3d import Axes3D from sklearn import decomposition from sklearn import datasets from sklearn.preprocessing…

pca python scikit-learn

asked Sep 20 '16 at 04:45

aceminer

813
1
9
20

32

votes

3 answers

How to build the final model and tune probability threshold after nested cross-validation?

Firstly, apologies for posting a question that has already been discussed at length here, here, here, here, here, and for reheating an old topic. I know @DikranMarsupial has written about this topic at length in posts and journal papers, but I'm…

machine-learning cross-validation model-selection glmnet hyperparameter

asked Sep 01 '16 at 17:17

Andrew John Lowe

421
5
5

32

votes

2 answers

What is the reason that the Adam Optimizer is considered robust to the value of its hyper parameters?

I was reading about the Adam optimizer for Deep Learning and came across the following sentence in the new book Deep Learning by Bengio, Goodfellow and Courville: Adam is generally regarded as being fairly robust to the choice of hyper parameters,…

neural-networks deep-learning optimization hyperparameter adam

asked Aug 31 '16 at 18:27

Charlie Parker

5,836
11
57
113
