Questions tagged [parameterization]
For questions about how to parameterize a statistical model, or about comparisons between different ways to parameterize.
297 questions
47
votes
8 answers
Would a Bayesian admit that there is one fixed parameter value?
In Bayesian data analysis, parameters are treated as random variables. This stems from the Bayesian subjective conceptualization of probability. But do Bayesians theoretically acknowledge that there is one true fixed parameter value out in the 'real…

ATJ
- 1,711
- 1
- 15
- 20
31
votes
1 answer
For which distributions are the parameterizations in BUGS and R different?
I have found some distributions for which BUGS and R have different parameterizations: Normal, log-Normal, and Weibull.
For each of these, I gather that the second parameter used by R needs to be inverse transformed (1/parameter) before being used…

David LeBauer
- 7,060
- 6
- 44
- 89
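For the Normal case in the question above, a minimal numerical check (sketched in Python for concreteness rather than R/BUGS, and assuming the usual conventions: R's dnorm takes a standard deviation, while BUGS's dnorm takes a precision τ = 1/σ²):

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 2.0, 3.0   # mean and standard deviation (R's dnorm convention)
tau = 1.0 / sigma**2   # precision, i.e. 1/variance (BUGS's dnorm convention)

x = 1.5
dens_sd_param = norm.pdf(x, loc=mu, scale=sigma)                 # R-style: scale = sd
dens_prec_param = norm.pdf(x, loc=mu, scale=1.0 / np.sqrt(tau))  # BUGS-style: convert precision back to sd

# Both parameterizations describe the same density.
assert np.isclose(dens_sd_param, dens_prec_param)
print(dens_sd_param)
```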
19
votes
5 answers
What's in a name: hyperparameters
So in a normal distribution, we have two parameters: mean $\mu$ and variance $\sigma^2$. In the book Pattern Recognition and Machine Learning, there suddenly appears a hyperparameter $\lambda$ in the regularization terms of the error function.
What…

cgo
- 7,445
- 10
- 42
- 61
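One way to make the distinction concrete: in ridge regression the coefficients are parameters learned from the data, while the regularization strength (PRML's $\lambda$, called `alpha` in scikit-learn) is a hyperparameter fixed before fitting. A minimal sketch, assuming scikit-learn and made-up data:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# The regularization strength (lambda) is a hyperparameter: chosen by us, not estimated from the data.
model = Ridge(alpha=1.0).fit(X, y)

# The coefficients are ordinary parameters: estimated from the data.
print(model.coef_, model.intercept_)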
17
votes
2 answers
Cross validation and parameter optimization
I have a question about parameter optimization when using 10-fold cross-validation.
I want to ask whether the parameters should be fixed or not during each fold's model training, i.e. (1) select one set of optimized parameters for every…

Kevin
- 173
- 1
- 1
- 4
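The usual resolution is to tune inside each training fold (nested cross-validation), so the selected values may differ from fold to fold. A hedged scikit-learn sketch (the estimator, grid, and dataset are illustrative only):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Inner loop: pick C and gamma on each outer training fold only.
inner_search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)

# Outer loop: 10-fold CV whose held-out folds never influence the tuning.
outer_cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(inner_search, X, y, cv=outer_cv)
print(scores.mean())
```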
15
votes
0 answers
Asymptotic property of tuning parameter in penalized regression
I'm currently working on asymptotic properties of penalized regression. I've read a myriad of papers by now, but there is an essential issue that I cannot get my head around.
To keep things simple, I'm going to look at the minimization…

Nick Sabbe
- 12,119
- 2
- 35
- 43
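As a reference point (an assumption about the setup, since the excerpt is cut off), penalized-regression asymptotics of this kind typically concern objectives of the form

$$
\hat\beta_n = \arg\min_\beta \; \frac{1}{2n}\lVert y - X\beta\rVert_2^2 + \sum_{j=1}^{p} p_{\lambda_n}(|\beta_j|),
$$

with the lasso corresponding to $p_{\lambda_n}(t) = \lambda_n t$, and the results hinge on how the tuning parameter $\lambda_n$ is allowed to scale with $n$.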
13
votes
2 answers
Random Forest: what if I know a variable is important
My understanding is that the random forest randomly picks mtry variables to build each decision tree. So if mtry=ncol/3 then each variable will be used on average in 1/3 of the trees, and 2/3 of the trees will not use it.
But what if I know that a…

Benoit_Plante
- 2,461
- 4
- 18
- 25
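For concreteness, the corresponding knob in scikit-learn is `max_features` (there it limits the candidate variables considered at each split); a minimal sketch with synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=9, random_state=0)

# max_features plays the role of mtry: here roughly ncol/3 of the 9 features.
rf = RandomForestClassifier(n_estimators=200, max_features=3, random_state=0).fit(X, y)
print(rf.feature_importances_)
```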
11
votes
5 answers
Fitting SIR model with 2019-nCoV data doesn't converge
I am trying to calculate the basic reproduction number $R_0$ of the new 2019-nCoV virus by fitting a SIR model to the current data. My code is based on https://arxiv.org/pdf/1605.01931.pdf, p.…

vonjd
- 5,886
- 4
- 47
- 59
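A minimal sketch of the kind of SIR fit involved (synthetic data rather than the 2019-nCoV counts; beta and gamma are the standard transmission and recovery rates, and the population size is made up):

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

N = 1_000_000  # assumed population size (illustrative)

def sir(y, t, beta, gamma):
    S, I, R = y
    return [-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I]

def infected(t, beta, gamma, i0=1.0):
    # Integrate the SIR system and return the infected compartment over time.
    sol = odeint(sir, [N - i0, i0, 0.0], t, args=(beta, gamma))
    return sol[:, 1]

# Synthetic "observed" infected counts generated with known rates plus noise.
t_obs = np.arange(0, 60)
i_true = infected(t_obs, 0.4, 0.1)
obs = i_true * np.exp(np.random.default_rng(0).normal(scale=0.05, size=t_obs.size))

# Fit beta and gamma by least squares; R0 is their ratio.
(beta_hat, gamma_hat), _ = curve_fit(infected, t_obs, obs, p0=[0.5, 0.2])
print("R0 estimate:", beta_hat / gamma_hat)
```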
11
votes
2 answers
gamma parameter in xgboost
I came across one comment in an xgboost tutorial. It says "Remember that gamma brings improvement when you want to use shallow (low max_depth) trees".
My understanding is that higher gamma means higher regularization. If we have deep (high max_depth)…

Salty Gold Fish
- 377
- 1
- 2
- 7
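For reference, `gamma` (alias `min_split_loss`) is the minimum loss reduction required to make a further split on a leaf; a minimal sketch of passing it alongside max_depth, assuming the xgboost Python package and synthetic data:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500) > 0.5).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",
    "max_depth": 3,   # shallow trees
    "gamma": 1.0,     # minimum loss reduction required to split a leaf further
}
booster = xgb.train(params, dtrain, num_boost_round=50)
```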
11
votes
3 answers
Understanding parameter as a random variable in Bayesian statistics
If I understand correctly, in Bayesian statistics, a parameter is a random variable. When estimating the parameter, a prior distribution is combined with the data to yield a posterior distribution.
Question:
Is every data point (in the sample as…

Richard Hardy
- 54,375
- 10
- 95
- 219
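A minimal worked example of "prior combined with data yields posterior", using the conjugate Beta–Binomial case (the numbers are illustrative):

```python
from scipy.stats import beta

# Prior over the unknown success probability theta: Beta(2, 2).
a_prior, b_prior = 2, 2

# Observed data: 7 successes and 3 failures in 10 trials.
successes, failures = 7, 3

# Conjugacy: the posterior is Beta(a + successes, b + failures).
posterior = beta(a_prior + successes, b_prior + failures)
print(posterior.mean())  # posterior mean of theta: 9/14 ≈ 0.643
```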
10
votes
1 answer
Parametrizing the Behrens–Fisher distributions
"On the Behrens–Fisher Problem: A Review" by Seock-Ho Kim and Allen S. Cohen
Journal of Educational and Behavioral Statistics, volume 23, number 4, Winter, 1998, pages 356–377
I'm looking at this thing and it says:
Fisher (1935, 1939) chose the…

Michael Hardy
- 7,094
- 1
- 20
- 38
10
votes
1 answer
What does the cost (C) parameter mean in SVM?
I am trying to fit an SVM to my data. My dataset contains 3 classes and I am performing 10-fold cross validation (in LibSVM):
./svm-train -g 0.5 -c 10 -e 0.1 -v 10 training_data
The help thereby states:
-c cost : set the parameter C of C-SVC,…

Md. Abid Hasan
- 233
- 1
- 2
- 8
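In the C-SVC formulation, C weights the penalty on margin violations relative to margin width: a larger C punishes misclassified or margin-violating training points more heavily (less regularization), a smaller C tolerates them in exchange for a wider margin. A hedged sketch of the same knob through scikit-learn's SVC (which wraps libsvm), with settings roughly matching the command above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# RBF kernel, gamma=0.5, C=10, 10-fold cross validation.
clf = SVC(kernel="rbf", gamma=0.5, C=10.0)
print(cross_val_score(clf, X, y, cv=10).mean())
```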
10
votes
2 answers
Fisher information matrix determinant for an overparameterized model
Consider a Bernoulli random variable $X\in\{0,1\}$ with parameter $\theta$ (probability of success). The likelihood function and Fisher information (a $1 \times 1$ matrix) are:
$$
\begin{align}
\mathcal{L}_1(\theta;X) &= p(\left.X\right|\theta) =…

Tyler Streeter
- 1,035
- 9
- 21
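For reference, in the standard one-parameter Bernoulli case the quantities in the excerpt work out to

$$
\log \mathcal{L}_1(\theta;X) = X\log\theta + (1-X)\log(1-\theta),
\qquad
\mathcal{I}(\theta) = -\operatorname{E}\!\left[\frac{\partial^2}{\partial\theta^2}\log \mathcal{L}_1(\theta;X)\right] = \frac{1}{\theta(1-\theta)},
$$

so the determinant of the $1\times 1$ matrix is simply $1/(\theta(1-\theta))$; the question concerns what happens when the same model is written with redundant parameters.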
9
votes
1 answer
What exactly makes a model "overparameterized"?
I often read that training "overparameterized" networks works well in practice, and perhaps no one knows exactly why yet. However, when I look at the number of samples and parameters many NNs use, they are still fitting with more data than…

Josh
- 3,408
- 4
- 22
- 46
9
votes
1 answer
Algorithms for weighted maximum likelihood parameter estimation
What are the computational or algorithmic considerations for weighted maximum likelihood parameter estimation?
That is, I want to get
$$
\theta^* = \arg\max\limits_\theta \sum_i w_i \log(\mathcal{L}(\theta|x_i))
$$
assuming we have a weight $w_i$…

user3658307
- 1,754
- 1
- 13
- 26
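A minimal sketch of one common approach: hand the weighted negative log-likelihood directly to a generic optimizer (shown here for a normal model; the data and weights are made up):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=200)
w = rng.uniform(0.5, 1.5, size=200)  # illustrative observation weights

def neg_weighted_loglik(params):
    # Parameterize sigma on the log scale to keep it positive.
    mu, log_sigma = params
    return -np.sum(w * norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)))

res = minimize(neg_weighted_loglik, x0=[0.0, 0.0], method="L-BFGS-B")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)
```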
9
votes
1 answer
Accounting for discrete or binary parameters in Bayesian information criterion
BIC penalizes based on the number of parameters. What if some of the parameters are some sort of binary indicator variables? Do these count as full parameters? But I can combine $m$ binary parameters into one discrete variable that takes values in…

highBandWidth
- 2,092
- 2
- 21
- 34
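For reference, with $\hat{\mathcal{L}}$ the maximized likelihood, $k$ the number of parameters, and $n$ the sample size,

$$
\mathrm{BIC} = k\ln n - 2\ln\hat{\mathcal{L}},
$$

so how the binary indicators are counted toward $k$ changes the penalty directly.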