Questions tagged [parameterization]

For questions about how to parameterize some statistical model, or comparisons between different ways to parameterize.

297 questions
47
votes
8 answers

Would a Bayesian admit that there is one fixed parameter value?

In Bayesian data analysis, parameters are treated as random variables. This stems from the Bayesian subjective conceptualization of probability. But do Bayesians theoretically acknowledge that there is one true fixed parameter value out in the 'real…
ATJ
  • 1,711
  • 1
  • 15
  • 20
31
votes
1 answer

For which distributions are the parameterizations in BUGS and R different?

I have found some distributions for which BUGS and R have different parameterizations: Normal, log-Normal, and Weibull. For each of these, I gather that the second parameter used by R needs to be inverse transformed (1/parameter) before being used…
David LeBauer
  • 7,060
  • 6
  • 44
  • 89
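For the Normal, for instance, R's `dnorm` takes a standard deviation while BUGS's `dnorm` takes a precision, τ = 1/σ². A minimal Python sketch of the conversion (helper names are mine, for illustration only):

```python
def sd_to_precision(sd):
    """Convert R's standard deviation to a BUGS precision: tau = 1/sd^2."""
    return 1.0 / sd**2

def precision_to_sd(tau):
    """Convert a BUGS precision back to R's standard deviation."""
    return 1.0 / tau**0.5
```

Note the transform is 1/σ² (inverse variance), not 1/σ, so going R → BUGS means squaring the standard deviation before inverting.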
19
votes
5 answers

What's in a name: hyperparameters

So in a normal distribution, we have two parameters: mean $\mu$ and variance $\sigma^2$. In the book Pattern Recognition and Machine Learning, there suddenly appears a hyperparameter $\lambda$ in the regularization terms of the error function. What…
cgo
  • 7,445
  • 10
  • 42
  • 61
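The distinction the question is after: μ and σ² are estimated from the data, while a hyperparameter such as the regularization weight λ is fixed before fitting. A minimal one-dimensional ridge sketch (no intercept; names are mine):

```python
def ridge_1d(x, y, lam):
    """Minimize sum((y_i - w*x_i)^2) + lam * w^2 over the scalar w.
    Setting the derivative to zero gives w = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)
```

With `lam = 0` this is ordinary least squares; increasing `lam` shrinks the fitted coefficient toward zero, which is exactly the role the book's λ plays in the regularized error function.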
17
votes
2 answers

Cross validation and parameter optimization

I have a question about parameter optimization when I use 10-fold cross-validation. I want to ask whether the parameters should be fixed or not during every fold's model training, i.e. (1) select one set of optimized parameters for every…
Kevin
  • 173
  • 1
  • 1
  • 4
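One common convention for option (1) is to score each fixed candidate parameter setting across all folds and keep the one with the best average validation loss. A minimal sketch, with a toy constant-prediction "model" (all names are mine):

```python
def kfold_indices(n, k):
    """Yield (train, valid) index lists for k contiguous folds (illustrative;
    real folds are usually shuffled)."""
    fold = n // k
    for i in range(k):
        valid = list(range(i * fold, (i + 1) * fold))
        train = [j for j in range(n) if j not in valid]
        yield train, valid

def cv_select(candidates, loss, n, k):
    """Average each candidate's validation loss over the k folds and
    return the candidate with the smallest average."""
    def avg_loss(c):
        folds = list(kfold_indices(n, k))
        return sum(loss(c, tr, va) for tr, va in folds) / len(folds)
    return min(candidates, key=avg_loss)

# Toy demo: the "model" predicts a constant c; squared-error loss per fold.
data = list(range(10))
def sse(c, train, valid):
    return sum((data[i] - c) ** 2 for i in valid)

best = cv_select([0.0, 4.5, 9.0], sse, n=len(data), k=5)
```

Here the same candidate is held fixed across all folds while being evaluated, which is the "one set of parameters" reading of the question.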
15
votes
0 answers

Asymptotic property of tuning parameter in penalized regression

I'm currently working on asymptotic properties of penalized regression. I've read a myriad of papers by now, but there is an essential issue that I cannot get my head around. To keep things simple, I'm going to look at the minimization…
Nick Sabbe
  • 12,119
  • 2
  • 35
  • 43
13
votes
2 answers

Random Forest: what if I know a variable is important

My understanding is that a random forest randomly picks mtry variables to build each decision tree. So if mtry = ncol/3, then each variable will be used, on average, in 1/3 of the trees, and 2/3 of the trees will not use it. But what if I know that a…
Benoit_Plante
  • 2,461
  • 4
  • 18
  • 25
11
votes
5 answers

Fitting SIR model with 2019-nCoV data doesn't converge

I am trying to calculate the basic reproduction number $R_0$ of the new 2019-nCoV virus by fitting a SIR model to the current data. My code is based on https://arxiv.org/pdf/1605.01931.pdf, p.…
vonjd
  • 5,886
  • 4
  • 47
  • 59
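Whatever is causing the fit to fail, the underlying SIR dynamics are simple to reproduce; a minimal forward-Euler sketch (β, γ, initial conditions, and step size are illustrative, with the population normalized to 1 and R₀ = β/γ):

```python
def sir_step(s, i, r, beta, gamma, dt):
    """One forward-Euler step of the SIR ODEs:
    dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I."""
    ds = -beta * s * i
    di = beta * s * i - gamma * i
    dr = gamma * i
    return s + ds * dt, i + di * dt, r + dr * dt

def simulate(beta, gamma, days, dt=0.1):
    """Integrate from a small initial infected fraction; returns (S, I, R)."""
    s, i, r = 0.99, 0.01, 0.0
    for _ in range(int(days / dt)):
        s, i, r = sir_step(s, i, r, beta, gamma, dt)
    return s, i, r
```

A useful sanity check when a fit misbehaves: S + I + R should stay constant, and with R₀ > 1 the susceptible fraction should fall as the epidemic runs.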
11
votes
2 answers

gamma parameter in xgboost

I came across one comment in an xgboost tutorial. It says "Remember that gamma brings improvement when you want to use shallow (low max_depth) trees". My understanding is that higher gamma means higher regularization. If we have deep (high max_depth)…
Salty Gold Fish
  • 377
  • 1
  • 2
  • 7
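For context: in xgboost, gamma (`min_split_loss`) is the minimum loss reduction a split must achieve to be kept, entering the split-gain formula from the xgboost documentation as a flat subtraction. A sketch of that formula (G and H are the sums of first- and second-order gradients in each child, λ the L2 weight regularizer):

```python
def split_gain(g_l, h_l, g_r, h_r, lam, gamma):
    """xgboost split gain:
    0.5 * [G_L^2/(H_L+lam) + G_R^2/(H_R+lam) - (G_L+G_R)^2/(H_L+H_R+lam)] - gamma.
    A split is kept only if this value is positive."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_l, h_l) + score(g_r, h_r)
                  - score(g_l + g_r, h_l + h_r)) - gamma
```

Raising gamma makes fewer splits clear the bar, so it prunes trees regardless of `max_depth`; that is the sense in which higher gamma is stronger regularization.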
11
votes
3 answers

Understanding parameter as a random variable in Bayesian statistics

If I understand correctly, in Bayesian statistics, a parameter is a random variable. When estimating the parameter, a prior distribution is combined with the data to yield a posterior distribution. Question: Is every data point (in the sample as…
Richard Hardy
  • 54,375
  • 10
  • 95
  • 219
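The "prior combined with data" step has a closed form in the conjugate case; for a Bernoulli parameter with a Beta prior, a minimal sketch (function names are mine):

```python
def beta_bernoulli_posterior(a, b, data):
    """Conjugate update: Beta(a, b) prior + Bernoulli 0/1 data
    -> Beta(a + successes, b + failures) posterior."""
    successes = sum(data)
    return a + successes, b + len(data) - successes

def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution: a / (a + b)."""
    return a / (a + b)
```

Each observation shifts the posterior's shape parameters, which is the concrete sense in which every data point updates the distribution of the parameter.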
10
votes
1 answer

Parametrizing the Behrens–Fisher distributions

"On the Behrens–Fisher Problem: A Review" by Seock-Ho Kim and Allen S. Cohen Journal of Educational and Behavioral Statistics, volume 23, number 4, Winter, 1998, pages 356–377 I'm looking at this thing and it says: Fisher (1935, 1939) chose the…
Michael Hardy
  • 7,094
  • 1
  • 20
  • 38
10
votes
1 answer

What does the cost (C) parameter mean in SVM?

I am trying to fit a SVM to my data. My dataset contains 3 classes and I am performing 10 fold cross validation (in LibSVM): ./svm-train -g 0.5 -c 10 -e 0.1 -v 10 training_data The help thereby states: -c cost : set the parameter C of C-SVC,…
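For context, C weights the sum of slack (hinge-loss) terms against the margin term ½‖w‖² in the C-SVC objective, so a larger C penalizes training errors more heavily. A minimal linear, no-bias sketch of that objective (illustrative only):

```python
def svc_objective(w, X, y, C):
    """Soft-margin SVM objective 0.5*||w||^2 + C * sum_i max(0, 1 - y_i*<w, x_i>)
    for a linear kernel with no bias term; labels y_i in {-1, +1}."""
    margin = 0.5 * sum(wj * wj for wj in w)
    hinge = sum(max(0.0, 1.0 - yi * sum(wj * xj for wj, xj in zip(w, xi)))
                for xi, yi in zip(X, y))
    return margin + C * hinge
```

With `-c 10`, each unit of slack costs 10 times as much as it would with `-c 1`, pushing the optimizer toward fitting the training points at the expense of a wider margin.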
10
votes
2 answers

Fisher information matrix determinant for an overparameterized model

Consider a Bernoulli random variable $X\in\{0,1\}$ with parameter $\theta$ (probability of success). The likelihood function and Fisher information (a $1 \times 1$ matrix) are: $$ \begin{align} \mathcal{L}_1(\theta;X) &= p(\left.X\right|\theta) =…
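For the well-parameterized Bernoulli case the Fisher information works out to I(θ) = 1/(θ(1−θ)); a quick check via its definition as the expected negative second derivative of the log-likelihood (a sketch, not the overparameterized case the question goes on to consider):

```python
def fisher_bernoulli(theta):
    """Closed form: I(theta) = 1 / (theta * (1 - theta))."""
    return 1.0 / (theta * (1.0 - theta))

def fisher_by_expectation(theta):
    """E[-d^2/dtheta^2 log p(X|theta)] with log p = X*log(theta) + (1-X)*log(1-theta),
    taking the expectation over X in {0, 1}."""
    return sum(p * d2 for p, d2 in [
        (theta, 1.0 / theta**2),                 # X = 1: second-derivative term X/theta^2
        (1.0 - theta, 1.0 / (1.0 - theta)**2),   # X = 0: term (1-X)/(1-theta)^2
    ])
```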
9
votes
1 answer

What exactly makes a model "overparameterized"?

I often read that training "overparameterized" networks works well in practice, and perhaps no one yet knows exactly why. However, when I look at the number of samples and parameters many NNs use, they are still fitting with more data than…
Josh
  • 3,408
  • 4
  • 22
  • 46
9
votes
1 answer

Algorithms for weighted maximum likelihood parameter estimation

What are the computational or algorithmic considerations for weighted maximum likelihood parameter estimation? That is, I want to get $$ \theta^* = \arg\max\limits_\theta \sum_i w_i \log(\mathcal{L}(\theta|x_i)) $$ assuming we have a weight $w_i$…
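For many models the weighted MLE has the same closed form as the unweighted one, just with weighted sufficient statistics; e.g. for a Gaussian mean (unit variance) the maximizer of the weighted log-likelihood is the weighted average. A sketch (toy case, names mine):

```python
def weighted_mle_mean(x, w):
    """argmax_mu sum_i w_i * log N(x_i | mu, 1): setting the derivative
    sum_i w_i * (x_i - mu) to zero gives the weighted mean."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
```

When no closed form exists, the same weighting carries straight through to gradient-based or EM-style optimizers: each observation's gradient contribution is simply scaled by w_i.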
9
votes
1 answer

Accounting for discrete or binary parameters in Bayesian information criterion

BIC penalizes based on the number of parameters. What if some of the parameters are some sort of binary indicator variables? Do these count as full parameters? But I can combine $m$ binary parameters into one discrete variable that takes values in…
highBandWidth
  • 2,092
  • 2
  • 21
  • 34
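For reference, BIC's penalty is k·ln(n) with k the number of free parameters, so the two counting conventions the question contrasts give very different penalties. A minimal sketch (illustrative only; it does not settle which convention is right):

```python
from math import log

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return k * log(n) - 2.0 * log_likelihood

# The two counting conventions for m binary indicator parameters:
def penalty_per_indicator(m, n):
    return m * log(n)   # each of the m indicators counted as a full parameter

def penalty_combined(n):
    return 1 * log(n)   # the 2^m joint values packed into one discrete parameter
```

The gap between the two penalties grows linearly in m, which is exactly why the question's repackaging trick seems to buy something for free.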