I'm trying to implement an event schema induction method from a paper from 2015. The authors use a generative approach to learn a language model. For this, they use a lot of probability distributions with which I'm not very experienced.
Notably, they generate a distribution from a Dirichlet prior dir(a), where the hyper-parameter a is set to 0.1.
However, when I research Dirichlet distributions, the alpha parameter is always a list of values instead of a single number. Can someone please explain what I'm understanding incorrectly about this parameter?
The literal text from the paper is:
Generate an attribute distribution from a Dirichlet prior dir(a);
Generate a head word distribution from a Dirichlet prior dir(b);
Generate a trigger distribution from a Dirichlet prior dir(g);
In a later section:
We first tuned hyper-parameters of the models on the development set. The number of slots was set to K = 35. Dirichlet priors were set to a = 0.1, b = 1 and g = 0.1. The model was learned from the whole dataset.
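My current guess, which I have not been able to confirm from the paper, is that the single number is repeated for every dimension of the distribution (a "symmetric" Dirichlet). Here is a minimal sketch of that reading in Python with NumPy. The function name sample_symmetric_dirichlet and the vocabulary size of 5 are placeholders I made up for illustration; they are not from the paper.

```python
import numpy as np


def sample_symmetric_dirichlet(concentration, size, rng):
    """Draw one distribution from a Dirichlet whose alpha vector is the
    same scalar repeated `size` times (assumed reading of dir(a))."""
    alpha = np.full(size, concentration, dtype=float)  # e.g. [0.1, 0.1, ...]
    return rng.dirichlet(alpha)


rng = np.random.default_rng(0)
vocab_size = 5  # hypothetical number of outcomes, not from the paper

# a = 0.1 and b = 1 are the values quoted from the paper; how they are
# expanded into a vector here is my assumption.
attribute_dist = sample_symmetric_dirichlet(0.1, vocab_size, rng)
head_word_dist = sample_symmetric_dirichlet(1.0, vocab_size, rng)

print(attribute_dist, attribute_dist.sum())
print(head_word_dist, head_word_dist.sum())
```

If this reading is right, each call returns a vector of vocab_size non-negative probabilities that sum to 1, and I would expect a value below 1 such as 0.1 to put most of the mass on a few entries, while 1 gives a uniform prior over distributions. Is that what the authors mean?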
I've looked at this post about the alpha value, but this didn't clear it up for me.