It is said that a feature function can represent anything: the first or last word of a sentence, a capitalized character, and so on. But how exactly can I write such features in the form $F_j(x, y)$ or $\sum_i f_j(y_{i-1}, y_i, \bar x, i)$, as explained in this tutorial? Could anyone please help instantiate one in mathematical notation?

dontloo
Lerner Zhang

2 Answers

In the simplest case, say $p(y|x,w)$ is a Gaussian distribution centered at $x$: $$p(y|x,w)=\frac{1}{Z}\exp(\langle w, F(x,y)\rangle)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y-x)^2}{2\sigma^2}\right)$$ We can then read off the feature function as $F(x,y)=(y-x)^2$, with parameter $w=-\frac{1}{2\sigma^2}$ and normalizer $Z=\sqrt{2\pi}\sigma$.

There is more than one way to construct the feature function in this case. We can also let it produce a vector, $F(x,y)=(x^2, xy, y^2)$; setting $w=\left(-\frac{1}{2\sigma^2}, \frac{1}{\sigma^2}, -\frac{1}{2\sigma^2}\right)$ then gives the same result, since $\langle w, F(x,y)\rangle=-\frac{(y-x)^2}{2\sigma^2}$.
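As a quick numerical check of the two parameterizations above, here is a minimal Python sketch. The function names and the sample values of `x`, `y`, and `sigma` are illustrative assumptions, not part of the original answer.

```python
import math

def log_gaussian(y, x, sigma):
    """Log density of a Gaussian with mean x and standard deviation sigma."""
    return -math.log(math.sqrt(2 * math.pi) * sigma) - (y - x) ** 2 / (2 * sigma ** 2)

def score_scalar(x, y, sigma):
    """<w, F(x, y)> with the scalar feature F(x, y) = (y - x)^2."""
    w = -1.0 / (2 * sigma ** 2)
    F = (y - x) ** 2
    return w * F

def score_vector(x, y, sigma):
    """<w, F(x, y)> with the vector feature F(x, y) = (x^2, xy, y^2)."""
    w = (-1.0 / (2 * sigma ** 2), 1.0 / sigma ** 2, -1.0 / (2 * sigma ** 2))
    F = (x ** 2, x * y, y ** 2)
    return sum(wi * fi for wi, fi in zip(w, F))

def log_partition(sigma):
    """log Z, where Z = sqrt(2 * pi) * sigma."""
    return math.log(math.sqrt(2 * math.pi) * sigma)

# Arbitrary sample values chosen for illustration.
x, y, sigma = 1.5, 0.3, 2.0
a = score_scalar(x, y, sigma) - log_partition(sigma)
b = score_vector(x, y, sigma) - log_partition(sigma)
c = log_gaussian(y, x, sigma)
print(a, b, c)  # the three values agree up to floating-point error
```

Both choices of $F$ and $w$ define the same distribution; only the bookkeeping differs.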


In the case that $y$ is a categorical variable with $N$ classes, consider logistic regression: $$p(y_k|x,w)=\frac{1}{Z}\exp(\langle w, F(x,y_k)\rangle)=\frac{1}{\sum_{i=1}^{N}\exp(w_i\phi)}\exp(w_k\phi)$$ Here $\phi$ is a feature computed from $x$, and the feature function $F(x,y_k)$ should return an $N$-dimensional vector with $\phi$ as the $k$-th element and 0 elsewhere, so that $\langle w, F(x,y_k)\rangle=w_k\phi$.
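The categorical construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $\phi$ is treated as a single scalar feature of $x$, classes are indexed from 0, and the names `feature_vector` and `class_probabilities` as well as the weights are hypothetical.

```python
import math

def feature_vector(phi, k, n_classes):
    """F(x, y_k): phi in position k, zeros elsewhere."""
    F = [0.0] * n_classes
    F[k] = phi
    return F

def class_probabilities(w, phi):
    """p(y_k | x, w) for every class k, computed from <w, F(x, y_k)>."""
    n = len(w)
    scores = [
        sum(wi * fi for wi, fi in zip(w, feature_vector(phi, k, n)))
        for k in range(n)
    ]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    Z = sum(exps)
    return [e / Z for e in exps]

# Hypothetical weights (one per class) and a hypothetical scalar feature.
w = [0.5, -1.0, 2.0]
phi = 0.7
probs = class_probabilities(w, phi)
print(probs)
```

Because the feature vector is zero outside position $k$, the inner product picks out exactly $w_k\phi$, which is the usual softmax score for class $k$.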

dontloo

In a CRF, for instance, as below, $X$ is a set of observed variables and $Y$ a set of target variables:

[Figure: CRF graphical model with observed variables $X$ and target variables $Y$]

Since the $X$ are all observed, a factor $\phi_i(X, Y_i)$ can be any function of them. If the factor is a logistic regression function, then the $X$ are simply the features, as is any combination or transformation of them, just as in an ordinary logistic regression task.

For instance, in the general case, suppose $X_1$ should equal $X_2$ for $Y_i$ to take a particular state $Y_c$. Then we simply define the condition $X_1=X_2$ as one feature. When it holds, the energy between this feature and $Y_c$ is low, so the probability is high.
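To make this concrete for a linear-chain CRF, here is a minimal Python sketch of indicator features in the form $f_j(y_{i-1}, y_i, \bar x, i)$ and their sentence-level sums $F_j(x, y)=\sum_i f_j(y_{i-1}, y_i, \bar x, i)$. The tag names `NAME` and `OTHER`, the `START` placeholder for $y_{-1}$, the sample sentences, and all function names are assumptions made for illustration.

```python
START = "<START>"  # hypothetical tag standing in for y_{-1} at position 0

def f_capitalized_is_name(y_prev, y_cur, x, i):
    """1 if word i starts with a capital letter and is tagged NAME."""
    return 1.0 if x[i][:1].isupper() and y_cur == "NAME" else 0.0

def f_first_word_is_name(y_prev, y_cur, x, i):
    """1 if word i is the first word of the sentence and is tagged NAME."""
    return 1.0 if i == 0 and y_cur == "NAME" else 0.0

def f_name_then_other(y_prev, y_cur, x, i):
    """Transition feature: 1 if the previous tag is NAME and the current is OTHER."""
    return 1.0 if y_prev == "NAME" and y_cur == "OTHER" else 0.0

def f_repeated_word_is_other(y_prev, y_cur, x, i):
    """Equality feature: 1 if word i equals word i-1 and is tagged OTHER."""
    return 1.0 if i > 0 and x[i] == x[i - 1] and y_cur == "OTHER" else 0.0

def global_feature(f, x, y):
    """F_j(x, y): sum of the local feature f over all positions i."""
    total = 0.0
    for i in range(len(x)):
        y_prev = y[i - 1] if i > 0 else START
        total += f(y_prev, y[i], x, i)
    return total

x = ["Alice", "met", "Bob", "yesterday"]
y = ["NAME", "OTHER", "NAME", "OTHER"]
for f in (f_capitalized_is_name, f_first_word_is_name,
          f_name_then_other, f_repeated_word_is_other):
    print(f.__name__, global_feature(f, x, y))
```

Each local feature is just an indicator that some property of the observed words holds together with a particular tag (or tag pair). A positive weight on such a feature lowers the energy, and so raises the probability, of labelings where it fires.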

Lerner Zhang