In domain adaptation under covariate shift, one approach is to weight each source-domain instance by the density ratio $\frac{p_T(x)}{p_S(x)}$ during training, where $p_S(x)$ and $p_T(x)$ are the densities of $x$ in the source and target domains, respectively. It can be shown that this density ratio is proportional to $\frac{1}{p(\delta=S|x)} - 1$, where $p(\delta=S|x)$ is the probability that an instance $x$ comes from the source domain, typically obtained by training a classifier to distinguish between the two domains.
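For reference, here is a sketch of the derivation as I understand it. It assumes the source and target samples are pooled, that $\delta \in \{S, T\}$ labels the domain of each instance, and that $p(\delta=S)$ and $p(\delta=T)$ are the domain proportions in the pooled sample (this notation is mine). By Bayes' rule:

$$
\frac{p_T(x)}{p_S(x)}
= \frac{p(x \mid \delta=T)}{p(x \mid \delta=S)}
= \frac{p(\delta=T \mid x)\,p(x)\,/\,p(\delta=T)}{p(\delta=S \mid x)\,p(x)\,/\,p(\delta=S)}
= \frac{p(\delta=S)}{p(\delta=T)} \cdot \frac{1 - p(\delta=S \mid x)}{p(\delta=S \mid x)}
= \frac{p(\delta=S)}{p(\delta=T)} \left( \frac{1}{p(\delta=S \mid x)} - 1 \right)
$$

The factor $\frac{p(\delta=S)}{p(\delta=T)}$ does not depend on $x$, which is where the "proportional to" comes from.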
I have three questions regarding this approach.
- I have seen the weighting factor written simply as $\frac{1}{p(\delta=S|x)}$, with the "$-1$" term dropped (e.g., the link above or here, page 15). Why?
- What if the source and target domains turn out to be the same, $p_S(x) = p_T(x)$? In this case, the weighting factor $\frac{p_T(x)}{p_S(x)}$ should be 1 for all $x$, but the classifier would be confused and would return a somewhat arbitrary boundary and an arbitrary $p(\delta=S|x)$. Does that mean the approach fails in this case?
- When we train a classifier to predict $p(\delta=S|x)$, should we make sure it is calibrated?
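
To make the setup concrete, here is a minimal sketch of how I would turn the classifier output into weights. The function name `importance_weights`, the `eps` clipping, and the use of the sample-size ratio `n_source / n_target` as the proportionality constant are my own assumptions for illustration, not something taken from the references.

```python
def importance_weights(p_source, n_source, n_target, eps=1e-12):
    """Convert classifier outputs p(delta=S|x) into estimates of p_T(x)/p_S(x).

    p_source : iterable of predicted probabilities that each x is from the source
    n_source, n_target : sample sizes used to train the domain classifier
                         (assumed here to estimate p(delta=S)/p(delta=T))
    eps : hypothetical clipping constant to avoid division by zero
    """
    prior_ratio = n_source / n_target
    weights = []
    for p in p_source:
        # Clip to (0, 1) so that 1/p is finite.
        p = min(max(p, eps), 1.0 - eps)
        # Density ratio up to the constant: 1/p(delta=S|x) - 1.
        weights.append(prior_ratio * (1.0 / p - 1.0))
    return weights


# Equal sample sizes and a classifier that outputs 0.5 everywhere.
print(importance_weights([0.5, 0.5], n_source=100, n_target=100))
```

Under these assumptions, a classifier that outputs exactly $0.5$ for every $x$ with equal sample sizes yields a weight of 1 everywhere, which is the value I would expect when the two domains coincide. My second and third questions are about what happens when the predicted probabilities deviate from that ideal.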