Questions tagged [covariate-shift]
9 questions
8 votes, 1 answer
Difference between distribution shift and data shift, concept drift and model drift
Lately, I have seen these terms used interchangeably in several scenarios.
Joaquin Quiñonero in MIT press (NIPS), Dataset Shift in ML
NIPS 2021 workshop in DistShift
Model drift: Towards Data Science
Are there differences in the definitions?…

Carlos Mougan
4 votes, 1 answer
Why is importance-weighted empirical risk minimization finite-sample biased?
Classical risk minimization (RM) minimizes the expected loss over the training distribution $p_{\text{train}}(x)$,
$$\theta^*_{RM} = \arg \min_\theta E[\ell(x, \theta)]_{p_{\text{train}}}.$$
As the distribution $p_{\text{train}}$ is usually…

jhin
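For reference, the reweighting identity that importance-weighted risk minimization relies on can be sketched as follows. The symbols $p_{\text{test}}$ (the test distribution) and $w(x)$ (the importance weight) are assumptions introduced here, not taken from the excerpt, and the identity assumes $p_{\text{train}}(x) > 0$ wherever $p_{\text{test}}(x) > 0$.
$$E[\ell(x, \theta)]_{p_{\text{test}}} = \int \ell(x, \theta)\, \frac{p_{\text{test}}(x)}{p_{\text{train}}(x)}\, p_{\text{train}}(x)\, dx = E[w(x)\, \ell(x, \theta)]_{p_{\text{train}}}, \qquad w(x) = \frac{p_{\text{test}}(x)}{p_{\text{train}}(x)}.$$
With samples $x_1, \dots, x_n$ drawn from $p_{\text{train}}$, the corresponding empirical objective would be
$$\hat{\theta}_{IW} = \arg \min_\theta \frac{1}{n} \sum_{i=1}^{n} w(x_i)\, \ell(x_i, \theta).$$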
2 votes, 2 answers
How to intuit the covariate shift?
Out-of-distribution data and shifting data distributions are two types of dataset shift [1]. I understand what out-of-distribution means, but not what shifting data distributions are. In that blog, an example of OOD is given as follows:
For example,…

Lerner Zhang
1 vote, 1 answer
Covariate shift in k-means clustering
I'm trying to build a customer segmentation framework on e-commerce data. To do this, I'm using k-means clustering on variables which quantify the purchase Recency, purchase Frequency, Monetary value of the purchase (RFM segmentation) + additionally…
1 vote, 1 answer
Domain adaptation under covariate shift: estimating density ratio through a classifier
In domain adaptation under covariate shift, one approach is to weight the instances from the source domain by a factor $\frac{p_T(x)}{p_S(x)}$ during training, where $p_S(x)$ and $p_T(x)$ represent the density of $x$ in the source and target…

Lei Huang
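The classifier-based estimate of $\frac{p_T(x)}{p_S(x)}$ mentioned in the question above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: a 1-D feature, a hand-rolled logistic regression (`fit_domain_classifier`), and synthetic Gaussian source and target samples. The idea is to label source points 0 and target points 1, fit a probabilistic classifier, and turn its odds into a density ratio, corrected for the two sample sizes.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_domain_classifier(xs_source, xs_target, lr=0.5, epochs=300):
    """Fit 1-D logistic regression for P(target | x) by batch gradient descent."""
    data = [(x, 0.0) for x in xs_source] + [(x, 1.0) for x in xs_target]
    a, b = 0.0, 0.0  # slope and intercept of the logit
    n = len(data)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in data:
            err = sigmoid(a * x + b) - y
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def density_ratio(x, a, b, n_source, n_target):
    """Estimate p_T(x) / p_S(x) as classifier odds times n_source / n_target."""
    odds = math.exp(a * x + b)  # P(target | x) / P(source | x)
    return (n_source / n_target) * odds


# Synthetic data (an assumption): source ~ N(0, 1), target ~ N(1, 1).
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(1000)]
target = [random.gauss(1.0, 1.0) for _ in range(1000)]

a, b = fit_domain_classifier(source, target)
# Importance weights for the source instances, to be used during training.
weights = [density_ratio(x, a, b, len(source), len(target)) for x in source]
```

For these two Gaussians the true ratio is $\exp(x - 0.5)$, so a well-behaved fit should give a slope near 1 and an intercept near -0.5. In practice any calibrated probabilistic classifier could take the place of the hand-written logistic regression; the quality of the weights depends on how well that classifier is specified.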
1 vote, 0 answers
Should I use statistical tests when the sample size is big (over 100K)?
I'm looking for a method to identify data drift in features between two points in time.
Background:
I'm calculating the same features, on almost the same population (for example, company employees) every month. Population size is over 100K.
An…

Shay.G
0 votes, 0 answers
Can two subsamples of the same dataset have different distributions (covariate shift)?
The reason for my question is that I trained a model for binary classification; once I had obtained the results, I trained another model on these results where:
The predicted instances are used as a training set.
And the unpredicted instances are used as…

s_am
0 votes, 0 answers
Test data relevance to a model (covariate shift)
I am trying to design an algorithm that calculates the relevance of test data to a trained model.
This can be done by checking if predictor variables have a different distribution in train and test data (covariate shift).
Main idea: If…

dokondr
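One way to check whether predictor variables are distributed differently in train and test data, as the question above describes, is a per-feature two-sample Kolmogorov-Smirnov statistic. The sketch below is only an illustration under stated assumptions: the feature names, the synthetic data, and the `threshold` value are hypothetical, and a per-feature check will not catch shifts that only appear in the joint distribution.

```python
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def shifted_features(train, test, threshold):
    """Return per-feature KS statistics and the features exceeding threshold.

    `train` and `test` map feature names to lists of values.
    """
    stats = {name: ks_statistic(train[name], test[name]) for name in train}
    flagged = [name for name, d in stats.items() if d > threshold]
    return stats, flagged


# Synthetic data (an assumption): "age" is unchanged, "income" is shifted.
random.seed(0)
train = {
    "age": [random.gauss(0.0, 1.0) for _ in range(500)],
    "income": [random.gauss(0.0, 1.0) for _ in range(500)],
}
test = {
    "age": [random.gauss(0.0, 1.0) for _ in range(500)],
    "income": [random.gauss(1.0, 1.0) for _ in range(500)],
}

stats, flagged = shifted_features(train, test, threshold=0.15)
```

The statistic itself, rather than a p-value, is used as the score here: it ranges from 0 (identical empirical distributions) to 1 (no overlap), so it can serve as a rough per-feature relevance measure regardless of sample size.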
0 votes, 0 answers
What type of domain shift exists in my data?
I am trying to understand which type(s) of shift exist in my problem. I have a dataset comprising a deep neural network's (DNN) runtime latency ($y$), its architecture ($a$), and the hardware it was run on ($h$).…

saad