I am training a classification model to predict whether an item ordered in a web shop will be returned.
Can I use features which contain information from the target variable?
For example, can I create a feature holding the relative frequency with which an item with a specific ID has been returned in the past?
My professor said that this is a form of data leakage and should be avoided (if I remember correctly), but I don't see why it would have a negative impact on the prediction process.
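
To make the feature concrete, here is a minimal sketch of what I have in mind. It assumes a pandas DataFrame with hypothetical columns `item_id`, `order_date` and `returned` (1 if the order was returned, 0 otherwise); the toy data is made up purely for illustration. For each order, it computes the share of that item's earlier orders that were returned, using only rows dated strictly before it.

```python
import pandas as pd

# Made-up toy data; the column names are assumptions for illustration only.
orders = pd.DataFrame({
    "item_id": ["A", "A", "B", "A", "B", "A"],
    "order_date": pd.to_datetime([
        "2023-01-01", "2023-01-02", "2023-01-03",
        "2023-01-04", "2023-01-05", "2023-01-06",
    ]),
    "returned": [1, 0, 0, 1, 1, 0],
})

# Sort chronologically so that "earlier rows" really means "in the past".
orders = orders.sort_values("order_date").reset_index(drop=True)

grouped = orders.groupby("item_id")["returned"]

# Returns of the same item before the current order
# (running sum minus the current row's own label).
prior_returns = grouped.cumsum() - orders["returned"]

# Number of earlier orders of the same item (0 for the first one).
prior_orders = grouped.cumcount()

# Relative return frequency in the past; left as NaN when there is no history.
orders["past_return_rate"] = (prior_returns / prior_orders).where(prior_orders > 0)

print(orders)
```

In this sketch the first order of each item has no history, so the feature is missing there and would need some default value before training.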