I am facing the following problem: I have a training sample and estimate a model on that training sample. My model is simply OLS: $y_t = \alpha + \beta x_t + \varepsilon_t$. The model is estimated on the points $t \in T$, and the training sample contains well-behaved data. When forecasting with this model out of sample, some points may be poorly measured and thus take on extreme values. I would like to prevent my model from producing extreme forecasts when such poorly measured points occur. That is, for points $t \notin T$, I would like the fitted forecast $\hat{y}_t = \alpha + \beta x_t$ to be less sensitive to extreme values of $x_t$. I think the appropriate thing may be to transform the data in some way, perhaps through a $\ln$ transform, or Box-Cox?
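To make the transform idea concrete, here is a rough sketch of the kind of thing I have in mind, using a Box-Cox transform of the predictor. All the data, variable names such as `x_train` / `x_new`, and the specific numbers below are made up purely for illustration:

```python
import numpy as np
from scipy import stats, special

rng = np.random.default_rng(0)

# Well-behaved (positive) training data, made up for this sketch
x_train = rng.lognormal(mean=1.0, sigma=0.3, size=500)
y_train = 2.0 + 0.5 * x_train + rng.normal(scale=0.2, size=500)

# Box-Cox transform of the predictor; lambda estimated on the training set
x_bc, lam = stats.boxcox(x_train)

# OLS of y on the transformed predictor
X = np.column_stack([np.ones_like(x_bc), x_bc])
alpha, beta = np.linalg.lstsq(X, y_train, rcond=None)[0]

# Out of sample: one normal point and one "bad sensor" extreme
x_new = np.array([3.0, 500.0])
x_new_bc = special.boxcox(x_new, lam)   # apply the same lambda as in training
y_hat = alpha + beta * x_new_bc
print(y_hat)  # the extreme input is compressed, so the forecast is less extreme
```

Is this the right direction, or is there a more standard way to handle it?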
Let me illustrate. Imagine you have a sensor that functions normally 99.9% of the time but 0.1% of the time generates a random extreme value that has nothing to do with the true measurement. Unless your training set happens to include such a point, you cannot tailor the model around it. Still, you would like not to generate an extreme prediction out of sample when one of those 0.1% readings occurs.
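A toy simulation of that sensor scenario, together with one candidate fix I have considered (clipping, i.e. winsorizing, the incoming predictor to the range seen in clean data before applying the fitted coefficients). Again, all numbers and names here are invented for illustration, and the coefficients 2.0 and 0.5 stand in for whatever OLS estimated on the clean training sample:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Sensor works normally, except a 0.1% chance of a spurious extreme reading
x = rng.normal(loc=10.0, scale=1.0, size=n)
bad = rng.random(n) < 0.001
x[bad] = rng.uniform(1e3, 1e4, size=bad.sum())

# Naive forecast: apply the OLS coefficients to the raw reading
y_hat_naive = 2.0 + 0.5 * x

# Candidate fix: clip the incoming predictor to the range of clean data
lo, hi = np.percentile(x[~bad], [0.5, 99.5])
y_hat_clipped = 2.0 + 0.5 * np.clip(x, lo, hi)

print(y_hat_naive[bad][:3])    # extreme forecasts from the bad readings
print(y_hat_clipped[bad][:3])  # bounded forecasts
```

I am not sure whether this kind of ad hoc clipping is considered acceptable practice, which is part of what I am asking below.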
I would like to know what the standard techniques are for dealing with this problem. Please provide some references as well if possible.