As a follow-up to the topic "OLS assumption: is normality of the error term really needed?", I have a question.
When we do OLS, we usually have the normal model:
$$ y_i = a x_i + \epsilon_i \qquad [1]$$ $$ \epsilon_i \sim N(0,\sigma^2) \qquad [2]$$
with independent errors. Then OLS can be derived in a maximum-likelihood setting by maximizing the log-likelihood. Let $f(x\mid\sigma)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-x^2/(2\sigma^2)}$ be the density of a centered Gaussian with standard deviation $\sigma$. Then:
$$\log L(a,\sigma)=\sum_i \log f(y_i - a x_i \mid \sigma) \qquad [3]$$
and by maximizing this analytically one recovers ordinary OLS.
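To spell out why (a standard computation, just making the step explicit): plugging the Gaussian density into [3] gives

$$\log L(a,\sigma) = -\frac{n}{2}\log\!\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_i \left(y_i - a x_i\right)^2,$$

so for any fixed $\sigma$, maximizing over $a$ is the same as minimizing $\sum_i (y_i - a x_i)^2$, i.e. the ordinary least-squares criterion.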
Now assumption [2] can be relaxed or changed: for example, we can take $\epsilon_i \sim$ Student's $t(\nu)$, or any other parametric distribution we choose. Then [3] changes accordingly, a different $f$ is used, and in the general case the maximization probably has to be done numerically (see the sketch below).
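To make the Student's $t$ case concrete, here is a minimal sketch of what I have in mind (my own illustration with simulated data, using NumPy/SciPy, and with $\nu$ held fixed for simplicity): it maximizes [3] numerically with $f$ taken to be the $t(\nu)$ density.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

# Simulated data from model [1] with heavy-tailed (t-distributed) errors
n, a_true, scale_true, nu = 200, 2.0, 1.5, 3.0
x = rng.uniform(-3, 3, size=n)
y = a_true * x + scale_true * rng.standard_t(nu, size=n)

def neg_log_lik(params):
    # params = (a, log_scale); nu is held fixed here for simplicity
    a, log_scale = params
    scale = np.exp(log_scale)  # keep the scale parameter positive
    resid = y - a * x
    # log-likelihood [3] with f = Student's t(nu) density of the residuals
    return -np.sum(stats.t.logpdf(resid, df=nu, scale=scale))

res = optimize.minimize(neg_log_lik, x0=[0.0, 0.0], method="BFGS")
a_hat, scale_hat = res.x[0], np.exp(res.x[1])
print(a_hat, scale_hat)
```

Here I optimized only the slope and scale; one could of course also treat $\nu$ as a free parameter and estimate it jointly.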
Is it common to apply linear models with non-normally distributed errors in real scenarios? Are there some examples?
Any software recommendations would also be useful :)