When performing regression to fit a function $f(x, {\bf \beta})$ to a set of observed data $y_i(x_i)$, we seek the parameters ${\bf \beta}$ of the fitting function that minimize some measure of the deviation between $f(x_i, {\bf \beta})$ and $y_i$, relative to the measurement uncertainties.
For a non-linear least squares regression model that assumes heteroscedastic errors, with an uncertainty $\sigma_{y,i}$ on each measurement $y_i$ and $\sigma_{x,i}$ on each explanatory variable $x_i$, we are trying to maximize the likelihood (i.e. the probability of the observed data given the parameters), $$ L \propto \prod_i \exp\left[ -\frac{1}{2} \left( \frac{y_i - f(x_i + \sigma_{x,i}, {\bf \beta})}{\sigma_{y,i}} \right)^2 \right], $$
by optimizing the parameters ${\bf \beta}$ of our model. This is equivalent to minimizing the error-weighted sum of squared deviations between the fit and the measurements. But is there an explicit reason why the likelihood takes the Gaussian form displayed above? If our errors $\sigma_{y,i}$ and $\sigma_{x,i}$ were not normally distributed but instead followed some other distribution (e.g. Poisson, Lorentzian), how would the above likelihood function have to be modified?
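
For concreteness, here is a minimal sketch of what I mean by maximizing this likelihood in practice. It uses made-up data arrays and a simple linear model purely for illustration, ignores the $\sigma_{x,i}$ term, and minimizes the negative log of the Gaussian likelihood above (which is the same as minimizing the weighted sum of squared residuals) with `scipy.optimize.minimize`:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: explanatory variable x_i, measurements y_i,
# and per-point uncertainties sigma_{y,i} (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma_y = np.array([0.3, 0.2, 0.4, 0.3, 0.5])

def f(x, beta):
    # Example model chosen only for illustration: a straight line,
    # f(x, beta) = beta[0] + beta[1] * x.
    return beta[0] + beta[1] * x

def neg_log_likelihood(beta):
    # Negative log of the Gaussian likelihood above, up to an additive constant:
    # 0.5 * sum_i ((y_i - f(x_i, beta)) / sigma_{y,i})^2, i.e. chi-squared / 2.
    r = (y - f(x, beta)) / sigma_y
    return 0.5 * np.sum(r**2)

# Maximizing L is the same as minimizing -log L.
result = minimize(neg_log_likelihood, x0=[0.0, 1.0])
print("best-fit beta:", result.x)
```

(The linear model and `scipy.optimize.minimize` are just stand-ins here; the particular model and optimizer are not the point of the question.)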