I know that if the cost function is the squared error ($L^2$) or the absolute deviation ($L^1$), the solution to linear regression is the conditional mean or the conditional median, respectively. To see this in the simplest case, where a single constant $\beta$ is fitted to samples $y_1, \dots, y_n$, one can set the derivative of the cost function to 0, as follows.
\begin{align*} \frac{\mathrm{d}}{\mathrm{d}\beta} ||y-\beta||^2&=0 \implies \beta = \frac{1}{n}\sum_i y_i,\\ \frac{\mathrm{d}}{\mathrm{d}\beta} \sum_i |y_i-\beta| &= 0 \implies \sum_i \text{sgn}(y_i-\beta) =0 \implies \beta = \text{median}(y_i). \end{align*}
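As a sanity check on these two results, here is a minimal Python sketch that minimizes both costs by brute force over a grid of candidate $\beta$ values. The sample `ys`, the grid, and the helper names are made up for illustration; they are not part of any library.

```python
import statistics

def l2_cost(beta, ys):
    # Sum of squared deviations from the candidate beta.
    return sum((y - beta) ** 2 for y in ys)

def l1_cost(beta, ys):
    # Sum of absolute deviations from the candidate beta.
    return sum(abs(y - beta) for y in ys)

def argmin_on_grid(cost, ys, grid):
    # Brute-force minimizer: the grid point with the smallest cost.
    return min(grid, key=lambda b: cost(b, ys))

# Hypothetical sample (odd size, so the median is unique).
ys = [1.0, 2.0, 2.5, 7.0, 10.0]
# Candidate betas from 0.00 to 12.00 in steps of 0.01.
grid = [i / 100 for i in range(0, 1201)]

beta_l2 = argmin_on_grid(l2_cost, ys, grid)
beta_l1 = argmin_on_grid(l1_cost, ys, grid)

print(beta_l2, statistics.mean(ys))    # L2 minimizer vs. sample mean
print(beta_l1, statistics.median(ys))  # L1 minimizer vs. sample median
```

The grid search is deliberately naive: it avoids differentiating $|y_i - \beta|$ at its kink and only compares cost values directly.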
I have read that the conditional mode comes into play for a uniform cost function, i.e., $C(y, \beta) = 1$ for $|y-\beta|>\epsilon$ and $0$ otherwise, in the limit $\epsilon\to 0$. Repeating the derivative step above (each term is a pair of step functions in $\beta$, so its derivative is a pair of delta functions), I arrive at: $$\lim_{\epsilon \to 0}\sum_i \Big[-\delta\big(\beta - (y_i-\epsilon)\big) + \delta\big(\beta - (y_i+\epsilon)\big)\Big],$$ where $\delta$ is the Dirac delta.
- How do we get to the conditional mode from the last step?
- The MAP estimate is also linked to the uniform cost function, but what is the precise relationship between the MAP estimate and the above derivation?
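
For reference, here is a minimal Python sketch of the numerical behaviour I am trying to explain, under the assumption of a small discrete sample with repeated values. It evaluates the uniform cost directly on a grid, without derivatives; the sample `ys`, the grid, and the value of `eps` are made up for illustration.

```python
from collections import Counter

def uniform_cost(beta, ys, eps):
    # Each sample farther than eps from beta costs 1, otherwise 0.
    return sum(1 for y in ys if abs(y - beta) > eps)

# Hypothetical sample in which 3.0 is the most frequent value.
ys = [1.0, 3.0, 3.0, 3.0, 4.0, 8.0, 8.0]
# Candidate betas from 0.0 to 10.0 in steps of 0.1.
grid = [i / 10 for i in range(0, 101)]
# Window half-width, smaller than the gaps between distinct sample values.
eps = 0.05

beta_hat = min(grid, key=lambda b: uniform_cost(b, ys, eps))
mode = Counter(ys).most_common(1)[0][0]

print(beta_hat, mode)  # uniform-cost minimizer vs. sample mode
```

In this example, minimizing the cost amounts to maximizing the number of samples inside the window $[\beta-\epsilon, \beta+\epsilon]$, which for small $\epsilon$ selects the most frequent value.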