
I am having trouble understanding some of the ideas behind $M$-estimation and it would be great if someone could help me out. From my understanding, an $M$-estimate is simply whatever minimizes $\sum_i \rho(r_i)$ for some suitable function $\rho(\cdot)$. In the linear regression setting, we can show that computing an $M$-estimate is equivalent to solving a weighted least-squares problem with weights $w_i=\psi(r_i)/r_i$, where $\psi = \rho'$. So we typically choose an appropriate $\psi$ function that weights points with small residuals more heavily than those with large residuals, making the regression robust.
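To make the regression equivalence concrete, here is a minimal sketch of the iteratively reweighted least-squares (IRLS) view, using Huber's $\psi$ as one common choice; the function names and the MAD-based scale estimate are my own illustrative choices, not from any particular textbook:

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber's psi: linear near zero, constant beyond c (bounds each point's influence)."""
    return np.clip(r, -c, c)

def m_estimate_irls(X, y, c=1.345, tol=1e-8, max_iter=100):
    """M-estimate of regression coefficients via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary LS
    for _ in range(max_iter):
        r = y - X @ beta
        # Robust scale estimate: normalized median absolute deviation of residuals
        mad = np.median(np.abs(r - np.median(r)))
        s = mad / 0.6745 if mad > 0 else 1.0
        u = r / s
        # Weights w_i = psi(u_i) / u_i, taking w_i = 1 at u_i = 0 (the limit)
        w = np.ones_like(u)
        nz = u != 0
        w[nz] = huber_psi(u[nz], c) / u[nz]
        # Weighted least-squares step: minimize sum_i w_i (y_i - x_i' beta)^2
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

With Huber's $\psi$, points whose standardized residual exceeds $c$ get weight $c/|u_i| < 1$, so a gross outlier barely moves the fit, while ordinary least squares would be pulled toward it.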

My problem is that I have trouble interpreting $M$-estimation in a non-regression setting. For example, if I am looking at a distribution (the example in my course notes uses a $t$-distribution location-scale model with location parameter $\mu$) and I express the estimate from MLE as an $M$-estimate, how can I interpret the $\psi$ function? Can I still view it as a mechanism for weighting certain observations (higher weight for observations close to the mean, lower weight for those farther from it)?
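For the $t$ location case specifically, the weighting interpretation does carry over. Assuming unit scale and $\nu$ degrees of freedom, the negative log-likelihood gives $\rho(r) = \frac{\nu+1}{2}\log(1 + r^2/\nu)$ with $r = x - \mu$, so $\psi(r) = \rho'(r) = (\nu+1)\,r/(\nu + r^2)$ and the implied weight $w(r) = \psi(r)/r = (\nu+1)/(\nu + r^2)$ decreases in $|r|$. A small sketch (function names are my own):

```python
import numpy as np

def t_psi(r, nu=3.0):
    """psi for the t location model: derivative of rho(r) = ((nu+1)/2) log(1 + r^2/nu)."""
    return (nu + 1.0) * r / (nu + r**2)

def t_weight(r, nu=3.0):
    """Implied weight w(r) = psi(r)/r = (nu+1)/(nu + r^2), decreasing in |r|."""
    return (nu + 1.0) / (nu + r**2)
```

Note that this $\psi$ is redescending: it peaks at $|r| = \sqrt{\nu}$ and tends to zero as $|r| \to \infty$, so extreme observations get vanishing weight in the likelihood equations for $\mu$, which is exactly the "downweight points far from the center" mechanism from the regression setting.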

asked by user153009, edited by kjetil b halvorsen

0 Answers