I am looking for advanced theory on M-estimation.
Suppose $X_1,\dots,X_n$ come from some parametric family. They are independent but not identically distributed: all of them share a common parameter $\theta$, and each $X_i$ has its own parameter $a_i$ (not the same for all observations). I am only interested in the common parameter $\theta$.
The estimation procedure is (after getting rid of the $a_i$ in some way) $$\hat{\theta}=\arg\min_{\theta} f(X;\theta)$$
If I want to use the M-estimation idea, can I define the true parameter $\theta_0$ as $$\theta_0=\arg\min_{\theta} E f(X;\theta)$$?
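To make the setup concrete, here is a minimal simulation sketch of the kind of situation I have in mind. Everything specific in it is an assumption chosen for illustration and is not my actual problem: the model $X_i \sim N(\theta, a_i^2)$, the bounded nuisance scales $a_i$, the squared-error criterion, and the function names `criterion`, `theta_hat`, and `simulate`. In this toy case the minimizer of the sample criterion is the sample mean, and the minimizer of the expected criterion is $\theta$ itself.

```python
import numpy as np

def criterion(x, theta):
    """Hypothetical sample criterion f(X; theta): mean squared deviation from theta."""
    return np.mean((x - theta) ** 2)

def theta_hat(x):
    """Minimizer of the criterion above; for squared loss this is the sample mean."""
    return np.mean(x)

def simulate(n, theta_true=2.0, seed=0):
    """Draw independent, non-identically distributed X_i and return theta_hat."""
    rng = np.random.default_rng(seed)
    # Nuisance scales a_i, one per observation (assumed bounded for this toy example).
    a = rng.uniform(0.5, 2.0, size=n)
    # X_i ~ N(theta_true, a_i^2): common location, observation-specific scale.
    x = rng.normal(loc=theta_true, scale=a)
    return theta_hat(x)

# Print the absolute estimation error for increasing sample sizes.
for n in (100, 10_000, 1_000_000):
    print(n, abs(simulate(n) - 2.0))
```

In this toy example the error should shrink as $n$ grows, but that relies on the $a_i$ being bounded and on the criterion being a simple average; my question is about which conditions make this work in general.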
My question is whether $\hat{\theta}\rightarrow\theta_0$ in some mode of convergence (for example, in probability or almost surely) as $n\to\infty$. If so, what are the regularity conditions? Please provide some references.
Thanks!