(ML as in Maximum Likelihood and MAP as in Maximum A Posteriori)
I'm working through a course book on my own, and without peers to talk to I'm turning to Stack Exchange with these rather rudimentary questions. I can't tell if I'm overthinking things or missing something obvious.
- MAP/ML-based classification/inference is widely accepted despite unrealistic assumptions. Why?
Here I assume the unrealistic assumptions are that we can model the feature distribution and the source (class) distribution at all, i.e., treat them as random variables. That brings us to why the approach is widely accepted. First, random variables let us work within a statistical framework, which is convenient. Second, it works well in practice, which we can demonstrate with minimum error rates, etc.
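To make sure we're talking about the same thing, this is the decision rule I have in mind (standard notation, not necessarily the book's):

$$\hat{\omega}_{\text{MAP}} = \arg\max_{\omega} p(\omega \mid x) = \arg\max_{\omega} p(x \mid \omega)\,p(\omega), \qquad \hat{\omega}_{\text{ML}} = \arg\max_{\omega} p(x \mid \omega),$$

so ML is just MAP with a uniform prior over the classes.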
The alternative core unrealistic assumption is IID, but as I understand it, MAP/ML doesn't necessarily require IID? Just because it's convenient to sum log-likelihoods doesn't mean we have to... but is that actually the right answer? We basically always assume IID in practice, so maybe that's the core assumption, rather than the fact that we model our spaces as random variables?
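What I mean by "convenient": under IID the joint likelihood factorizes, so taking the log turns a product into a sum (the standard derivation, nothing book-specific):

$$\log p(x_1, \dots, x_n \mid \omega) = \log \prod_{i=1}^{n} p(x_i \mid \omega) = \sum_{i=1}^{n} \log p(x_i \mid \omega).$$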
- Significant effort is spent on learning a distribution for each class. What are two main practical problems? How do we mitigate them?
First, having enough data, because we're trying to approximate the true class-conditional distribution with one estimated from our finite training data. Second, the calculations themselves, and this is where I think I should mention IID: assuming IID is a mitigation of the tricky-calculation problem, since it reduces an intractable joint likelihood to the sum of log-likelihoods above.
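To test my own understanding, here's a minimal sketch of the kind of classifier I have in mind: Gaussian class-conditionals fitted by ML per class, with independence assumed so the log-likelihoods simply add. This is my own toy construction, not from the book:

```python
import numpy as np

def fit_gaussians(X, y):
    """ML fit of a per-class, per-feature Gaussian (means, variances, priors)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        # ML estimates: sample mean and (biased) sample variance per feature;
        # a tiny epsilon guards against zero variance
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return params

def log_gaussian(x, mu, var):
    """Per-feature Gaussian log-density."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def predict(x, params, use_prior=True):
    """MAP decision if use_prior=True, plain ML decision otherwise."""
    scores = {}
    for c, (mu, var, prior) in params.items():
        # Independence assumption: the joint log-likelihood is a sum
        # of per-feature log-likelihoods
        ll = log_gaussian(x, mu, var).sum()
        scores[c] = ll + (np.log(prior) if use_prior else 0.0)
    return max(scores, key=scores.get)

# Toy usage: two well-separated 2-D Gaussian classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
params = fit_gaussians(X, y)
print(predict(np.array([2.5, 2.5]), params))  # expect class 1
```

The data problem shows up here too: with few samples per class the variance estimates become unreliable, and a full covariance model in higher dimensions would need far more data, which is exactly why the independence assumption helps.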
What do you think? Am I on the right track?