I have two related questions about the standard error of the mean (S.E.M.), inspired by this question. Some of my assumptions, detailed below, may be incorrect and may be the source of my confusion.
When the population SD is unknown, we can substitute the sample SD for it and estimate the S.E.M. as the sample SD divided by the square root of the sample size.
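For concreteness, this is the calculation I have in mind, written as a minimal Python sketch. The data values are made up for illustration, and `sem_from_sample` is just a name I picked, not a library function.

```python
import math
import statistics


def sem_from_sample(sample):
    """Estimate the standard error of the mean as (sample SD) / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, uses the n - 1 denominator
    return s / math.sqrt(n)


# Made-up measurements, purely illustrative.
sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.7, 5.5, 4.9]
print("estimated S.E.M.:", sem_from_sample(sample))
```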
What if the SD of the random sample turns out to be a poor estimate of the population SD? How would you even know this was the case with real-life data, where the population SD is unknown? Wouldn't the resulting S.E.M. be invalid?
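To show what I mean by a "poor estimate", here is a rough simulation sketch. It assumes a normal population with a known SD of 10 and samples of size 30, all of which are arbitrary choices of mine, and `sample_sds` is a hypothetical helper. It simply shows how much the sample SD can vary from one random sample to the next.

```python
import random
import statistics


def sample_sds(n, sigma, trials, seed=0):
    """Draw `trials` samples of size n from Normal(0, sigma); return each sample's SD."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.stdev([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]


# Assumed population SD of 10 and sample size 30 (illustrative only).
sds = sample_sds(n=30, sigma=10.0, trials=2000)
print("population SD:     ", 10.0)
print("smallest sample SD:", min(sds))
print("largest sample SD: ", max(sds))
print("mean of sample SDs:", statistics.mean(sds))
```

In real data I would only ever see one of these sample SDs, with no way of telling where in that range it falls.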
Unless I am mistaken, all you need to get the sample SD is a single random sample. And yet, from this one sample, the S.E.M. lets you extrapolate the spread of the means of all possible samples. How is this possible? Is this at all related to the Central Limit Theorem and what we expect the sampling distribution to look like?
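Here is a small simulation sketch of the comparison I am asking about. It assumes a normal population whose mean and SD I chose myself (50 and 10), which of course I would not know with real data, and `compare_sem` is a hypothetical helper name. It computes the S.E.M. from a single sample and sets it beside the actual SD of many sample means drawn from the same population.

```python
import math
import random
import statistics


def compare_sem(n, mu, sigma, trials, seed=0):
    """Return (S.E.M. from one sample, SD of many sample means, sigma / sqrt(n))."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable

    def draw():
        return [rng.gauss(mu, sigma) for _ in range(n)]

    # What I can do in practice: one sample, one S.E.M. estimate.
    one_sample = draw()
    sem_single = statistics.stdev(one_sample) / math.sqrt(n)

    # What I can only do in a simulation: draw many samples and
    # measure the spread of their means directly.
    means = [statistics.mean(draw()) for _ in range(trials)]
    sd_of_means = statistics.stdev(means)

    return sem_single, sd_of_means, sigma / math.sqrt(n)


sem_single, sd_of_means, theoretical = compare_sem(n=30, mu=50.0, sigma=10.0, trials=5000)
print("S.E.M. from one sample:  ", sem_single)
print("SD of 5000 sample means: ", sd_of_means)
print("population SD / sqrt(n): ", theoretical)
```

The second and third numbers should agree closely, while the first depends on how well that one sample's SD happens to match the population SD, which is exactly the part I am unsure about.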