I've read the Wikipedia article about standard error.
It's clear to me that the formula $${SD}_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
gives the true standard deviation of the sample mean, given that $\sigma$ is the population standard deviation and $n$ is the sample size.
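(My understanding of where this comes from, assuming the $x_i$ are independent draws that each have variance $\sigma^2$:
$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i) = \frac{\sigma^2}{n}, \qquad {SD}_{\bar{x}} = \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}.)$$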
If we don't know the population standard deviation $\sigma$, the best we can do is estimate it with the sample standard deviation $s$, and the formula we get is:
$${SE}_{\bar{x}} = \frac{s}{\sqrt{n}}$$
The quantity above is called the standard error of the mean. Why "error"? To me it is nothing more than an estimate of the standard deviation of the sample mean. How is this quantity related to calculating some kind of error?
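Just to be concrete about what I mean, this is how I compute it in practice (Python with numpy; the data values here are made up purely for illustration):

```python
import numpy as np

# A single made-up sample (values are arbitrary, just for illustration)
sample = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0, 4.7])
n = len(sample)

s = np.std(sample, ddof=1)   # sample standard deviation (n-1 in the denominator)
se = s / np.sqrt(n)          # standard error of the mean: s / sqrt(n)

print(f"sample mean = {sample.mean():.3f}")
print(f"s  = {s:.3f}")
print(f"SE = {se:.3f}")
```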
If we have the sampling distribution of the sample mean, we know that the mean of this distribution is equal to the population mean. Maybe the point is that we don't know the population mean; then the name would make sense. If we don't know the population mean $\mu$, but we do know that the unknown $\mu$ equals the mean of the sampling distribution of the sample mean, then we could make a more precise guess about $\mu$ when that sampling distribution has a small standard deviation. Intuitively this makes sense. But why is it calculated that way? Is it possible to prove that it actually measures the error correctly, or is it just a definition of the error?
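To check my own intuition I wrote a small simulation sketch (Python with numpy; the normal population with $\mu = 10$, $\sigma = 2$, the sample size $n = 25$, the number of repetitions, and the seed are all arbitrary choices of mine). It compares the true $\sigma/\sqrt{n}$, the empirical spread of many sample means, and the average of the $s/\sqrt{n}$ estimates:

```python
import numpy as np

rng = np.random.default_rng(0)     # seed chosen arbitrarily
mu, sigma = 10.0, 2.0              # population parameters (known here, unlike in practice)
n = 25                             # sample size
reps = 100_000                     # number of repeated samples

samples = rng.normal(mu, sigma, size=(reps, n))
means = samples.mean(axis=1)                     # one sample mean per repetition
ses = samples.std(axis=1, ddof=1) / np.sqrt(n)   # one SE estimate per repetition

print(f"theoretical SD of the mean, sigma/sqrt(n): {sigma / np.sqrt(n):.4f}")
print(f"empirical SD of the sample means:          {means.std(ddof=1):.4f}")
print(f"average of the SE estimates s/sqrt(n):     {ses.mean():.4f}")
```

In my runs the three numbers come out close to each other (the average of $s/\sqrt{n}$ sits slightly below $\sigma/\sqrt{n}$), but I don't see why that makes $s/\sqrt{n}$ an "error" rather than just an estimate of a standard deviation.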
And why does making the sample size larger (increasing $n$) make the error smaller?
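(Numerically I can see the shrinking happen; this sketch uses the same made-up normal population as above and a few arbitrary sample sizes. What I'm missing is the intuition behind it, not the arithmetic.)

```python
import numpy as np

rng = np.random.default_rng(1)     # seed chosen arbitrarily
mu, sigma = 10.0, 2.0              # same made-up population as before
reps = 50_000

for n in (4, 16, 64, 256):         # sample sizes chosen arbitrarily
    means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    print(f"n = {n:4d}: empirical SD of sample means = {means.std(ddof=1):.4f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")
```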
Yes, I've already read other questions:
General method for deriving the standard error
Difference between standard error and standard deviation
How does the standard error work?
But I still can't understand it.