The following viewpoint may help your intuition:
Let there be some data distributed according to a quadratic curve:
$$y \sim \mathcal{N}(\mu = a+bx+cx^2, \sigma^2 = 10^{-3})$$
For instance with $x \sim \mathcal{U}(0,1)$, $a=0.2$, $b=0$ and $c=1$. Then a linear fit and a quadratic polynomial fit will have very different coefficients for the linear term.

set.seed(1)

# simulate data from the quadratic model y = 0.2 + 0*x + 1*x^2 + noise
x <- runif(100, 0, 1)
y <- rnorm(100, mean = 0.2 + 0*x + 1*x^2,
           sd = 10^-1.5)                      # sd = 10^-1.5, i.e. variance = 10^-3

plot(x, y, ylim = c(0, 1.5),
     pch = 21, col = 1, bg = 1, cex = 0.7)

# fit a linear model and a (raw) quadratic polynomial model
mod1 <- lm(y ~ x)
mod2 <- lm(y ~ poly(x, 2, raw = TRUE))

# grid for drawing the fitted curves
xs <- seq(0, 10, 0.01)
lines(xs, predict(mod1, newdata = list(x = xs)), lty = 2)
lines(xs, predict(mod2, newdata = list(x = xs)), lty = 1)

legend(0, 1.5, c("y = 0.009 + 1.023 x", "y = 0.193 + 0.016 x + 0.994 x^2"),
       lty = c(2, 1))
Correlation
The reason is that the regressors $x$ and $x^2$ are correlated.
The coefficient estimates computed with a linear regression are not a simple correlation (a perpendicular projection onto each regressor separately):
$$\hat{\beta} \neq \alpha = \mathbf{X^t} y$$
(this would give coefficients $\alpha_1$ and $\alpha_2$ in the image below, and these coordinates/coefficients/correlations do not change when you add or remove other regressors)
Using the correlation/projection $\mathbf{X^t}y$ is wrong, because if the vectors in $\mathbf{X}$ are correlated, then some of them overlap. The overlapping part would be counted twice, and the predicted value $\hat{y} = \mathbf{X}\alpha$ would be too large.
For this reason there is a correction with the term $(\mathbf{X^t}\mathbf{X})^{-1}$, which accounts for the overlap/correlation between the regressors. This may become clear in the image below, which stems from this question: Intuition behind $(X^TX)^{-1}$ in closed form of w in Linear Regression

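As a rough numerical sketch of this difference (reusing x and y from the simulation above; the regressors are not normalized here, so only the contrast between the two formulas matters):

# design matrix with intercept, linear and quadratic term
X <- cbind(1, x, x^2)
# separate projections / correlations with each regressor
alpha <- t(X) %*% y
alpha
# least-squares solution, corrected for the overlap between the regressors
beta <- solve(t(X) %*% X) %*% t(X) %*% y
beta     # (essentially) the same values as coef(mod2)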
Intuitive view
So the regressors $x$ and $x^2$ both correlate with the data $y$, and each of them on its own can express the variation in the dependent data. But when we use them together, we do not add them according to their separate individual effects (their correlations with $y$), because that would be too much.
If we use both $x$ and $x^2$ in the regression, then the coefficient of the linear term $x$ should be very small, since that matches the true relation (where $b = 0$).
However, when the quadratic term $x^2$ is not included in the regression (or when a bias is otherwise introduced in the coefficient of the quadratic term), the coefficient of $x$, which correlates somewhat with $x^2$, will partly take over (correct for) that role, and the value of the estimate for the coefficient of the linear term will change.
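To see this "taking over" numerically, one can reuse the simulated data from above; gamma is just a name introduced here for the slope of the auxiliary regression of $x^2$ on $x$ (this uses the standard omitted-variable identity and is not part of the original script):

# x and x^2 are strongly correlated on the interval (0, 1)
cor(x, x^2)
# auxiliary regression: how much of x^2 the linear term can absorb
gamma <- coef(lm(I(x^2) ~ x))["x"]
# omitted-variable identity: dropping x^2 shifts the coefficient of x
# by gamma times the coefficient of x^2
coef(mod2)[2] + gamma * coef(mod2)[3]
coef(mod1)["x"]   # matches the value above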
See also: