First, let me summarize the principle. Let $\boldsymbol{y}$ be the data (sample) and $p(\boldsymbol{y}|\boldsymbol{\theta})$ be the distribution of $\boldsymbol{y}$, parameterized by $\boldsymbol{\theta}$. Let us take a prior $p(\boldsymbol{\theta})$ for $\boldsymbol{\theta}$. After observing the sample, with likelihood (the joint distribution of the sample) $p(\boldsymbol{y}|\boldsymbol{\theta})$, also denoted by $L(\boldsymbol{\theta}|\boldsymbol{y})$, the knowledge about $\boldsymbol{\theta}$ is updated using Bayes' Theorem:
\begin{equation}\label{Ch2 Bayesian Theorem}
p(\boldsymbol{\theta}|\boldsymbol{y})=\dfrac{L(\boldsymbol{\theta}|\boldsymbol{y})p(\boldsymbol{\theta})}{p(\boldsymbol{y})},
\end{equation}
where $p(\boldsymbol{y})=\int{L(\boldsymbol{\theta}|\boldsymbol{y})p(\boldsymbol{\theta})}d\boldsymbol{\theta}$.
Now, take an example. Suppose $y\sim N(\mu, 1)$ and take the prior $\mu \sim N(0, 10^6)$. This is a conjugate prior, so the posterior distribution for $\mu$ is also normal, with an analytical form. Bingo!!!
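To make this concrete: for a single observation $y$ with prior $\mu\sim N(0,\tau^2)$, $\tau^2=10^6$, completing the square gives
\begin{equation*}
\mu\,|\,y \sim N\!\left(\dfrac{\tau^2}{1+\tau^2}\,y,\ \dfrac{\tau^2}{1+\tau^2}\right),
\end{equation*}
so the posterior is available in closed form and can be sampled from directly.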
What happens if we take another prior, such as a logistic distribution? The logistic distribution is not a conjugate prior for the normal likelihood.
The problem lies in the denominator, which in general is intractable, i.e., there is no closed form for $p(\boldsymbol{y})=\int{L(\boldsymbol{\theta}|\boldsymbol{y})p(\boldsymbol{\theta})}d\boldsymbol{\theta}$.
In other words, we only know $L(\boldsymbol{\theta}|\boldsymbol{y})p(\boldsymbol{\theta})$, i.e., we only know the posterior $p(\boldsymbol{\theta}|\boldsymbol{y})$ up to a multiplicative constant, the unknown $1/p(\boldsymbol{y})$.
In summary:
If we know $p(\boldsymbol{\theta}|\boldsymbol{y})$ analytically, for example when it is a common distribution, we can simply sample from it directly. (Of course, we can also use MCMC.)
If we know $p(\boldsymbol{\theta}|\boldsymbol{y})$ only up to a multiplicative constant, we cannot sample from it directly (since we do not know the normalizing constant $p(\boldsymbol{y})$); the sketch below illustrates this situation.
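As an illustration, here is a minimal sketch in Python (using NumPy and SciPy; the synthetic data `y` and the logistic prior's location and scale are made-up choices for illustration) showing that the unnormalized log-posterior $\log L(\mu|\boldsymbol{y})+\log p(\mu)$ can be evaluated pointwise even though $p(\boldsymbol{y})$ is unknown:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.normal(loc=1.5, scale=1.0, size=50)  # synthetic sample; true mu = 1.5 (illustrative)

def log_unnormalized_posterior(mu, y):
    """log L(mu | y) + log p(mu); the constant -log p(y) is unknown and omitted."""
    log_lik = stats.norm.logpdf(y, loc=mu, scale=1.0).sum()    # likelihood: y_i ~ N(mu, 1)
    log_prior = stats.logistic.logpdf(mu, loc=0.0, scale=1.0)  # non-conjugate logistic prior
    return log_lik + log_prior
```

Working on the log scale avoids numerical underflow when the likelihood is a product of many small terms.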
The Metropolis-Hastings (MH) algorithm is a solution, for the following reason.
Recall that the acceptance probability is calculated as
\begin{equation*}
\alpha(\boldsymbol{\theta}^k,\boldsymbol{\theta}')=\min\left(\dfrac{p(\boldsymbol{\theta}'|\boldsymbol{y})\,q(\boldsymbol{\theta}^k|\boldsymbol{\theta}')}{p(\boldsymbol{\theta}^k|\boldsymbol{y})\,q(\boldsymbol{\theta}'|\boldsymbol{\theta}^k)},1\right).
\end{equation*}
It is important to note that since $p(\boldsymbol{\theta}'|\boldsymbol{y})$ and $p(\boldsymbol{\theta}^k|\boldsymbol{y})$ are known up to the same multiplicative constant $1/p(\boldsymbol{y})$, this constant cancels in the ratio:
\begin{equation*}
\dfrac{p(\boldsymbol{\theta}'|\boldsymbol{y})}{p(\boldsymbol{\theta}^k|\boldsymbol{y})}=\dfrac{L(\boldsymbol{\theta}'|\boldsymbol{y})\,p(\boldsymbol{\theta}')}{L(\boldsymbol{\theta}^k|\boldsymbol{y})\,p(\boldsymbol{\theta}^k)}.
\end{equation*}
Hence $\alpha$ is well defined and can be computed without ever evaluating $p(\boldsymbol{y})$.
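Putting this together, here is a minimal random-walk MH sketch (reusing `y` and the hypothetical `log_unnormalized_posterior` from the snippet above; the step size `step=0.5` is an arbitrary choice). With a symmetric random-walk proposal, $q(\boldsymbol{\theta}^k|\boldsymbol{\theta}')=q(\boldsymbol{\theta}'|\boldsymbol{\theta}^k)$, so the $q$ terms cancel and only the unnormalized posterior ratio is needed:

```python
def metropolis_hastings(y, n_iter=10_000, step=0.5, mu0=0.0, seed=1):
    """Random-walk MH targeting p(mu | y), known only up to a constant."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_iter)
    mu = mu0
    log_post = log_unnormalized_posterior(mu, y)
    for k in range(n_iter):
        mu_prop = mu + step * rng.normal()           # symmetric proposal: q cancels in alpha
        log_post_prop = log_unnormalized_posterior(mu_prop, y)
        # log alpha = log_post_prop - log_post; the unknown p(y) cancels in this ratio
        if np.log(rng.uniform()) < log_post_prop - log_post:
            mu, log_post = mu_prop, log_post_prop    # accept the proposal
        samples[k] = mu                              # otherwise keep the current state
    return samples

draws = metropolis_hastings(y)
print(draws[2000:].mean())  # posterior-mean estimate after discarding burn-in
```

Note that each iteration only evaluates $L(\mu|\boldsymbol{y})p(\mu)$; the intractable integral $p(\boldsymbol{y})$ never appears.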