It may be easier to understand the relationship between the probability of a Type I error and the p-value if we think about the alternative way of conducting hypothesis tests via a rejection region (which is equivalent to rejecting based on a p-value).
To keep this simple, suppose that we are considering the one-sided hypothesis test $H_0:$ $\mu \le \mu_0$ vs $H_1:$ $\mu>\mu_0$ in the case where our data is the single observation $X$, which has a $Normal(\mu,1)$ distribution. Consequently, we will use $X$ as our test statistic, which at the boundary of the null hypothesis ($\mu=\mu_0$) has a $Normal(\mu_0,1)$ distribution.
To conduct this test via a rejection region, we say that we will reject the null hypothesis if our test statistic is too large, and to make this an $\alpha$-level test we want the probability of rejecting when the null hypothesis is true to be $\alpha$. Among all values of $\mu$ allowed by $H_0$, this probability is largest at $\mu=\mu_0$, so that is the case we calibrate against. This implies that for some cutoff $k$,
\begin{align*}
P(\hbox{reject}|\hbox{$H_0$ is True}) &= \alpha\\
\Rightarrow P(X>k|\mu = \mu_0) &=\alpha\\
\Rightarrow P(X-\mu_0 >k-\mu_0 |\mu = \mu_0) &= \alpha\\
\Rightarrow P(Z >k-\mu_0) &= \alpha,\\
\end{align*}
where $Z$ is a standard normal random variable (i.e. $Z \sim Normal(0,1)$). In order for $P(Z >k-\mu_0) = \alpha$, $k$ must be selected such that $k-\mu_0$ is the $100(1-\alpha)$th percentile of the $Normal(0,1)$ distribution, which is usually denoted $Z_\alpha$. This in turn makes $k = Z_{\alpha} + \mu_0$, the $100(1-\alpha)$th percentile of $Normal(\mu_0,1)$. So, if we reject when $X-\mu_0 >Z_\alpha$ (or equivalently when $X >Z_\alpha+\mu_0$), then our test will be an $\alpha$-level test.
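As a rough numerical check of the cutoff derived above, here is a minimal sketch using only Python's standard library. The function name `critical_value` and the values $\mu_0 = 0$, $\alpha = 0.05$ are illustrative assumptions, not part of the original argument.

```python
from statistics import NormalDist

def critical_value(mu0: float, alpha: float) -> float:
    """Cutoff k such that P(X > k | mu = mu0) = alpha for X ~ Normal(mu0, 1)."""
    # Z_alpha: the point with upper-tail probability alpha under Normal(0, 1)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return mu0 + z_alpha

# Illustrative values (assumptions for this sketch)
mu0, alpha = 0.0, 0.05
k = critical_value(mu0, alpha)  # about 1.645 when mu0 = 0

# Probability of rejecting when mu = mu0; this should reproduce alpha
type_i_rate = 1 - NormalDist(mu0, 1).cdf(k)
print(k, type_i_rate)
```

Shifting `mu0` shifts `k` by the same amount, which is just $k = Z_\alpha + \mu_0$ again.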
We can see how this informs our understanding of the p-value by looking at the definition of the p-value. Let $X_{obs}$ represent our observed data (as compared to $X$, which is a random variable that represents the possible values before we actually observe the data). Then in this case, the probability of seeing data at least "as extreme" under the null hypothesis is
\begin{align*}
\hbox{p-value} &= P(X\ge X_{obs} |\mu =\mu_0) \\
&=P(X-\mu_0\ge X_{obs}-\mu_0 |\mu =\mu_0) \\
&=P(Z\ge X_{obs} -\mu_0).
\end{align*}
If we think about this probability as a function of $X_{obs}$, then we can ask, "what value of $X_{obs}$ would make our p-value equal to $\alpha$?", since $\alpha$ is the threshold the p-value must fall below for us to reject the null hypothesis. Given our last line, we see that if $X_{obs} -\mu_0 = Z_{\alpha}$ (which implies that $X_{obs} = Z_\alpha+\mu_0$) then our p-value will be $\alpha$. Because the p-value decreases as $X_{obs}$ increases, a p-value less than $\alpha$ implies that $X_{obs} >Z_\alpha+\mu_0$, and vice versa. So rejecting when the p-value is less than $\alpha$ is identical to our test conducted via a rejection region!
So, when we compare the p-value to our significance level $\alpha$, it works because of this equivalence to the test via a rejection region, and because we constructed that test by requiring that the probability of a Type I error be less than or equal to $\alpha$.
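To make the equivalence concrete, the following sketch computes both decisions for a handful of made-up observations. The helper names and the sample values are hypothetical, chosen only so that some observations fall on each side of the cutoff.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Normal(0, 1)

def p_value(x_obs: float, mu0: float) -> float:
    """P(Z >= x_obs - mu0) for the one-sided test H0: mu <= mu0."""
    return 1 - STD_NORMAL.cdf(x_obs - mu0)

def reject_by_region(x_obs: float, mu0: float, alpha: float) -> bool:
    """Reject when x_obs exceeds the cutoff Z_alpha + mu0."""
    return x_obs > STD_NORMAL.inv_cdf(1 - alpha) + mu0

def reject_by_p_value(x_obs: float, mu0: float, alpha: float) -> bool:
    """Reject when the p-value is below alpha."""
    return p_value(x_obs, mu0) < alpha

# Illustrative values (assumptions for this sketch)
mu0, alpha = 0.0, 0.05
observations = [-1.0, 0.5, 1.2, 1.6, 1.7, 2.5]

decisions = [
    (x, reject_by_region(x, mu0, alpha), reject_by_p_value(x, mu0, alpha))
    for x in observations
]
for x, by_region, by_p in decisions:
    print(f"x_obs={x:5.2f}  p={p_value(x, mu0):.4f}  region={by_region}  p-value={by_p}")
```

For every observation the two columns agree, and the decision flips between 1.6 and 1.7, on either side of the cutoff $Z_{0.05} \approx 1.645$.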
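The Type I error guarantee can also be checked by simulation. This is only a rough Monte Carlo sketch: the function `rejection_rate`, the seed, the number of trials, and the chosen values of $\mu$ are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

def rejection_rate(mu: float, mu0: float, alpha: float,
                   n_trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated X ~ Normal(mu, 1) draws that fall in the rejection region."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = mu0 + NormalDist().inv_cdf(1 - alpha)  # cutoff Z_alpha + mu0
    rejections = sum(rng.gauss(mu, 1) > k for _ in range(n_trials))
    return rejections / n_trials

# Illustrative values (assumptions for this sketch)
mu0, alpha = 0.0, 0.05

# At the boundary of H0 (mu = mu0) the rejection rate should be close to alpha
rate_at_boundary = rejection_rate(mu0, mu0, alpha)

# Deeper inside H0 (mu < mu0) the rejection rate should be smaller still
rate_inside_null = rejection_rate(mu0 - 1.0, mu0, alpha)

print(rate_at_boundary, rate_inside_null)
```

The first rate should land near $\alpha$ and the second well below it, which is the "less than or equal to $\alpha$" part of the guarantee.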
I hope that makes things clearer!