I will give it a try. Assume that the Gauss-Markov assumptions hold and that, in addition, the disturbances are normally distributed, so that we are dealing with the linear regression model:
$$
\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{u} \quad, \boldsymbol{u} \vert \boldsymbol{X} \sim {\cal N}(\boldsymbol{0}_n, \sigma^2 \boldsymbol{I}_n)
$$
You want to test $H_0: \boldsymbol{R}\boldsymbol{\beta}=\boldsymbol{r}$, where $\boldsymbol{R} \in \mathbb{R}^{r \times k}$ with $rank(\boldsymbol{R})=r$ and $\boldsymbol{r} \in \mathbb{R}^{r}$. The F-statistic is defined as
$$
F= \frac{(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})^\top[\boldsymbol{R}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{R}^\top]^{-1}(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})/r}{\hat{\sigma}^2}
$$
which is $F_{r,n-k}$ distributed under $H_0$. By definition, if $X_1 \sim \chi^2(d_1)$ and $X_2 \sim \chi^2(d_2)$, and $X_1$ and $X_2$ are independent, then $F=\frac{X_1/d_1}{X_2/d_2}$ is $F_{d_1,d_2}$ distributed.
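If you like, you can check this formula numerically before going through the proof. Below is a minimal sketch using numpy and scipy; the sample size, design matrix, and restriction matrix are illustrative choices of mine, not something implied by the question. It simulates one data set under $H_0$, computes the F-statistic from the formula above, and reads off the upper-tail p-value from the $F_{r,n-k}$ distribution.

```python
# A minimal numerical sketch (not part of the proof): simulate one data set
# under H0, compute the F-statistic from the formula above, and get the
# p-value from F(r, n-k). All concrete numbers here are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.0, 0.0])              # H0: beta_2 = beta_3 = 0 is true
y = X @ beta + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS estimate
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / (n - k)          # unbiased estimator of sigma^2

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])               # r = 2 restrictions
a = R @ beta_hat - np.zeros(2)                # R beta_hat - r
middle = R @ np.linalg.inv(X.T @ X) @ R.T
F = (a @ np.linalg.solve(middle, a) / 2) / sigma2_hat
print(F, stats.f.sf(F, 2, n - k))             # statistic and upper-tail p-value
```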
Note that an unbiased and consistent estimator for the variance of the disturbances is given by $\hat{\sigma}^2=\frac{\boldsymbol{\hat{u}}^\top\boldsymbol{\hat{u}}}{n-k}$. So, we can write:
\begin{align}
F&=\frac{(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})^\top[\boldsymbol{R}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{R}^\top]^{-1}(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})/r}{\hat{\sigma}^2}\\
&=\frac{(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})^\top[\boldsymbol{R}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{R}^\top]^{-1}(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})/r}{\frac{\boldsymbol{\hat{u}}^\top\boldsymbol{\hat{u}}}{n-k}}\\
&=\frac{d/r}{q/(n-k)}
\end{align}
where $d=(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})^\top[\sigma^2\boldsymbol{R}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{R}^\top]^{-1}(\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r})$ and $q=\frac{\boldsymbol{\hat{u}}^\top\boldsymbol{\hat{u}}}{\sigma^2}$; here $\sigma^2$ is the "true" variance of the disturbances, and the last equality follows from dividing both the numerator and the denominator by $\sigma^2$. Basically, all we have to do now is show that:
1. $d \vert \boldsymbol{X} \sim \chi^2(r)$
2. $q \vert \boldsymbol{X} \sim \chi^2(n-k)$
3. $d$ and $q$ are independent conditional on $\boldsymbol{X}$
Proof of 1
Let $\boldsymbol{a}=\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r}$. Under $H_0$, this becomes $\boldsymbol{R}\hat{\boldsymbol{\beta}}-\boldsymbol{r}=\boldsymbol{R}(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta})$. Since the sampling error $\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}$ is distributed as $\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}\vert \boldsymbol{X} \sim {\cal N}(\boldsymbol{0}_k,\sigma^2(\boldsymbol{X}^\top\boldsymbol{X})^{-1})$, we know that $\boldsymbol{a}$ is multivariate normal with mean vector $\boldsymbol{0}_r$, but we still need to calculate the covariance matrix:
\begin{align}
Var(\boldsymbol{a} \vert \boldsymbol{X})=Var(\boldsymbol{R}(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta})\vert \boldsymbol{X})=\boldsymbol{R}Var(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}\vert \boldsymbol{X})\boldsymbol{R}^\top=\boldsymbol{R}\sigma^2(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{R}^\top
\end{align}
Thus, $d$ can be written as $\boldsymbol{a}^\top [Var(\boldsymbol{a} \vert \boldsymbol{X})]^{-1}\boldsymbol{a}$, which is a quadratic form.
Furthermore, if $\boldsymbol{y} \in \mathbb{R}^{m \times 1}$ is $\cal{N}(\boldsymbol{\mu},\boldsymbol{\Sigma})$ distributed, the quadratic form $(\boldsymbol{y}-\boldsymbol{\mu})^\top\boldsymbol{\Sigma}^{-1}(\boldsymbol{y}-\boldsymbol{\mu})$ is $\chi^2(m)$ distributed.
It follows that $d \vert \boldsymbol{X} \sim \chi^2(r)$.
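As a sanity check of this claim, here is a small simulation sketch (again with illustrative numpy/scipy choices of $n$, $k$, and $\boldsymbol{R}$): holding $\boldsymbol{X}$ fixed, it redraws $\boldsymbol{u}$ many times, recomputes $d$, and compares the empirical distribution with $\chi^2(r)$ via a Kolmogorov-Smirnov test.

```python
# A simulation sketch of claim 1 (illustrative sizes, not from the derivation):
# hold X fixed, redraw u many times, and check that d looks chi-squared(r).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k, sigma2 = 100, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.0, 0.0])              # H0: R beta = 0 holds
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
XtX_inv = np.linalg.inv(X.T @ X)
V = sigma2 * R @ XtX_inv @ R.T                # Var(a | X)

d_draws = []
for _ in range(20000):
    u = rng.normal(scale=np.sqrt(sigma2), size=n)
    beta_hat = XtX_inv @ X.T @ (X @ beta + u)
    a = R @ beta_hat                          # equals R(beta_hat - beta) under H0
    d_draws.append(a @ np.linalg.solve(V, a)) # the quadratic form d

# The KS test should not reject chi^2 with r = 2 degrees of freedom.
print(stats.kstest(d_draws, "chi2", args=(2,)))
```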
Proof of 2
Remember that
\begin{align*}
q=\frac{\boldsymbol{\hat{u}}^\top\boldsymbol{\hat{u}}}{\sigma^2}&=\frac{\boldsymbol{u}^\top\boldsymbol{M}\boldsymbol{u}}{\sigma^2},
\end{align*}
where $\boldsymbol{M}=\boldsymbol{I}_n-\boldsymbol{X}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top$ is the residual-maker matrix, which is symmetric and idempotent, so that $\boldsymbol{\hat{u}}=\boldsymbol{M}\boldsymbol{y}=\boldsymbol{M}\boldsymbol{u}$ and $\boldsymbol{\hat{u}}^\top\boldsymbol{\hat{u}}=\boldsymbol{u}^\top\boldsymbol{M}^\top\boldsymbol{M}\boldsymbol{u}=\boldsymbol{u}^\top\boldsymbol{M}\boldsymbol{u}$. We have $\boldsymbol{u} \vert \boldsymbol{X} \sim {\cal N}(\boldsymbol{0}_n, \sigma^2 \boldsymbol{I}_n)$ and $rank(\boldsymbol{M})=tr(\boldsymbol{M})=n-k$. Now use the fact that if $\boldsymbol{y} \sim {\cal N}(\boldsymbol{0}_n,\sigma^2 \boldsymbol{I}_n)$ and $\boldsymbol{M} \in \mathbb{R}^{n \times n}$ is a symmetric, idempotent, and non-stochastic matrix, then the distribution of the quadratic form $\boldsymbol{y}^\top\boldsymbol{M}\boldsymbol{y}/\sigma^2$ is given by:
\begin{align}
\frac{\boldsymbol{y}^\top\boldsymbol{M}\boldsymbol{y}}{\sigma^2} \sim \chi^2(tr(\boldsymbol{M}))
\end{align}
It follows that $q \vert \boldsymbol{X} \sim \chi^2(n-k)$.
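The analogous simulation sketch for this claim (illustrative choices again): fix $\boldsymbol{X}$, redraw $\boldsymbol{u}$, and check $q$ against $\chi^2(n-k)$.

```python
# A simulation sketch of claim 2: q = u'Mu / sigma^2 should be chi^2(n-k).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k, sigma2 = 50, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # residual-maker matrix

q_draws = []
for _ in range(20000):
    u = rng.normal(scale=np.sqrt(sigma2), size=n)
    q_draws.append(u @ M @ u / sigma2)

print(np.trace(M))                                  # tr(M) = n - k = 47
print(stats.kstest(q_draws, "chi2", args=(n - k,))) # should not reject
```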
Proof of 3
Basically, conditional on $\boldsymbol{X}$, $d$ is a function of $\hat{\boldsymbol{\beta}}$ only and $q$ is a function of the residuals $\boldsymbol{\hat{u}}$ only. Now, note that:
\begin{align}
Cov(\boldsymbol{\hat{\beta}},\boldsymbol{\hat{u}}\vert \boldsymbol{X})&=E([\boldsymbol{\hat{\beta}}-E(\boldsymbol{\hat{\beta}}\vert \boldsymbol{X})][\boldsymbol{\hat{u}}-E(\boldsymbol{\hat{u}}\vert \boldsymbol{X})]^\top \vert \boldsymbol{X})\\
&=E([\boldsymbol{\hat{\beta}}-\boldsymbol{\beta}][\boldsymbol{M}\boldsymbol{u}-E(\boldsymbol{M}\boldsymbol{u}\vert \boldsymbol{X})]^\top \vert \boldsymbol{X})\\
&=E((\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{u}[\boldsymbol{M}\boldsymbol{u}-\boldsymbol{M}E(\boldsymbol{u}\vert \boldsymbol{X})]^\top \vert \boldsymbol{X})\\
&=E((\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{u}[\boldsymbol{M}\boldsymbol{u}-\boldsymbol{M}\boldsymbol{0}_n]^\top \vert \boldsymbol{X})\\
&=E((\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{u}[\boldsymbol{M}\boldsymbol{u}]^\top \vert \boldsymbol{X})\\
&=E((\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{u}\boldsymbol{u}^\top\boldsymbol{M}^\top \vert \boldsymbol{X})\\
&=(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top E(\boldsymbol{u}\boldsymbol{u}^\top\vert \boldsymbol{X})\boldsymbol{M}^\top\\
&=(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top \sigma^2\boldsymbol{I}_n\boldsymbol{M}^\top\\
&=\sigma^2(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{M}^\top\\
&=\sigma^2(\boldsymbol{X}^\top\boldsymbol{X})^{-1}(\boldsymbol{M}\boldsymbol{X})^\top\\
&=\boldsymbol{0}_{k \times n}
\end{align}
where the last line follows from the property of the residual-maker matrix that $\boldsymbol{M}\boldsymbol{X}=\boldsymbol{0}_{n \times k}$. Therefore, $\boldsymbol{\hat{\beta}}$ and $\boldsymbol{\hat{u}}$ are independently distributed conditional on $\boldsymbol{X}$ (both are linear in $\boldsymbol{u}$ and hence jointly normal given $\boldsymbol{X}$, and two jointly normal, uncorrelated random vectors are independent). Furthermore, if two random vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ are independent, then so are any measurable functions $f(\boldsymbol{x})$ and $g(\boldsymbol{y})$. Thus, $d$ and $q$ are independent conditional on $\boldsymbol{X}$.
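You can also eyeball this numerically. The sketch below (illustrative sizes) checks that $\boldsymbol{M}\boldsymbol{X}$ is the zero matrix and that the simulated cross-covariance between $\hat{\boldsymbol{\beta}}$ and $\boldsymbol{\hat{u}}$ vanishes up to Monte Carlo noise.

```python
# A quick numerical look at claim 3: MX should be the zero matrix, and the
# simulated cross-covariance between beta_hat and u_hat should be ~ zero.
import numpy as np

rng = np.random.default_rng(3)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
XtX_inv = np.linalg.inv(X.T @ X)
M = np.eye(n) - X @ XtX_inv @ X.T

print(np.abs(M @ X).max())            # MX = 0 up to floating-point rounding

betas, resids = [], []
for _ in range(20000):
    u = rng.normal(size=n)
    betas.append(XtX_inv @ X.T @ u)   # beta_hat - beta
    resids.append(M @ u)              # u_hat
betas, resids = np.array(betas), np.array(resids)
cross_cov = betas.T @ resids / len(betas)   # k x n sample cross-covariance
print(np.abs(cross_cov).max())        # close to zero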
Finally, combining these results yields the desired result $F \vert \boldsymbol{X} \sim F_{r,{n-k}}$. However, since this distribution does not depend on $\boldsymbol{X}$, the conditional distribution is identical to the unconditional distribution and we get $F \sim F_{r,{n-k}}$.
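And to tie it all together numerically: under $H_0$ the simulated F-statistics should follow $F_{r,n-k}$, which the sketch below (illustrative sizes once more) checks with a Kolmogorov-Smirnov test.

```python
# Putting everything together: under H0 the simulated F-statistics should
# follow F(r, n-k). A KS test against that distribution is one way to
# sanity-check the whole derivation (all sizes below are illustrative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.0, 0.0])              # H0: beta_2 = beta_3 = 0 is true
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r_restr = R.shape[0]
XtX_inv = np.linalg.inv(X.T @ X)
middle = R @ XtX_inv @ R.T

F_draws = []
for _ in range(20000):
    y = X @ beta + rng.normal(size=n)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    sigma2_hat = u_hat @ u_hat / (n - k)
    a = R @ beta_hat
    F_draws.append((a @ np.linalg.solve(middle, a) / r_restr) / sigma2_hat)

print(stats.kstest(F_draws, "f", args=(r_restr, n - k)))  # should not reject
```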
As you can see, you need a solid understanding of linear regression and of the distribution of quadratic forms to follow this explanation. If you want to refresh your knowledge of these topics, I can recommend Greene's "Econometric Analysis"; the appendix in particular gives a good overview of the distribution of quadratic forms and the multivariate normal distribution.
I hope that you find this answer helpful.