Using the block-inverse formula, if we write the correlation matrix
as
$$M = \left[\begin{matrix}A & B\\B^t & D
\end{matrix}\right]
$$
then the bottom-right block of the inverse correlation matrix $M^{-1}$ is the inverse of the Schur complement of $A$:
$$(D-B^tA^{-1}B)^{-1}
$$
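As a quick sanity check of the block-inverse identity, here is a minimal NumPy sketch. The data, the seed, and the block size `k` are made up for illustration; any split with $1 \le k < n$ should behave the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 samples of 4 correlated variables.
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))
M = np.corrcoef(X, rowvar=False)

k = 2  # size of the top-left block A (illustrative choice)
A, B, D = M[:k, :k], M[:k, k:], M[k:, k:]

# Inverse of the Schur complement of A ...
schur_inverse = np.linalg.inv(D - B.T @ np.linalg.inv(A) @ B)
# ... compared with the bottom-right block of the full inverse.
bottom_right = np.linalg.inv(M)[k:, k:]

print(np.allclose(schur_inverse, bottom_right))
```

The two matrices should agree up to floating-point error.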
Now assume that we break the correlation matrix into blocks of size $n-1$ and $1$, so that $B$ is a column vector of length $n-1$ and $D$ is a $1\times1$ matrix containing the entry $M_{nn}=Cor(X_n,X_n)=1$. Writing $M^{-1}_{nn}$ for the $(n,n)$ entry of $M^{-1}$, we get
\begin{align*}
M^{-1}_{nn}&=\frac{1}{1-B^tA^{-1}B}\\
1-\frac{1}{M^{-1}_{nn}}&=B^tA^{-1}B.
\end{align*}
Next, assume WLOG (see note below) that the variables involved all have variance 1 and mean 0, so the correlation matrix is also the covariance matrix. Then $A$ is the covariance matrix for $X_{1..(n-1)}$, and $B$ is the vector of covariances between $X_{1..(n-1)}$ and $X_n$.
It follows from the normal equations $A\beta=B$ that the regression coefficients for $X_n$ given $X_1,\dots,X_{n-1}$ are $\beta=A^{-1}B$,
and therefore, letting $\hat X_n=X_{1..(n-1)}\beta$ denote the least-squares fit of $X_n$ given $X_1,\dots,X_{n-1}$, we get
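\begin{align*}
1-\frac{1}{M^{-1}_{nn}} &= B^tA^{-1}B\\
&= (A^{-1}B)^tA(A^{-1}B)\\
&= \beta^tA\beta\\
&= Var(\hat{X}_n)\\
&= Cov(\hat{X}_n,X_n).
\end{align*}
The second equality uses the symmetry of $A$, and the last one holds because the residual $X_n-\hat{X}_n$ is uncorrelated with the fitted value $\hat{X}_n$.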
Since $Var(X_n)=1$ by assumption and $Cov(\hat{X}_n,X_n)=Var(\hat{X}_n)$, it follows that
$$R=Cor(\hat{X}_n,X_n)=\frac{Cov(\hat{X}_n,X_n)}{\sqrt{Var(\hat{X}_n)\,Var(X_n)}}=\sqrt{Var(\hat{X}_n)}=\sqrt{1-\frac{1}{M^{-1}_{nn}}}.$$
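The final formula can be checked numerically. The sketch below uses simulated data (the sample size, dimensions, and seed are arbitrary assumptions) and compares $R$ computed from the inverse correlation matrix with the correlation between $X_n$ and an ordinary least-squares fit that includes an intercept.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 1000 samples of 5 correlated variables;
# the last column plays the role of X_n.
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 5))

# R from the (n, n) entry of the inverse correlation matrix.
M_inv = np.linalg.inv(np.corrcoef(X, rowvar=False))
r_from_inverse = np.sqrt(1 - 1 / M_inv[-1, -1])

# R from an explicit least-squares fit of X_n on the other columns.
predictors = np.column_stack([np.ones(len(X)), X[:, :-1]])
target = X[:, -1]
coef, *_ = np.linalg.lstsq(predictors, target, rcond=None)
fitted = predictors @ coef
r_from_fit = np.corrcoef(fitted, target)[0, 1]

print(r_from_inverse, r_from_fit)
```

The two printed values should coincide up to floating-point error, since the sample correlation matrix and the least-squares fit satisfy the same algebra as the derivation above.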
Note: as @MarkStone points out, WLOG means "without loss of generality." In this case, the assumption of mean 0 and variance 1 is without loss of generality because we can recenter and rescale each variable if necessary: correlations are unchanged by shifting a variable or multiplying it by a positive constant, so the correlation matrix $M$, and hence the final result, stays the same.
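To illustrate the WLOG step, the following sketch applies arbitrary shifts and positive scale factors to simulated data and confirms that the resulting $R$ does not change. The helper name `multiple_r` and all numeric values are hypothetical, chosen only for this example.

```python
import numpy as np

def multiple_r(data):
    """R for the last column regressed on the others, via the inverse correlation matrix."""
    m_inv = np.linalg.inv(np.corrcoef(data, rowvar=False))
    return np.sqrt(1 - 1 / m_inv[-1, -1])

rng = np.random.default_rng(2)
X = rng.normal(size=(800, 4)) @ rng.normal(size=(4, 4))

# Arbitrary positive scales and shifts, one per variable (illustrative values).
scales = np.array([0.5, 3.0, 10.0, 250.0])
shifts = np.array([-7.0, 0.0, 2.5, 100.0])
X_rescaled = X * scales + shifts

print(np.isclose(multiple_r(X), multiple_r(X_rescaled)))
```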