The Wikipedia equation uses the biased maximum-likelihood estimator of the variance (divide by $n$), while, as you say, the R function uses the bias-corrected formula (divide by $n-1$).
The two formulas therefore differ only in the factor applied to the moment quotient $\sum (x_i-\bar x)^4 / \left(\sum (x_i-\bar x)^2\right)^2$: the maximum-likelihood formula on Wikipedia multiplies it by $n$, while the R function multiplies it by
$$ \frac {(n-1)^2}{n} = \frac {n^2 -2n +1}{n} = n -2 + \frac 1n$$
The paper Joanes, D. N., & Gill, C. A. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 47(1), 183-189, discusses in some detail the different formulas for sample kurtosis (and skewness). The authors write that the Wikipedia formula you mention (call it "$g_2$") is typical of older textbooks, citing Cramér's 1946 textbook. They also report that, at least at the time of writing, MINITAB and BMDP used the same formula as R/timeDate (call it "$b_2$"), while SAS, SPSS and Excel used a third formula for kurtosis (call it "$G_2$"), which you can find in the Wikipedia article under "estimation of population kurtosis".
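For reference, the three estimators can be written out side by side. This is a sketch in what I believe is the paper's notation, assuming the excess-kurtosis convention (subtracting $3$); the symbols $m_k$ and $s^2$ are introduced here for convenience:

$$ m_k = \frac 1n \sum_{i=1}^n (x_i-\bar x)^k, \qquad s^2 = \frac {1}{n-1} \sum_{i=1}^n (x_i-\bar x)^2 $$

$$ g_2 = \frac {m_4}{m_2^2} - 3, \qquad b_2 = \frac {m_4}{s^4} - 3 = \left(1-\frac 1n\right)^2 (g_2+3) - 3, \qquad G_2 = \frac {n-1}{(n-2)(n-3)}\Big[(n+1)\,g_2 + 6\Big] $$

Written this way, $b_2$ and $G_2$ are both linear functions of $g_2$ with coefficients depending only on $n$, so the three differ by a rescaling and a shift that vanish as $n$ grows.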
They run several comparisons and find:
1) In small samples, the formulas can disagree even in sign (one formula gives negative kurtosis, another positive).
2) For normal populations, the ranking in mean squared error is
$$MSE(g_2) < MSE (b_2) < MSE (G_2) $$
For sample size $100$, the MSE differences range from negligible (+1%) to small (+9%).
3) They also compare the three estimators in MSE terms for chi-square populations with various degrees of freedom and various sample sizes. The main results are:
a) For sample sizes of $100$ or more, the three formulas give very close results, irrespective of the degrees of freedom.
b) For smaller sample sizes, the ranking changes: $G_2$ does better for $1$ degree of freedom, and $b_2$ does better for $50$ or $100$ degrees of freedom.
Overall, no clear "winner" emerges from this paper.
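
To get a feel for these results, here is a minimal Python sketch (not from the paper). It assumes the excess-kurtosis convention (subtracting $3$) for all three formulas. The function names `kurtosis_estimates` and `simulate_mse` are made up for illustration and are not part of timeDate or any other package. The first part shows a five-point sample where the sign differs between formulas; the second part runs a small Monte Carlo on normal samples, where the true excess kurtosis is $0$.

```python
import random


def kurtosis_estimates(x):
    """Return (g2, b2, G2) excess-kurtosis estimates for a sample x (n >= 4)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x)  # sum of squared deviations
    s4 = sum((v - mean) ** 4 for v in x)  # sum of fourth-power deviations
    ratio = s4 / s2 ** 2                  # the "moment quotient"

    g2 = n * ratio - 3.0                  # variance divided by n
    b2 = (n - 1) ** 2 / n * ratio - 3.0   # variance divided by n - 1
    G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return g2, b2, G2


def simulate_mse(n, reps, seed=0):
    """Monte Carlo MSE of (g2, b2, G2) for standard normal samples of size n."""
    rng = random.Random(seed)
    sq_err = [0.0, 0.0, 0.0]
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for i, est in enumerate(kurtosis_estimates(sample)):
            sq_err[i] += est ** 2         # true excess kurtosis is 0
    return tuple(e / reps for e in sq_err)


if __name__ == "__main__":
    # Sign disagreement in a tiny sample:
    # g2 is about -0.95, b2 about -1.688, G2 about +0.2
    print(kurtosis_estimates([-3, -1, 0, 1, 3]))

    # Simulated MSEs for n = 10; the seed and replication count are arbitrary
    print(simulate_mse(n=10, reps=20000, seed=1))
```

For small normal samples the simulated MSEs should come out in the order given in point 2; raising `n` towards $100$ shrinks the gaps. Swapping `rng.gauss` for a chi-square generator (and the matching true kurtosis) would let you explore point 3 in the same way.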