Choosing a larger $k$ in $k$-fold cross-validation (CV) does not give you more training data. In $k$-fold CV, every one of your data points is ultimately used for training, namely $k-1$ times (once in every fold except its own), so the pool of training data is the same, whatever $k$ you choose.
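To make this concrete, here is a minimal sketch (toy data and scikit-learn's `KFold` are just my choices for illustration) that counts how often each point lands in a training split; it is always $k-1$ times, whatever $k$ is:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(-1, 1)  # 12 dummy data points, purely for illustration

for k in (3, 4, 6):
    counts = np.zeros(len(X), dtype=int)
    for train_idx, _ in KFold(n_splits=k).split(X):
        counts[train_idx] += 1          # tally how often each point is in a training split
    print(k, set(counts.tolist()))      # prints {k-1}: every point is trained on k-1 times
```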
If you look at a single run in isolation, a higher $k$ does indeed give you more training data than a lower $k$ (a fraction $(k-1)/k$ of the data), and that single model will tend to be more accurate. However, this effect is countered by what happens when you run all $k$ folds: their results are highly correlated, because the training sets overlap heavily. Granted, they are all large, but since the overall number of data points is fixed, any two of these large training sets share a fraction $(k-2)/(k-1)$ of their points, and this correlation increases the variance of the overall $k$-fold procedure in which you run all $k$ folds.
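Here is a similar sketch (same kind of toy setup, again only for illustration) of the overlap just described: the average pairwise overlap between training sets approaches 1 as $k$ grows, which is exactly why the per-fold results become more and more correlated.

```python
from itertools import combinations

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(30).reshape(-1, 1)  # 30 dummy data points, purely for illustration

for k in (3, 5, 10):
    # the k training index sets, one per fold
    train_sets = [set(tr.tolist()) for tr, _ in KFold(n_splits=k).split(X)]
    # fraction of points shared by each pair of training sets: (k-2)/(k-1)
    overlaps = [len(a & b) / len(a) for a, b in combinations(train_sets, 2)]
    print(k, round(sum(overlaps) / len(overlaps), 2))  # 0.5, 0.75, 0.89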
So the two statements you read look correct (obviously I can't know their precise context), and they do not conflict.
(Well, let's say the second statement's claim of "really" high variance seems exaggerated: the overall variance, even with very large $k$, may still be acceptable, though probably not optimal. But one cannot simply say that the variance gets smaller as $k$ gets larger, for the reason given above.)