PCA is an affine linear transformation of your data set.
Since k-means relies only on Euclidean distances and does not use correlations, the rotation and translation parts of PCA have no effect on it. What PCA then reduces to is its scaling step: whitening reduces the effect of the main components and boosts that of the error components, so I'm not surprised that it does not work too well.
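To make this concrete, here is a minimal sketch (scikit-learn, synthetic blob data, and every function choice here are my own illustrative assumptions, not part of the original answer): the k-means partition is unchanged by the rotation/translation part of PCA, but whitening changes the distances and can change the clustering.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Plain PCA keeping all components: centering + rotation only, no rescaling.
X_rot = PCA(n_components=2, whiten=False).fit_transform(X)
# Whitened PCA: additionally scales every component to unit variance.
X_white = PCA(n_components=2, whiten=True).fit_transform(X)

def cluster(Z):
    return KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)

labels_orig = cluster(X)
labels_rot = cluster(X_rot)
labels_white = cluster(X_white)

# Rotation/translation preserve Euclidean distances, so the partition is the
# same (up to label permutation, which ARI ignores); whitening is not
# distance-preserving, so the partition can differ.
print(adjusted_rand_score(labels_orig, labels_rot))    # should be 1.0
print(adjusted_rand_score(labels_orig, labels_white))  # often < 1.0
```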
Using both the original and the transformed features likely won't help much, because they have different scales: usually either the original or the new features will dominate the result. And even if the total variance of the original and the transformed features were the same, you would essentially be doing a "half PCA": the same as whitened PCA, except that instead of scaling every component to unit variance, you scale it to have "original variance + 1".
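You can verify the "original variance + 1" claim numerically. In this sketch (again using scikit-learn and synthetic data as assumed stand-ins), stacking the centered original features next to the whitened PCA features produces a data set whose principal variances are exactly the original eigenvalues plus one, so the large components still dominate, just less strongly.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=500, centers=3, random_state=1)

pca = PCA(whiten=True).fit(X)
X_white = pca.transform(X)  # each column has unit variance

# Stack centered originals next to the whitened components.
X_stacked = np.hstack([X - X.mean(axis=0), X_white])

# Eigenvalues of the stacked covariance: (lambda_i + 1) for each original
# principal component, plus zeros for the redundant directions.
eig = np.sort(np.linalg.eigvalsh(np.cov(X_stacked, rowvar=False)))[::-1]
print(eig)                          # approx. [l1 + 1, l2 + 1, 0, 0]
print(pca.explained_variance_ + 1)  # the same leading values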
If you had an approach that performed feature selection, this would not hold (k-means does not do feature selection, and feature selection usually needs training labels for good results; see the sketch below). But for k-means, this will not change much.
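As a hedged illustration of why feature selection is typically a supervised step (the data set and parameter choices are assumptions of mine), scikit-learn's `SelectKBest` scores each feature against the labels `y`, which k-means never sees:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           random_state=0)
# Scoring features requires the labels y; an unsupervised method like
# k-means has no equivalent signal to rank features by.
selector = SelectKBest(score_func=f_classif, k=4).fit(X, y)
print(selector.get_support(indices=True))  # indices of the retained features
```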
Don't try to solve problems by stacking as many tools together as possible. Understand what you need to solve, and how (and which) methods may help get you there.