I am new to statistics and this is the first time I am trying to normalise variables, so apologies for my inexperience.
My goal is to classify the landscape by cluster analysis (using kmeans in R) and see if there is a relationship between species distribution and type of the landscape.
However, some of the variables I am using (there are 10 in total) are strongly correlated, so I decided to run PCA and do the cluster analysis on the principal components. I tried this without normalising the variables before PCA (although I did standardise them with data.frame(scale(data,center=T,scale=T))), but the results didn't impress me: I observed a stronger and more interpretable relationship when I ran the cluster analysis on 4 uncorrelated and hypothetically most important variables (also standardised). So now I want to normalise the variables and rerun the PCA and cluster analysis. To make matters worse, some of the variables are far from normal, the sample size is large (n=40038), and I have no experience in transforming data.
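For reference, the workflow described above (standardise, PCA, then k-means on the component scores) can be sketched roughly as follows. This is only an illustration on simulated data, not my real data set: the variable names v1..v10, the 90% variance threshold, and the choice of 4 clusters are all placeholders I made up for the example.

```r
# Simulated stand-in for the landscape data: 10 variables built from
# 4 latent factors, so several of them are strongly correlated.
set.seed(1)
n <- 1000
base <- matrix(rnorm(n * 4), ncol = 4)
noise <- function() rnorm(n, sd = 0.3)
dat <- data.frame(
  v1 = base[, 1], v2 = base[, 1] + noise(), v3 = base[, 1] + noise(),
  v4 = base[, 2], v5 = base[, 2] + noise(),
  v6 = base[, 3], v7 = base[, 3] + noise(), v8 = base[, 3] + noise(),
  v9 = base[, 4], v10 = base[, 4] + noise()
)

# Standardise each variable to mean 0 and sd 1, as in the question.
dat_std <- data.frame(scale(dat, center = TRUE, scale = TRUE))

# PCA on the already standardised variables.
pca <- prcomp(dat_std, center = FALSE, scale. = FALSE)

# Keep enough components to reach about 90% of the variance
# (an arbitrary threshold chosen for this sketch).
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
n_pc <- which(cum_var >= 0.90)[1]
scores <- pca$x[, seq_len(n_pc), drop = FALSE]

# k-means on the retained component scores; centers = 4 is a placeholder.
km <- kmeans(scores, centers = 4, nstart = 20, iter.max = 50)
table(km$cluster)
```

The cluster labels in km$cluster can then be cross-tabulated against the species data.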
I have read that tests of normality are useless for large samples and, more generally, for deciding whether parametric methods can be used on one's data. So I am inspecting normality visually and through the values of kurtosis and skewness. For example, I have one very problematic variable with many zero values, which appears to follow a gamma distribution:
I transformed the variable to the power of 0.3 (x^0.3) and got the following results:
Skewness= 0.5006657 Kurtosis= 3.255236
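To make the check reproducible, here is a rough sketch of how such skewness and kurtosis values can be computed in base R. The data are simulated (a hypothetical zero-inflated gamma variable, not my real one), and the two helper functions are simple moment-based definitions written for this example; I am assuming kurtosis on the scale where a normal distribution scores 3, which seems consistent with the value reported above.

```r
set.seed(42)
n <- 5000

# Hypothetical variable: about 20% exact zeros, the rest gamma distributed.
x <- ifelse(runif(n) < 0.2, 0, rgamma(n, shape = 1.5, rate = 0.5))

# Moment-based skewness: third central moment over variance^(3/2).
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based (non-excess) kurtosis: a normal distribution gives about 3.
kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

# The power transformation used in the question.
x_t <- x^0.3

c(raw = skewness(x), transformed = skewness(x_t))
c(raw = kurtosis(x), transformed = kurtosis(x_t))
```

Note that the zeros stay at exactly zero after a power transformation, so the spike they form does not go away whatever exponent is used.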
I also tried other transformations and the yeo.johnson() function, but none of them gave a better result. I can see that the result is still far from normal. Nonetheless, maybe it is a fair enough approximation of a normal distribution for PCA, since this method doesn't strictly require normality? To rephrase the question: what will happen if I run PCA with this kind of variable, while the other variables are more or less normally distributed, and later use k-means to classify the principal component scores? Finally, is there perhaps a better way to transform this variable?