At what spatial scale should PCA be performed? Why do the loadings appear so different at each scale?

Question

My dataset has 6 sites. Each site has four quadrants (qi) that I sampled over 12 months to estimate species abundances. I Hellinger-transformed the data prior to the analysis.

For each quadrant I also have environmental data (temperature, salinity, pH, etc.). Ordination plots, as I understand them, are particularly useful for visualizing the similarity among species. For example, in this context, sites and seasons that are closer together in ordination space should have species assemblages that are more similar to one another than sites that are further apart. Here, the goal is to ask which features of the dataset are driving the observed divergence among points. I am trying to get a better sense of “who” is contributing to this trend by plotting explanatory vectors (i.e., species coordinates) in ordination space.

For this, I analyzed the data in two ways:

(a) SUB-SITE LEVEL: I used the dataset as is (288 data points).
(b) SITE LEVEL: I summed abundances across quadrants for each site (72 data points).

site    quadrant  sp1  sp2  sp3  sp4
site 1  q1          0    1    1    1
site 1  q2          3    0    1    0
site 1  q3         10   11    0    0
site 1  q4          1    3    1    2
site 2  q1          0    4    3    5
site 2  q2          2    0    1    7
site 2  q3          1    0    7    2
site 2  q4          1    2    2    1
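
To make the two scales concrete, here is a minimal sketch that applies the Hellinger transformation (square root of relative abundance within each row) to the table excerpt above, once per quadrant and once after summing quadrants within a site. The dictionary layout and the helper names `hellinger` and `zero_fraction` are illustrative assumptions, not part of the actual analysis, and the excerpt covers only two sites and one sampling date.

```python
import math

# Excerpt from the table above: counts for sp1..sp4, keyed by (site, quadrant).
counts = {
    ("site 1", "q1"): [0, 1, 1, 1],
    ("site 1", "q2"): [3, 0, 1, 0],
    ("site 1", "q3"): [10, 11, 0, 0],
    ("site 1", "q4"): [1, 3, 1, 2],
    ("site 2", "q1"): [0, 4, 3, 5],
    ("site 2", "q2"): [2, 0, 1, 7],
    ("site 2", "q3"): [1, 0, 7, 2],
    ("site 2", "q4"): [1, 2, 2, 1],
}


def hellinger(row):
    """Square root of relative abundances; an all-zero row stays all zero."""
    total = sum(row)
    if total == 0:
        return [0.0] * len(row)
    return [math.sqrt(x / total) for x in row]


def zero_fraction(rows):
    """Share of cells equal to zero across a list of rows."""
    cells = [x for row in rows for x in row]
    return sum(1 for x in cells if x == 0) / len(cells)


# (a) Sub-site level: one row per quadrant.
quadrant_rows = list(counts.values())

# (b) Site level: sum the quadrant counts within each site.
site_totals = {}
for (site, _quadrant), row in counts.items():
    current = site_totals.setdefault(site, [0] * len(row))
    site_totals[site] = [a + b for a, b in zip(current, row)]
site_rows = list(site_totals.values())

# Hellinger-transformed matrices that would be passed to PCA at each scale.
quadrant_hel = [hellinger(r) for r in quadrant_rows]
site_hel = {s: hellinger(r) for s, r in site_totals.items()}

print("zero fraction, quadrant level:", zero_fraction(quadrant_rows))
print("zero fraction, site level:", zero_fraction(site_rows))
for site, row in site_hel.items():
    print(site, [round(v, 3) for v in row])
```

Because the transformation is applied to row proportions, the site-level rows are not simply averages of the transformed quadrant rows, so the two matrices passed to PCA differ in more than the number of rows.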

The coloured points 1, 2, 3 and 4 indicate the four different species.

At what scale would you recommend analysing the data? The results at the two scales appear very different. The species abundances are over-dispersed, so at the quadrant level there are many more zeros. At the site level, some of those zeros go away.

Furthermore, what type of transformation is most appropriate here? Hellinger seems to be the one most often suggested for species abundance data. However, other posts indicate that a "zero-adjusted Bray-Curtis" index may be more appropriate (and there seems to be little information on how to compute it in general).
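
For reference, a minimal sketch of one way the zero adjustment is usually described: append a dummy species with the same small abundance to every sample before computing the ordinary Bray-Curtis dissimilarity, so that two empty samples are treated as identical instead of undefined. This reading, the function names, and the default `dummy=1.0` are assumptions for illustration and should be checked against the source the other posts refer to.

```python
def bray_curtis(x, y):
    """Ordinary Bray-Curtis dissimilarity; undefined (NaN) if both samples are empty."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        return float("nan")
    return numerator / denominator


def zero_adjusted_bray_curtis(x, y, dummy=1.0):
    """Bray-Curtis after appending a dummy species of abundance `dummy` to both samples.

    The dummy value is an assumed default, not a recommendation.
    """
    return bray_curtis(list(x) + [dummy], list(y) + [dummy])


# Two quadrants from the example table (site 1, q1 and q2).
q1 = [0, 1, 1, 1]
q2 = [3, 0, 1, 0]
print(bray_curtis(q1, q2), zero_adjusted_bray_curtis(q1, q2))

# Two empty samples: undefined without the adjustment, zero with it.
empty = [0, 0, 0, 0]
print(bray_curtis(empty, empty), zero_adjusted_bray_curtis(empty, empty))
```

The adjustment leaves the numerator unchanged and adds twice the dummy value to the denominator, so it matters most for sparse samples such as the zero-heavy quadrant-level rows.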

You are performing two different analyses. We can't tell you which one, if either, is appropriate until you explain what you are trying to accomplish. You do state it has something to do with "how similar t[w]o different species are," but could you explain how you conceive of and quantify "similarity"? That will be a crucial aspect of answering your question. — whuber, Jul 08 '21 at 20:40
@whuber Thank you. I have updated the question. Does that help? — Biotechgeek, Jul 08 '21 at 20:54
