-
PCA isn't an appropriate method to use with presence/absence data. Instead, consider correspondence analysis (CA, via |
-
I agree that CA or NMDS are probably more adequate methods, but nothing makes PCA unsuitable for 0/1 data. In distance terms, PCA is based on Euclidean distances, and with 0/1 data these distances are √[S₁ + S₂ − 2 × S₁∩₂], where S₁ and S₂ are the numbers of species in sampling units 1 and 2, and S₁∩₂ is the number of species occurring in both sampling units (shared species). This is a legitimate way of assessing community distances, and it is inherently used in PCA of 0/1 data. However, it is not the most efficient or informative way, especially when the numbers of species vary a lot in the data: the distances may then reflect these differences in species numbers (α diversity) rather than differences among sampling units (β diversity). CA naturally handles differences in richness (α), and there are many dissimilarity indices that do even better and can be analysed with NMDS ( As to your technical question: PCA (including |
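As a quick check of the distance formula above, here is a minimal base R sketch. The three sampling units are made-up illustration data, not from the thread, and `unit3` is deliberately a species-poor subset of `unit1` to show how a richness difference alone produces distance.

```r
# Hypothetical presence/absence table: 3 sampling units x 6 species
# (1 = present, 0 = absent). unit3 is a one-species subset of unit1.
pa <- rbind(
  unit1 = c(1, 1, 1, 1, 0, 0),
  unit2 = c(0, 1, 1, 0, 1, 0),
  unit3 = c(1, 0, 0, 0, 0, 0)
)

# Euclidean distance from species counts: sqrt(S1 + S2 - 2 * shared)
pa_euclid <- function(u, v) {
  sqrt(sum(u) + sum(v) - 2 * sum(u * v))
}

# Reference values from base R's dist()
d <- as.matrix(dist(pa, method = "euclidean"))

d12 <- pa_euclid(pa["unit1", ], pa["unit2", ])  # 4 and 3 species, 2 shared
d13 <- pa_euclid(pa["unit1", ], pa["unit3", ])  # 4 and 1 species, 1 shared

# Both comparisons should print TRUE
print(all.equal(d12, d["unit1", "unit2"]))
print(all.equal(d13, d["unit1", "unit3"]))
```

In this toy table `d12` and `d13` are equal, although units 1 and 2 differ through species turnover while unit 3 only lacks species that unit 1 has. That is the α versus β confounding described above.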
-
PCA retains the Euclidean distances of the data: the distances between sampling unit points are related to their Euclidean distances. The Euclidean distance between two quantitative vectors x and y expands to √[Σx² + Σy² − 2 Σxy]. For 0/1 data the first two sums are the numbers of species (numbers of ones) in each sampling unit, and the last cross-product term is the number of species occurring in both sampling units (shared species). Quantitative PCA with
One solution is to transform the data to equalize the sampling unit totals. The most often recommended method is Hellinger standardization. In vegan you can achieve this with:
gen.pca <- rda(decostand(data_rra, "hellinger"))
After this standardization the retained distances are called Hellinger distances, which can be written directly as √[2 − 2 Σ√(xy) / √(Σx Σy)] and no longer depend on the variable sums of squares. A similar transformation can also be used with 0/1 data.
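The Hellinger formula above can be checked numerically. This is a sketch in base R so it runs without vegan. The abundance table `x` is hypothetical, and the manual transformation `sqrt(x / rowSums(x))` is assumed to match what `decostand(x, "hellinger")` computes (square root of row-wise proportions).

```r
# Hypothetical abundance table: 3 sampling units x 4 taxa
x <- rbind(
  a = c(10, 0, 5, 5),
  b = c(1, 3, 0, 0),
  c = c(0, 20, 20, 40)
)

# Hellinger standardization by hand: square root of row proportions.
# With vegan installed, decostand(x, "hellinger") is the usual route.
hel <- sqrt(x / rowSums(x))

# Euclidean distances of the standardized data = Hellinger distances
d_hel <- as.matrix(dist(hel))

# Closed form: sqrt(2 - 2 * sum(sqrt(x * y)) / sqrt(sum(x) * sum(y)))
hellinger_formula <- function(u, v) {
  sqrt(2 - 2 * sum(sqrt(u * v)) / sqrt(sum(u) * sum(v)))
}
d_ab <- hellinger_formula(x["a", ], x["b", ])
print(all.equal(d_ab, d_hel["a", "b"]))

# The same transformation applied to 0/1 data
pa <- (x > 0) * 1
d_pa <- as.matrix(dist(sqrt(pa / rowSums(pa))))
d_pa_ab <- hellinger_formula(pa["a", ], pa["b", ])
print(all.equal(d_pa_ab, d_pa["a", "b"]))
```

For 0/1 data the term Σ√(xy) is simply the number of shared species, so the distance depends on the shared species relative to √(S₁S₂) rather than on the raw richness of each sampling unit.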
-
Hi all, I would like to perform a PCA with the function rda from the vegan package. I know that if you don't include any other variables, such as environmental variables, it works as a PCA. Until now I have worked with relative abundances of trnL sequences, but now I would like to know if it is possible to do it with presence/absence data. If so, should I pass any additional arguments to the function?
Thank you!