Dear author,
When I run an RDA analysis, not all of my environmental variables appear in the correlation plot: I use 42 environmental variables, but only 21 are shown. I have read the Q&A in issue #511, but I am still confused about aliased variables:
In my data, rda only considers the first 21 variables and ignores the rest. If I delete the first 21 variables and run the RDA again, it still considers the first 21 of the remaining variables. But these variables are not strongly correlated, for example UV and bio1-11.
**Thus, my questions are:**

1. If I want to draw the explanatory plot of all 42 variables (like the plots below), what should I do?
2. How can I tell which variable is aliased with which? Could I use just one of them to perform the RDA?
3. Aliased variables also turn up in RDA forward selection. If one variable is significant, is its alias significant too?
4. When I use the results of RDA to get the intersects of the results of LFMM, there are little intersects. Do you have any suggestions? (My workflow is: use forward selection to select environmental variables, run the RDA and LFMM analyses, then get the intersects.)
Thank you very much! I would really appreciate it if you could help me!
Looks like you have only ~24 samples, so no matter what you do you can't estimate the effects of 42 environmental variables. In statistical models like this, it is simply not possible to estimate more things than you have observations without some form of regularization, and there is no option for that in vegan (nor in any other software providing RDA in R; regularized versions do exist for canonical correlation analysis, for example, though not in vegan).
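To make the sample-size limit concrete, here is a minimal base R sketch. It uses simulated random data, not the poster's; the sizes `n = 24` and `p = 42` are simply taken from the numbers mentioned in this thread. After centering, a matrix with `n` rows cannot have rank above `n - 1`, however many columns it has.

```r
# Simulated illustration only: random data, not the poster's variables.
set.seed(1)
n <- 24   # number of samples (the rough figure guessed in this thread)
p <- 42   # number of environmental variables

X  <- matrix(rnorm(n * p), nrow = n, ncol = p)
Xc <- scale(X, center = TRUE, scale = FALSE)  # column-centre, as constrained ordination does

# Rank of the centred matrix: bounded above by n - 1, regardless of p
r <- qr(Xc)$rank
cat("columns:", p, " rank:", r, " upper bound (n - 1):", n - 1, "\n")
```

Any variables beyond that rank are necessarily linear combinations of the ones already in the model, which is why they end up aliased.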
Aliasing means that one or more of your variables is linearly dependent on one or more of the other environmental variables. It doesn't matter what the individual correlations between pairs of variables are; what matters is whether you can form one or more variables as a linear combination of the other variables. Perhaps once you know bio1 and bio6, bio5 provides no additional information, hence one of those three will become aliased.
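A small simulated sketch of that point, assuming vegan is installed. The variable names `x1` to `x6` and the response matrix `Y` are made up for illustration. Here `x6` is an exact sum of five independent variables, so its pairwise correlation with each of them is only moderate, yet it is perfectly determined by the set, and `alias()` should report it.

```r
library(vegan)

# Simulated illustration: x1..x6 and Y are hypothetical, not real data.
set.seed(42)
n <- 50
env <- as.data.frame(matrix(rnorm(n * 5), nrow = n, ncol = 5))
names(env) <- paste0("x", 1:5)

# x6 is an exact linear combination of the other five variables
env$x6 <- with(env, x1 + x2 + x3 + x4 + x5)

# Pairwise correlations between x6 and each of x1..x5 are well below 1
pair_cor <- cor(env$x6, env[, 1:5])
print(round(pair_cor, 2))

# Random response matrix standing in for community or genotype data
Y <- matrix(rnorm(n * 8), nrow = n, ncol = 8)

mod <- rda(Y ~ ., data = env)

# Names of the aliased (redundant) terms, and how they depend on the others
aliased <- alias(mod, names.only = TRUE)
print(aliased)
print(alias(mod))
```

Which member of a dependent set gets flagged depends on the order of the terms in the model, so reordering the variables changes which one is reported as aliased.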
I don't understand what you mean by

> When I use the results of RDA to get the intersects of the results of LFMM, there are little intersects
What does "intersects" mean? LFMM includes regularization in the form of ridge or lasso penalties, so it doesn't surprise me that it might be able to fit all your variables, for the reasons I mentioned above. But then you have the issue of interpreting the effects of those variables, as the estimates are biased by virtue of the regularization.
You'll also see the rank deficiency in rda() with forward selection, but only if one of the variables that is linearly dependent on others in the set is selected for inclusion. (Rank deficiency is what this is called: your matrix of environmental variables is not of full rank (42) but of some lower rank, and that is what rda() is telling you with the aliasing.) We don't recommend doing standard forward selection with inclusion determined by AIC or permutation p value. Instead, the advice is to fit the full model and proceed only if the omnibus test of that model is statistically significant. Then we use two stopping rules while doing the forward selection:
Does the adjusted $R^2$ of the current model exceed the adjusted $R^2$ of the full model?
Does the variable that improves the model most at this step make a statistically significant improvement to the model?
The problem you have is that you can't fit this full model because of the linear dependence.
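For a data set where the full model can be fitted, the procedure above can be sketched as follows. This is only an illustration on vegan's built-in `varespec` and `varechem` data, not on the poster's data. To my understanding, `ordiR2step()` applies both stopping rules (the adjusted $R^2$ of the full scope and the permutation p value); the 0.05 threshold and the permutation counts are arbitrary choices here.

```r
library(vegan)

# Built-in example data: 24 sites, 14 soil variables (fewer variables than sites)
data(varespec)
data(varechem)

full <- rda(varespec ~ ., data = varechem)  # full model with all variables
null <- rda(varespec ~ 1, data = varechem)  # intercept-only starting model

# Step 1: omnibus permutation test of the full model
set.seed(1)
global <- anova(full, permutations = 999)
print(global)

# Adjusted R^2 of the full model: the ceiling used by the first stopping rule
full_r2adj <- RsquareAdj(full)$adj.r.squared

# Step 2: forward selection, only if the omnibus test is significant
sel <- NULL
if (global[["Pr(>F)"]][1] < 0.05) {
  sel <- ordiR2step(null, scope = formula(full),
                    Pin = 0.05, R2scope = TRUE,
                    permutations = 999, trace = FALSE)
  print(sel$anova)  # variables added at each step
}
```

With more variables than samples, as in the question, the `full` model itself is rank deficient, so this ceiling cannot be estimated sensibly and the procedure breaks down at the first step.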
You could use envfit() to visualise the aliased variables.
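A rough sketch of that idea, again on vegan's built-in `varespec` and `varechem` data rather than the poster's. The choice of `N + P + K` as constraints is arbitrary and only there to give a fitted ordination; `envfit()` then fits all the variables in `varechem` onto it after the fact, including those not in the model.

```r
library(vegan)

data(varespec)
data(varechem)

# Ordination constrained by an arbitrary subset of variables (illustration only)
ord <- rda(varespec ~ N + P + K, data = varechem)

# Fit every environmental variable onto the ordination post hoc
set.seed(1)
fit <- envfit(ord, varechem, permutations = 999)
print(fit)

# Draw the site scores, then overlay the fitted arrows
plot(ord, display = "sites")
plot(fit)
```

The arrows drawn this way only describe how each variable relates to the ordination axes; they are not estimated effects from the RDA itself, so the limit on how many variables the model can contain is unchanged.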
This discussion was converted from issue #612 on December 13, 2023 16:20.
[Attached plots: all variables; variable order changed; bio12-19 deleted]