Clarify what blockCol is #48

Open
lwaldron opened this issue Jul 2, 2024 · 2 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@lwaldron
Member

lwaldron commented Jul 2, 2024

@asyakhl what is the historical reason for the naming of the lefser() blockCol argument? This name, and the help for the argument:

character(1) Optional column name in colData(relab) indicating the blocks, usually a factor with two levels (e.g., c("adult", "senior"); default NULL).

implies that it is a blocking variable (i.e., https://en.wikipedia.org/wiki/Blocking_(statistics) ). However, in LEfSe, the two grouping variables define main groups and subgroups for pairwise comparisons, not a blocking variable. Also, why would subgroups usually have two levels?

I think this confuses users who try to use it to define blocking variables (i.e., see `blockCol = "patient"` in #47). I'm hesitant to rename the argument and break existing code, but we should make it much clearer in the documentation how this argument is used and that it doesn't refer to a blocking variable in the statistical sense.
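
To make the distinction concrete, here is a minimal sketch of how main groups and subgroups give rise to pairwise comparisons. It is written in Python only because the original LEfSe is a Python program. It is not the lefser or LEfSe implementation; the function name `subclass_comparisons` and the sample labels are hypothetical, and it assumes exactly two main groups.

```python
from itertools import product


def subclass_comparisons(samples):
    """List every subgroup-vs-subgroup comparison across two main groups.

    `samples` is a list of (group, subgroup) tuples, one per sample.
    Hypothetical helper for illustration only; not part of lefser or LEfSe.
    """
    subgroups_by_group = {}
    for group, subgroup in samples:
        subgroups_by_group.setdefault(group, set()).add(subgroup)
    if len(subgroups_by_group) != 2:
        raise ValueError("this sketch assumes exactly two main groups")
    # Sort for a deterministic order of groups and of subgroups.
    (group_a, subs_a), (group_b, subs_b) = sorted(subgroups_by_group.items())
    return [
        ((group_a, a), (group_b, b))
        for a, b in product(sorted(subs_a), sorted(subs_b))
    ]


# Assumed example labels: two main groups, each with two subgroups.
samples = [
    ("control", "adult"),
    ("control", "senior"),
    ("case", "adult"),
    ("case", "senior"),
]
pairs = subclass_comparisons(samples)
for left, right in pairs:
    print(left, "vs", right)
```

Every subgroup of one main group is compared against every subgroup of the other, so a column such as "patient" would be treated as just another set of subgroups to compare, not as a pairing or blocking factor.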

@lwaldron lwaldron added the documentation Improvements or additions to documentation label Jul 2, 2024
@lwaldron
Member Author

lwaldron commented Jul 2, 2024

By the way, I noted that the original Python LEfSe functions and manuscript refer to "class" and "subclass", not "group" and "block" (e.g. https://huttenhower.sph.harvard.edu/lefse/ for the Galaxy module and https://github.com/SegataLab/lefse/blob/master/lefse/lefse_run.py for the Python program). It may be worth going through a deprecation cycle to avoid this confusion.

@asyakhl
Collaborator

asyakhl commented Aug 4, 2024

@lwaldron I don't remember the historical reason for this naming. We should deprecate groupCol and blockCol and rename them to class and subclass to avoid confusion with a blocking variable. There can be more than two levels of subgroups! @shbrief I started a PR to clarify the man page for blockCol.

@shbrief shbrief added the enhancement New feature or request label Aug 19, 2024