Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Random-Effects-Likelihood Method #30

Draft
wants to merge 316 commits into
base: master
Choose a base branch
from
Draft

Implement Random-Effects-Likelihood Method #30

wants to merge 316 commits into from

Conversation

jon-mah
Copy link

@jon-mah jon-mah commented May 30, 2020

Implement a new Random-Effects-Likelihood (REL)-like method as described in Nielsen and Yang (1998). This is an alternative method for identifying sites under diversifying selection as opposed to the FEL-like method which we call --omegabysite. This method can be called using the --empirical_bayes flag.

Added documentation in .rst format for the above-mentioned REL method.

Added a new method for simulation which explicitly relies on phydms models rather than pyvolve.

skhilton and others added 30 commits November 27, 2018 12:09
@jbloom
Copy link
Member

jbloom commented May 30, 2020

@jon-mah @skhilton, is this pull request that I should review now?

@skhilton
Copy link
Collaborator

@jbloom: yes, that would be great!

@jbloom
Copy link
Member

jbloom commented May 30, 2020

@jon-mah @skhilton: The documentation for the --empirical_bayes approach is inadequate. Build the docs and look at them for the phydms program (equivalent of this on the branch).

It doesn't answer the following:

  • Does the --empirical_bayes option only have a meaning if --omegabysite is already activated?

  • Specifying the number of categories by --empirical_bayes is terrible. The average user will have no idea how many categories to specify. Instead, you should have a separate argument for the number of categories with a reasonable default. So this separates the choice of whether to use REL or FEL approach from the number of categories to use. See how the current implementation separates --gammaomega (which is an option user is likely to use) and --ncats (which has a reasonable default most users will probably never understand).

  • What output files are produced if --empirical_bayes is used, and how do those differ from if --omegabysite is used? How does one interpret the output of those files?

  • Can a user run both the REL and FEL at the same time?

  • Furthermore, --empirical_bayes is probably a poor name for REL-based detection of positive selection. Empirical Bayes can mean a lot of things in a lot of contexts in phylogenetics, here we specifically mean for site-specific omega values so the argument names should somehow indicate that.

Just tag me when these are fixed and I'll continue reviewing the rest of the request.

@jbloom
Copy link
Member

jbloom commented May 30, 2020

@jon-mah @skhilton: Other things that appear to be problems:

  • The REL approach should output posterior probability omega_r < 1 and > 1, not just > 1. Sometimes people are interested in purifying selection.

  • Another thing that occurs to me, what happens if someone tries to use REL without specifying --gammaomega. Do they get an error that nicely explains to them that this is not allowed? Do they just not get the REL posterior probabilities? Does it automatically turn on --gammaomega? In particular, in the context of phydms_comprehensive.

@skhilton skhilton marked this pull request as draft June 16, 2020 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants