Allow for returning a dataframe from gget.mutate, w/ more context #169

Open: 5 commits proposed for merging into main

austinv11

This PR retains two additional columns in the dataframe, mutation_start and mutation_end, which avoids the need to manually re-parse mutation strings later on.

Additionally, most of the extra information in the dataframe is currently lost in the Python API. Some of the operations performed on the dataframe are copy-on-write, so an input dataframe passed with update_df=True gains at most two columns: mutation_type and wt_sequence_full. I would like all of the additional context to be available in my dataframe.
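To illustrate the kind of loss described above, here is a minimal pandas sketch. It is not the actual gget code: the function `annotate_in_place` and the sample mutation strings are hypothetical, and it shows only one plausible way a caller's dataframe can miss columns (an operation that returns a new object instead of modifying the one passed in).

```python
import pandas as pd


def annotate_in_place(df: pd.DataFrame) -> None:
    """Hypothetical stand-in for a function that tries to update the caller's frame."""
    # Assigning a column on the passed-in object mutates it, so the caller sees this.
    df["mutation_type"] = "substitution"

    # merge() returns a NEW DataFrame. Rebinding the local name `df` to it
    # leaves the caller's object untouched, so these columns never reach the caller.
    positions = pd.DataFrame(
        {
            "mutation": ["c.2A>T", "c.5G>C"],
            "mutation_start": [2, 5],
            "mutation_end": [2, 5],
        }
    )
    df = df.merge(positions, on="mutation", how="left")


mutations = pd.DataFrame({"mutation": ["c.2A>T", "c.5G>C"]})
annotate_in_place(mutations)

# Only the column added before the merge is visible to the caller.
caller_columns = list(mutations.columns)
print(caller_columns)
```

Returning the final dataframe from the function avoids this problem, because the caller no longer depends on which internal operations happened to modify the original object.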

To address this, the function now returns a dataframe when out=None and update_df=True. In all other cases it keeps the previous behavior.
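The return rule could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: `mutate_sketch` is a hypothetical function, the position parsing handles only simple strings such as "c.2A>T", and the list returned in the fallback branch is a placeholder, since the thread does not spell out what the previous behavior returns.

```python
from typing import List, Optional, Union

import pandas as pd


def mutate_sketch(
    mutations: pd.DataFrame,
    out: Optional[str] = None,
    update_df: bool = False,
) -> Union[pd.DataFrame, List[str]]:
    """Hypothetical sketch of the proposed return rule (not the gget code)."""
    annotated = mutations.copy()

    # Illustrative parsing only: take the first integer after "c." as the position.
    # Real mutation strings (ranges, insertions, deletions) need more careful handling.
    position = annotated["mutation"].str.extract(r"c\.(\d+)", expand=False).astype(int)
    annotated["mutation_start"] = position
    annotated["mutation_end"] = position

    if out is None and update_df:
        # New behavior described in the PR: hand the annotated dataframe back.
        return annotated

    # Placeholder for the previous behavior (an assumption for this sketch):
    # optionally write a file, then return a plain list.
    if out is not None:
        annotated.to_csv(out, index=False)
    return annotated["mutation"].tolist()


df = pd.DataFrame({"mutation": ["c.2A>T", "c.5G>C"]})
with_context = mutate_sketch(df, out=None, update_df=True)
legacy = mutate_sketch(df)
```

With this shape, callers who want the extra columns opt in through update_df=True and out=None, and existing callers see no change.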

Let me know if you have any feedback!

josephrich98 (Collaborator) commented Dec 23, 2024

The new code looks great! For easier integration, could you please do the following:

  • pull the latest dev branch to your forked repo
  • merge your version of gget_mutate.py into dev
  • make a new PR into dev

Thank you!

josephrich98 changed the base branch from main to dev on December 23, 2024 at 23:09
josephrich98 changed the base branch from dev to main on December 23, 2024 at 23:11
2 participants