-
Notifications
You must be signed in to change notification settings - Fork 1
PyOpenSci REVIEW - config and download default directories #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PyOpenSci REVIEW - config and download default directories #51
Conversation
Codecov ReportBase: 97.50% // Head: 97.50% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## main #51 +/- ##
=======================================
Coverage 97.50% 97.50%
=======================================
Files 5 5
Lines 160 160
=======================================
Hits 156 156
Misses 4 4
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Purely on a user experience basis, adding the option might be good, but the default behavior should be changed IMO. |
Makes sense that the first contact experience should be smooth. Still not sure which default directory would guarantee that, i.e., we could select the current working directory as the default directory to download the PGAP database, however, I'm not sure that's the best place either. I personally like the PGAP database downloaded in the install directory, as I won't need to do anything with these files, Pynteny will take care of fetching them. On the other hand, storing the database in the installation directory is less prone to errors due to relocation of the database. If relocated, pynteny wouldn't know where to look to retrieve the HMM files within the PGAP database. The user would have to modify the config file to tell pynteny the new location. Btw, I forgot about it, but it was already possible to select a different directory with |
I don't have access to a computer right on, how much does the database weight again? If manageable, how do you feel about packaging the compressed file within the package itself? |
At least we should figure out a way to delete the database if the package is uninstalled from the environment |
i see the problem you are pointing to now. Config file and database remain in the environment after removing pynteny. This is definitely inconvenient. I'll relocate these files inside the package itself, hopefully, they will get removed when uninstalling. Will check this and let you know. Thanks... EDITED: Ok, they are not removed when uninstalling pynteny ("Would not remove (might be manually added):". Will try any of the other options, then |
Co-authored-by: Alex Batisse <[email protected]>
Co-authored-by: Alex Batisse <[email protected]>
45193e6
to
8215ff6
Compare
Ok, I might have figured something out in the end. Here is a summary of the possible approaches. Packaging the compressed file within the packagePros:
Cons:
Adding empty files with the same namesSo I thought of something else: to "cheat" Python into uninstalling the file, we can add empty files with the database + metadata names to the package. The download subcommand would then replace those files inside the virtual environment. Pros:
Cons:
Using a single, documented pathSomething like Pros:
Cons:
Do you have a preference among those three propositions? |
Hi @Batalex, thanks for these suggestions, these are very creative, particularly the one cheating Python! so, I have given it some thought. While I like the idea of installing the database within the package, I think the best and easiest option is to download the database somewhere else and make download location a required argument, so the user selects where to download it. The documentation will include a note clarifying that the database cannot be relocated. At the end this HMM database is optional. This way it is the user who is responsible for its deletion once used. Will make the adjustments to the code in the following days. |
You're welcome. I will update the review comment to check what has been done since then! |
This PR addresses:
Addressing reviewer @Batalex comments in PyOpenSci:
Right. Regarding the config file, I think I should perhaps rename it to avoid confusion. It is not really intended to be modified by the user (like a proper config file). Rather, it's been used to store session information, such as whether or not the PGAP database has already been downloaded as well as the path to PGAP. This way, I find it ok to store this file within the installation directory since the user will not interact with the file.
Regarding the default dir in which to download the PGAG database, what's a better alternative to the installation directory for the default path? The user can still select a different location via the flag
pynteny download --outdir
. The advantage I see placing the PGAP's database in the installation directory is that it can be easily removed when uninstalling pynteny with conda. And I don't see a clear disadvantage in placing the PGAP database in the install directory by default. Could you elaborate on why it is inconvenient?I find
python-wget
useful and very easy to use (e.g download file and show a progress bar with a single methoddownload
. However, it is true that it hasn't been updated since 2015.I'm getting an error with
requests
it seems like the file doesn't get downloaded in its totality. I have tried httpx and it gives me this error:
I think we can stick with
python-wget
for now, it works well. I think we can worry about it once it stops working.Not really, changed!