Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate from requests ➡️ httpx #312

Merged
merged 5 commits into from
Dec 19, 2023
Merged

Migrate from requests ➡️ httpx #312

merged 5 commits into from
Dec 19, 2023

Conversation

teutoburg
Copy link
Contributor

@teutoburg teutoburg commented Dec 17, 2023

The httpx package seems to handle web requests better and is easier to work with than the previously requests package. So let's use this instead, as already done in another of our packages.

API changes

  • Moved server.database.ServerError ➡️ server.download_utils.ServerError.
  • Moved server.database.get_server_folder_contents() ➡️ server.download_utils.get_server_folder_contents() and add client argument.
  • Add optional client argument to server.database.crawl_server_dirs().
  • Add optional client argument to server.database.get_all_package_versions().
  • Add client argument to server.database.get_package_folders().
  • Add client argument to server.database.get_server_folder_package_names().
  • Add client argument to server.database._download_single_package().
  • Refactor server.database.get_all_packages_on_server() to simply wrap server.database.crawl_server_dirs().
  • Remove from_cache argument from server.database.download_packages().
  • Modify server.download_utils.handle_download() and add server.download_utils.send_get() to better fit the new backend.

On the client

As stated in the documentation, using a httpx.Client (similar in concept and function to requests.Session) has several performance benefits. Actually keeping the (context-managed) client instance around during one task will ensure we can actually make proper use of this. Best example: download_packages makes two separate requests to the server, first to get the version and location information for the requested packages and second to perform the actual download. This used to be done with separate sessions (or none at all), while it is now done using a single client. This already has a notable performance boost, especially for slower networks. Any function that could benefit from this now has a (usually optional) client argument, as mentioned above. In the cases where it is optional (mostly functions that could be called from outside as well, e.g. to just list versions of packages on the server), if no client is given, a new one will be created for the scope of the function. This ensures that we can get both: optimal performance when the function is called from somewhere that already has a client, and easy standalone use without having to manually create a client first.

Other changes

  • Simplified the GitHub download functions.
  • Centralized exception handling for requests.
  • Re-included some tests that were previously skipped because they failed to often. Let's see if that's still the case now!
  • Various small improvements of docstrings, or adding ones where there were none.

Further considerations

Tip

Unlike requests, the httpx package is built from the ground up to support async functionality. While not currently used, it should be relatively straightforward (as much as any async stuff in Python ever is) to do that.

Note

A number of our current users are experiencing troubles with the download from the IRDB server (while the GitHub-based download does not appear to have the same issues). I would be interested to see if these changes solve that.

Note

As this includes substantial enough changes to the public API (most notably the change to server.database.download_packages()), this should mandate a new minor version (v0.8.0) in my opinion.

@teutoburg teutoburg self-assigned this Dec 17, 2023
@teutoburg teutoburg added dependencies Related to or updating any dependencies refactor Implementation improvement tests Related to unit or integration tests API How users interact with the software labels Dec 17, 2023
Copy link

codecov bot commented Dec 17, 2023

Codecov Report

Attention: 26 lines in your changes are missing coverage. Please review.

Comparison is base (8c4a18b) 80.32% compared to head (e65e922) 80.77%.

Files Patch % Lines
scopesim/server/download_utils.py 76.47% 12 Missing ⚠️
scopesim/server/database.py 81.35% 11 Missing ⚠️
scopesim/utils.py 33.33% 2 Missing ⚠️
scopesim/server/example_data_utils.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff               @@
##           dev_master     #312      +/-   ##
==============================================
+ Coverage       80.32%   80.77%   +0.45%     
==============================================
  Files             148      148              
  Lines           14902    14867      -35     
==============================================
+ Hits            11970    12009      +39     
+ Misses           2932     2858      -74     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@teutoburg teutoburg marked this pull request as ready for review December 19, 2023 00:00
Copy link
Collaborator

@hugobuddel hugobuddel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about httpx, seems like the successor to requests that everyone agrees upon.

Sad to see requests go, it was such a relief when requests was released and it feels like it started a wave of great Python libraries. I like the "Readme Driven Development" model. But no async support is pretty bad. Time to move on.

I noticed you removed all the retry functionality. Are we sure this is not necessary anymore? Let's assume it is indeed fine, and we can add it back later if necessary.

Let's see how this goes!

@teutoburg
Copy link
Contributor Author

I noticed you removed all the retry functionality. Are we sure this is not necessary anymore? Let's assume it is indeed fine, and we can add it back later if necessary.

Retry is still done: https://github.com/AstarVienna/ScopeSim/pull/312/files#diff-6a1f75ec7a32e1ba7c5cb73ec6c29b040624cc828fff8361be76c0113d1dbae1R42

It works a bit differently than in requests, and seems somewhat less configurable (or maybe it isn't, but the documentation is somewhat lacking in that regard). But than again, I don't think we really actually needed all that configurability anyway...

@teutoburg teutoburg merged commit 00ec419 into dev_master Dec 19, 2023
22 checks passed
@teutoburg teutoburg deleted the fh/httpx branch December 19, 2023 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API How users interact with the software dependencies Related to or updating any dependencies refactor Implementation improvement tests Related to unit or integration tests
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants