-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: omit read_session
with latest google-cloud-bigquery-storage
#748
Conversation
`read_session` is unnecessary as of `google-cloud-bigquery-storage>=2.6.0`. This will allow us to more loudly deprecate the use of `rows(read_session)`. Rather than require 2.6.0, version switches will allow us to keep our requirements range wider. Will want to give this version some time to bake before making it required.
google/cloud/bigquery/_helpers.py
Outdated
if installed_version >= _BQ_STORAGE_OPTIONAL_READ_SESSION_VERSION: | ||
_IS_BQ_STORAGE_READ_SESSION_OPTIONAL = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could potentially be tricky, because the value of the _IS_BQ_STORAGE_READ_SESSION_OPTIONAL
"constant" depends on whether _verify_bq_storage_version()
has been called or not.
It would IMO be more straightforward if we performed version detection at import time and already make the final decision on the value of _IS_BQ_STORAGE_READ_SESSION_OPTIONAL
then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't import google.cloud.bigquery_storage
in _download_table_bqstorage_stream
. Since that function gets a client and other objects passed in to it.
How about we make sure we call _verify_bq_storage_version()
from _download_table_bqstorage_stream()
and optimize to not call the version parsing too often?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overhead of parsing a version is most likely negligible compared to the request/response times, I was more concerned that changing a module "constant" might represent a landmine in the "right" constellation of the planets.
Perhaps just adding a comment that this private "constant" is not a constant and can change when calling _verify_bq_storage_version()
would be enough? You know, just to warn the readers about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think there's a way to do this without "constants" that aren't actually constant. I think maybe I'll create a little class to hold our version comparisons.
Also, use packaging directly, since that's all pkg_resources does https://github.com/pypa/setuptools/blob/a4dbe3457d89cf67ee3aa571fdb149e6eb544e88/pkg_resources/__init__.py\#L112
@@ -26,7 +26,7 @@ | |||
from google.cloud._helpers import _RFC3339_MICROS | |||
from google.cloud._helpers import _RFC3339_NO_FRACTION | |||
from google.cloud._helpers import _to_bytes | |||
import pkg_resources | |||
import packaging.version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to add type annotations, but it turns out pkg_resources
is just a thin wrapper around packaging
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and clean!
Consider maybe adding a few unit tests directly testing the BQStorageVersions
class, although it does seem that the class is tested indirectly through other tests (the coverage does not complain).
Edit: I mean, the coverage check does complain, but not about _helpers.BQStorageVersions
.
google/cloud/bigquery/_helpers.py
Outdated
class BQStorageVersions: | ||
def __init__(self): | ||
self._installed_version = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version checker, sir, is classy (pun intended). Well done!
google/cloud/bigquery/_helpers.py
Outdated
should thus be used in places where this assumption holds. | ||
|
||
Because `pip` can install an outdated version of this extra despite the constraints | ||
in setup.py, the the calling code can use this helper to verify the version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I now noticed you copied my typo, let's remove it.
in setup.py, the the calling code can use this helper to verify the version | |
in `setup.py`, the calling code can use this helper to verify the version |
Turns out the coverage error was due to my use of a |
read_session
is unnecessary as ofgoogle-cloud-bigquery-storage>=2.6.0
.This will allow us to more loudly deprecate the use of
rows(read_session)
. googleapis/python-bigquery-storage#229Rather than require 2.6.0, version switches will allow us to keep our
requirements range wider. Will want to give this version some time to bake
before making it the minimum version. #747
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #742 🦕