-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: build backend and frontend check #47
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I like the approach, it's basically what I was envisioning. However, my main concern is that you'll need to repeat this operation in every (ee) class. Why not creating an intermediate class that will handle this repetitive task??
A tentative mock could be
class EEDataRegressionFixture(DataRegressionFixture):
def check(
self,
data_object: ee.ComputedObject,
basename: Optional[str] = None,
fullpath: Optional[os.PathLike] = None,
**kwargs
):
"""Check the given ee object against a previously recorded version, or generate a new file.
Parameters:
data_list: The list to check.
basename: The basename of the file to test/record. If not given the name of the test is used.
fullpath: complete path to use as a reference file. This option will ignore ``datadir`` fixture when reading *expected* files but will still use it to write *obtained* files. Useful if a reference file is located in the session data dir for example.
kwargs: params to pass to subclass check.
"""
# build the different filename to be consistent between our 3 checks
name = build_fullpath(
self.original_datadir, self.request, "", basename, fullpath, self.with_test_class_names
)
serialized_name = name.with_stem(f"backend_{name.name}").with_suffix(".yml")
data_name = name.with_stem(f"frontend_{name.name}").with_suffix(".yml")
# check the previously registered serialized call from GEE. If it matches the current call,
# we don't need to check the data
serialized = data_object.serialize()
with suppress(BaseException):
super().check(serialized, fullpath=serialized_name)
return
# if it needs to be checked, we need to round the float values to the same precision as the
# reference file
data = self.check_backend(data_object, **kwargs)
try:
super().check(data, fullpath=data_name)
# IF we are here it means the data has been modified so we edit the API call accordingly
# to make sure next run will not be forced to call the API for a response.
serialized_name.unlink(missing_ok=True)
with suppress(BaseException):
super().check(serialized, fullpath=serialized_name)
except BaseException as e:
raise e
class ListFixture(EEDataRegressionFixture):
"""Fixture for regression testing of :py:class:`ee.List`."""
def check_backend(
self,
data_list: ee.List,
prescision: int = 6,
):
return round_data(data_list.getInfo(), prescision)
Because every ee class is performing a different type of check and thus requires a slightly diffrent handle in between the 2 serialize checks. |
Yes, I got that, but that's why I am proposing the |
This PR is blocked by ESSS/pytest-regressions#184 |
Fix #12
This PR is implementing a more sophisticated testing mechanism to avoid executing the GEE API all the time.
As mentioned by @schwehr and @naschmitz,
serialize()
is the best way to get a robust image of the call to the API.The workflow implemented in
list_regression.py
is the following:backend
) without raising anything. If no failure is raised then it stops there (no real API request)The advantage of this method is that it bypass the actual result if the query is always the same (trusting the Google team) but if it differs we actually verify that the result is maintained. The idea being that the user of the lib should not care about the serialized version of the query, reason why it's always run in a
supress
context manager.Happy to read your comments.
TODO: