-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add transformer for JSON strings #7
base: main
Are you sure you want to change the base?
Conversation
This reverts commit 32b9010. This change is somewhat nuclear option for existing snapshots, will be implemented as a separate transformer first.
@pytest.mark.parametrize( | ||
"input_value,transformed_value", | ||
[ | ||
pytest.param('{"a": "b"}', {"a": "b"}, id="simple_json_object"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: add test case for empty value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for thinking about a backward-compatible way of adding such a neat feature, making lists of JSON-encoded snapshots much easier to handle 👏 .
As discussed, the main open part is handling the special edge case of empty strings. Otherwise, I added some suggestions and ideas for future iterations 📈
Idea 💡: One of the disadvantages of automatic parsing is that we are loosing structural parity information, we can:
- a) highlight this in our snapshot testing guidelines
- b) implement a technical solution at some point such as using some kind of metadata to store the raw value or structural information (e.g.,
key__META__
: {"json": True})
Sidenotes (as discussed and partly related):
- Clarify dev installation instructions with troubleshooting for the weird editable install issue (separate quick docs PR)
- Improve Utility inheritance structure such that LocalStack Core only adds AWS-specific transformer utils (e.g., lambda_api) and utils such as
sorting
are implemented in the snapshot library
@pytest.mark.parametrize( | ||
"input_value,transformed_value", | ||
[ | ||
pytest.param('{"a": "b"}', {"a": "b"}, id="simple_json_object"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great parametrization coverage 💯
I like the inline id
within pytest.param
rather having a disconnected id list
class JsonStringTransformer: | ||
""" | ||
Parses JSON string at key. | ||
Additionally, attempts to parse any JSON strings inside the parsed JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs: Are there any limitations (e.g., we're losing parity information that the original response is actually JSON-encoded and should have some tests covering that)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could provide a simple example (e.g., the first test case) to clarify usage and foster easy adoption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, highlighting the difference to the default behavior might be most relevant, clarifying that it adds extra coverage for JSON lists starting with [
such as [{},{}]
json_value = json.loads(input_data) | ||
input_data = self._transform_nested(json_value) | ||
except JSONDecodeError: | ||
SNAPSHOT_LOGGER.debug( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good log level choice given nested parsing is best effort and we might encounter false positives here 👏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we think it's worth logging here rather than continuing silently similar to the top-level behavior here:
pass # parsing error can be ignored |
if k == self.key and isinstance(v, str): | ||
try: | ||
SNAPSHOT_LOGGER.debug(f"Replacing string value of {k} with parsed JSON") | ||
json_value = json.loads(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed, we're missing a test and fix for empty strings here ""
to avoid verbose exception logging
|
||
key: str | ||
|
||
def __init__(self, key: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea: To facilitate an easy transition, it could be nice to have an easy way to apply this transformer globally, for example, when not providing a key.
sidenote: it ultimately comes down to separating anchors (e.g., JSON path selectors) from actual transformer functionality, but that's far out of scope :)
👍 I tested this change against the localstack-ext PR https://github.com/localstack/localstack-ext/pull/3971 with the following changes (not necessary anymore once the Utility re-use hierarchy is fixed) and can confirm it works as expected (apart from the minor empty string stack trace): snapshot.add_transformer(
- snapshot.transform.sorting(
+ SortingTransformer(
"events",
lambda event: (event["message"]["executionId"], event["message"]["messageType"]),
)
)
- snapshot.add_transformer(snapshot.transform.json_string("payload"))
+ # snapshot.add_transformer(snapshot.transform.json_string("payload"))
+ snapshot.add_transformer(JsonStringTransformer("payload")) ✅ There is no regression (identity check) using this transformer via |
Motivation
Some services response contains JSON strings within JSON strings, e.g. Cloudwatch Logs for EventBridge Pipes events.
We do have first level JSON strings parsed but we don't go recursively into parsed JSONs. This leads to the need of extracting and parsing such payloads separately in the test because otherwise only regex transformers can be applied to such string. Ideally, snapshot library should take care of this. See corresponding discussion (thanks @gregfurman for inspiration!).
Changes
Starting to parse nested JSONs in common transformation flow is a breaking change for existing recorded snapshots. This PR proposes non-breaking approach by introducing a separate transformer that can be applied on demand. This draft shows a usage example: https://github.com/localstack/localstack-ext/pull/3971