Bundle Analysis: associate past assets to current parsed bundle #231
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #231 +/- ##
==========================================
- Coverage 89.52% 89.48% -0.04%
==========================================
Files 328 324 -4
Lines 10480 10375 -105
Branches 1915 1904 -11
==========================================
- Hits 9382 9284 -98
+ Misses 1025 1020 -5
+ Partials 73 71 -2
I personally think that associate_previous_assets is quite complex as it is. I'd encourage you to refactor it by breaking up parts of the big loop into helper functions.
That would make it easier to read and to test. But the comments and docstrings are quite helpful, thanks for that.
Otherwise LGTM... left some other comments.
shared/bundle_analysis/report.py
def associate_previous_assets(self, prev_bundle_analysis_report: Any) -> None:
    """
    Note: prev_bundle_analysis_report is of type BundleAnalysisReport,
    typing.Self is not available in 3.10
I think you can write "BundleAnalysisReport" (with the double quotes) and still have the type recognized; I've seen that somewhere in the code.
In any case, shared requires at least Python 3.11, so maybe typing.Self is available there? (See line 22 in fbeae69: python_requires=">=3.11",)
Oh interesting I didn't know about the double quote trick!
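For readers who haven't seen the double quote trick, here is a minimal sketch of it. The class below is a toy stand-in that only borrows the names from this thread; the label and previous_label attributes are made up for illustration and are not part of the real BundleAnalysisReport.

```python
class BundleAnalysisReport:
    """Toy stand-in for the real class; attributes here are illustrative only."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.previous_label = None

    # The class name is not bound yet while the class body is executing, so the
    # annotation is written as a string (a forward reference). Type checkers
    # resolve the string to the class itself.
    def associate_previous_assets(
        self, prev_bundle_analysis_report: "BundleAnalysisReport"
    ) -> None:
        self.previous_label = prev_bundle_analysis_report.label


# At runtime the annotation is stored as the plain string.
annotation = BundleAnalysisReport.associate_previous_assets.__annotations__[
    "prev_bundle_analysis_report"
]

prev_report = BundleAnalysisReport("previous commit")
curr_report = BundleAnalysisReport("current commit")
curr_report.associate_previous_assets(prev_report)
```

On Python 3.11 and later, typing.Self can be imported and used in place of the quoted name.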
def test_asset_association():
    try:
Why have a try: without an except?
"To have the finally block", but how is that different from putting the cleanup at the end of the try block?
This part basically simulates what the worker does when it processes a bundle file, except that in the worker the except part actually handles errors (for example by retrying) before it enters finally. In the tests there's really nothing to do when an exception occurs, so we skip that and go straight to finally. Otherwise the code would be:
try:
    # do things
    cleanup()
except:
    cleanup()
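For comparison, a minimal sketch of the try/finally form, showing that the cleanup runs both when the body succeeds and when it raises, without writing the cleanup call twice. The helper name run_with_cleanup and the list used as a cleanup log are hypothetical and only there to make the behaviour observable.

```python
def run_with_cleanup(action, cleanup_log):
    """Run action() and always record a cleanup, even if action() raises."""
    try:
        return action()
    finally:
        # Executes on normal return and on exception; an exception still
        # propagates to the caller after this block has run.
        cleanup_log.append("cleaned up")


def failing_action():
    raise ValueError("simulated failure while processing a bundle file")


log = []
result = run_with_cleanup(lambda: 42, log)

try:
    run_with_cleanup(failing_action, log)
    raised = False
except ValueError:
    raised = True
```

With cleanup placed only at the end of the try block, the second call would skip it, because the exception leaves the block before that line is reached.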
Separated out the 2 rules as 2 separate helper functions
shared/bundle_analysis/report.py
@@ -170,6 +171,104 @@ def ingest(self, path: str) -> int:
        self.db_session.commit()
        return session_id

    def _associate_bundle_report_assets_by_name(
        self, curr_bundle_report, prev_bundle_report
    ) -> Set:
[very nit] You can be more explicit by saying Set[Tuple], or even more explicit with Set[Tuple[str, str]]. Starts to get a bit wild fast but possible. I personally think Set[Tuple] is an improvement over Set.
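A small sketch of what the more explicit annotation looks like in practice. The function name, its dictionary parameters (asset name mapped to UUID), and the meaning of the tuple as (previous UUID, current UUID) are all assumptions for illustration; the thread does not show what the real helper stores in its set.

```python
from typing import Dict, Set, Tuple


def pair_assets_by_name(
    curr_assets: Dict[str, str], prev_assets: Dict[str, str]
) -> Set[Tuple[str, str]]:
    """Return (prev_uuid, curr_uuid) pairs for names present in both mappings."""
    pairs: Set[Tuple[str, str]] = set()
    for name, curr_uuid in curr_assets.items():
        prev_uuid = prev_assets.get(name)
        if prev_uuid is not None:
            pairs.add((prev_uuid, curr_uuid))
    return pairs


pairs = pair_assets_by_name(
    {"main-abc.js": "curr-1", "extra-999.js": "curr-2"},
    {"main-abc.js": "prev-1"},
)
```

With Set[Tuple[str, str]] a type checker can flag a caller that unpacks three values or treats an element as an int, which a bare Set would let through.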
shared/bundle_analysis/report.py
@@ -170,6 +171,104 @@ def ingest(self, path: str) -> int:
        self.db_session.commit()
        return session_id

    def _associate_bundle_report_assets_by_name(
        self, curr_bundle_report, prev_bundle_report
The args don't have a known type?
associated_assets_found = set()

# Rule 1 check
associated_assets_found |= (
Even the code is going "|=" (i.e. 😐)
(Jokes aside I do think this is an improvement, thanks)
Asset names change from one bundler build to another across commits, so we can't tell whether the current asset is a new one or a continuation of a previous one. This PR adds a function that offers a heuristic for associating the current asset with an asset from the previous commit.
There are two rules. If the hashed asset name exists in the previous bundle report, it is considered the same asset. Similarly, if all the modules of the asset are the same as those of any asset in the previous bundle report, it is also considered the same asset. We track assets through a generated UUID; when two assets are considered associated, they share the same UUID.
This mechanism is important for providing analytics on an asset's size trend throughout the course of its existence. That component will be implemented in coming iterations.
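The two rules can be sketched roughly as follows. This is not the code from the PR: the Asset dataclass, the helper names, and the choice to let a name match take precedence over a module match are assumptions made to keep the example small and deterministic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass
class Asset:
    """Hypothetical in-memory asset; the real report stores assets in a database."""
    hashed_name: str
    modules: FrozenSet[str]
    uuid: str


def match_by_name(curr: List[Asset], prev: List[Asset]) -> Set[Tuple[str, str]]:
    # Rule 1: same hashed asset name in the previous bundle report.
    prev_by_name = {a.hashed_name: a.uuid for a in prev}
    return {
        (prev_by_name[a.hashed_name], a.uuid)
        for a in curr
        if a.hashed_name in prev_by_name
    }


def match_by_modules(curr: List[Asset], prev: List[Asset]) -> Set[Tuple[str, str]]:
    # Rule 2: exactly the same set of modules as some previous asset.
    prev_by_modules = {a.modules: a.uuid for a in prev if a.modules}
    return {
        (prev_by_modules[a.modules], a.uuid)
        for a in curr
        if a.modules in prev_by_modules
    }


def associate_previous_assets(curr: List[Asset], prev: List[Asset]) -> None:
    """Give each matched current asset the UUID of its previous counterpart."""
    found = match_by_name(curr, prev)
    already_matched = {curr_uuid for _, curr_uuid in found}
    # Assumption: rule 2 only applies to assets that rule 1 did not match.
    found |= {
        pair for pair in match_by_modules(curr, prev)
        if pair[1] not in already_matched
    }
    curr_by_uuid = {a.uuid: a for a in curr}
    for prev_uuid, curr_uuid in found:
        curr_by_uuid[curr_uuid].uuid = prev_uuid


prev_assets = [
    Asset("main-abc.js", frozenset({"a.js", "b.js"}), "prev-1"),
    Asset("vendor-111.js", frozenset({"react.js"}), "prev-2"),
]
curr_assets = [
    Asset("main-abc.js", frozenset({"a.js", "c.js"}), "new-1"),  # rule 1
    Asset("vendor-222.js", frozenset({"react.js"}), "new-2"),    # rule 2
    Asset("chunk-333.js", frozenset({"x.js"}), "new-3"),         # no match
]
associate_previous_assets(curr_assets, prev_assets)
```

In this example the first asset keeps its identity through the unchanged name, the second through its unchanged module set, and the third keeps its freshly generated UUID because neither rule applies.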
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.