Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC for hyperscan usage in UrlDispatcher #9907

Draft
wants to merge 37 commits into
base: master
Choose a base branch
from
Draft

POC for hyperscan usage in UrlDispatcher #9907

wants to merge 37 commits into from

Conversation

asvetlov
Copy link
Member

@asvetlov asvetlov commented Nov 15, 2024

https://pypi.org/project/hyperscan/ could speed up url dispatcher if available.

Fallback to the current implementation is supported if hyperscan compilation fails or not available.

@asvetlov asvetlov requested review from bdraco and Dreamsorcerer and removed request for webknjaz November 15, 2024 12:36
Copy link

codspeed-hq bot commented Nov 15, 2024

CodSpeed Performance Report

Merging #9907 will degrade performances by 27.2%

Comparing hyperscan (c82a824) with master (be31bed)

Summary

⚡ 3 improvements
❌ 3 regressions
✅ 37 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark master hyperscan Change
test_resolve_dynamic_resource_url_with_many_dynamic_routes[pyloop] 3.7 ms 5.1 ms -27.2%
test_resolve_dynamic_resource_url_with_many_dynamic_routes_with_common_prefix[pyloop] 258.8 ms 5 ms ×52
test_resolve_dynamic_resource_url_with_many_static_routes[pyloop] 3.7 ms 5 ms -25.87%
test_resolve_gitapi[pyloop] 315 ms 5.8 ms ×55
test_resolve_gitapi_subapps[pyloop] 318.8 ms 7.9 ms ×40
test_resolve_static_root_route[pyloop] 1.1 ms 1.2 ms -11.63%

@@ -1242,8 +1301,52 @@ def freeze(self) -> None:
super().freeze()
for resource in self._resources:
resource.freeze()
if HAS_HYPERSCAN:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested this yet but I expect it will break downstream Home Assistant since there is code there to register new routes as new integrations get loaded after startup. It's pretty clear the intent here was to not allow that after freeze but that's currently the downstream use case and hard to change it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.
We can rebuild hypedscan db in UrlDispatcher.register_resource() if the dispatcher was frozen.
I'm curious how Home Assistant works with route table on the fly.
If it adds new routes to started web server it should remove them at some point, isn't it?

Before moving forward, I'd like to add some benchmarks first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About to go to the airport, but will dig up the Home Assistant code this weekend. It predates me though so will take a bit to figure it out.

We don't have any benchmarks for the url dispatcher. I'll look at downstream use cases and come up with some after I get back from travel next week.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you safe flight!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once I figure out how the downstream code works for adding routes at run time. I'll see if I can make that into meaningful test cases here as well so we don't break anything. AFAICT for a cursory review, routes are never removed when integrations are unloaded downstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Freezing and unfreezing on enter and exit will likely be too many rebuilds. ~297 routes get added at startup for Home Assistant. ~200 can be grouped together which would mean 200 rebuilds at startup

Copy link
Member

@bdraco bdraco Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another method could be to keep adding to the fallback dict as new routes are added, clear the hyperscan db when a route is added after freeze, schedule a timer handle to rebuild the db a few (60... maybe less) seconds later if one is not already scheduled

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~200 can be grouped together which would mean 200 rebuilds at startup

If they're grouped together, shouldn't that be 1 rebuild?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify: Very few can be grouped together. The result is 200 groups of various sizes, mostly 1

Copy link
Member

@bdraco bdraco Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distribution of groupings looks something like 1, 9, 1, 7, 3, 3, 1, 1, 1, 1, 1 .....

However order is indeterminate as it relies on async task completion order

setup.cfg Outdated Show resolved Hide resolved
Co-authored-by: Sam Bull <[email protected]>
@bdraco
Copy link
Member

bdraco commented Nov 15, 2024

Managed to finish the benchmarks before boarding the flight. They should run if you update with master.

Didn't have enough time to dig through downstream or write the tests yet though

@asvetlov
Copy link
Member Author

Thank you!

setup.cfg Outdated Show resolved Hide resolved
Copy link
Member

@Dreamsorcerer Dreamsorcerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Seems we need a couple more tests for missing coverage though.

@bdraco
Copy link
Member

bdraco commented Nov 17, 2024

Just realized Hyperscan likely doesn't work on armv7l so maybe we need benchmarks for with Hyperscan not available as well. Maybe not though since armv7l is now slowly disappearing from the landscape now that the RPi shortage is over

@asvetlov
Copy link
Member Author

Just realized Hyperscan likely doesn't work on armv7l so maybe we need benchmarks for with Hyperscan not available as well. Maybe not though since armv7l is now slowly disappearing from the landscape now that the RPi shortage is over

Perhaps parametrized benchmarks will help to measure both hyperscan and domestic versions.
Also I'm planing to use parametrization for existing url_dispatcher tests (and maybe web_functional ones) for tests coverage and correctness checking.

@bdraco
Copy link
Member

bdraco commented Nov 17, 2024

From the flame graph on cod speed it looks like test_resolve_gitapi_subapps is producing system routes (maybe 404s)

@asvetlov
Copy link
Member Author

asvetlov commented Nov 17, 2024

Good finding, thanks!
I'll modify benchmarks to check the resolved route to prevent this situation from becoming possible.

I guess an assertion will not hurt or significantly change timings.

bdraco added a commit to home-assistant/core that referenced this pull request Nov 17, 2024
aiohttp is working on integrating hyperscan to improve
url dispatch performance
aio-libs/aiohttp#9907
@asvetlov asvetlov marked this pull request as draft November 17, 2024 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants