Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enabling L2+ Optimizations for EPs #23517
base: main
Are you sure you want to change the base?
Enabling L2+ Optimizations for EPs #23517
Changes from 7 commits
1d5ca89
e9119d5
b7a0b79
3b28ffc
309341e
d0cbc65
b239db0
a83dd11
372342c
39fa897
627a00a
06ca086
a965ffb
4c2697c
2b81789
0c10cd4
3360dfd
5f7da9f
e610bc8
e95f2c3
bad19b9
d4968cb
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of wiring in the GraphTransformerManager would it be better to add a new standalone class that provides lookup-based access to a set of L2 optimizers that are directly usable by an EP? That would decouple the GraphPartitioner from the optimizer registration/lookup a bit more. I don't think GraphTransformerManager is providing value in this case as I don't think we need to loop.
The registration/lookup class for re-usable optimizers could be a singleton with a static 'create' method that calls GenerateTransformersForEP and saves the returned list. We could have InferenceSession call the 'create' method to provide any other required things like the CPU allocator, and potentially apply the optimization level when doing so if we want to do that on the ORT side. GraphPartitioner could call a static 'get' method to get the instance so we don't need to wire it through from the inference session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new standalone class called
GraphOptimizerRegistry
to mainly provide registration and lookup access.This lookup class is a singleton and its 'create' method is being called during InferenceSession initialization and it calls GenerateTransformersForEP and saves the returned list. BTW, not sure this 'create' function needs to be static?
Given that this lookup instance is a singleton using static variable, provider bridge now can access this instance, which means we don't need to get the instance at graph partitioner and feed to GetCapability. Also, we don't need change GetCapability's signature to add a new parameter (i.e. GraphOptimizerRegistry& ).