🚘 Auto opt cli #1343
Conversation
olive/cli/auto_opt.py
Outdated
search_strategy_group = sub_parser.add_argument_group("search strategy options")
search_strategy_group.add_argument(
    "--num-samples", type=int, default=5, help="Number of samples for search algorithm"
Do we really need to expose this in the CLI?
I was thinking that if the search takes a long time, the user can reduce --num-samples to stop the search in time.
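For instance (a hypothetical invocation: the flag comes from the diff above, the model and data config from the PR description), a user could cap a long-running search with:

olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification --num-samples 3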
Let's add the packages "auto-opt" needs under extra dependencies so that users can do pip install olive-ai[auto-opt].
I think it might be hard to define a unified extra-dependencies list for auto-opt, since auto-opt may be used on different devices, which causes conflicts between the different onnxruntime packages. I think we have a feature to dynamically install the required packages.
We can exclude OnnxRuntime and IHV toolkits, but we should be able to install the other required packages using pip install olive-ai[auto-opt].
Updated. I also tested that, after installing olive-ai[auto-opt], the user only needs to install the corresponding onnxruntime or onnxruntime-genai package (the latter only if --use_model_builder is set) based on the device (cpu/gpu, etc.), and the olive auto-opt CLI produces reasonable results.
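As a rough sketch of what such an extra could look like in setup.py (the package names here are assumptions for illustration, not the PR's actual list; OnnxRuntime and IHV toolkits are left out per the discussion above):

# setup.py -- illustrative sketch only, not the PR's change
from setuptools import setup

setup(
    name="olive-ai",
    extras_require={
        # assumed package list; onnxruntime/onnxruntime-genai are excluded
        # so users install the build matching their device themselves
        "auto-opt": ["transformers", "optimum"],
    },
)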
What models will be supported by auto-opt? With the following config:
{
"input_model":{
"type": "HfModel",
"model_path": "microsoft/phi-3.5-mini-instruct",
"task": "text-generation"
},
"systems": {
"local_system": {
"type": "LocalSystem",
"accelerators": [
{
"device": "cpu",
"execution_providers": [
"CPUExecutionProvider"
]
}
]
}
},
"auto_optimizer_config": {
"opt_level": 0,
"disable_auto_optimizer": false,
"precision": "int4"
},
"host": "local_system",
"target": "local_system",
"cache_dir": "cache",
"output_dir" : "models"
}
I get an error message on the OrtTransformerOptimizer pass:
If these are the only models supported, then it is a little underwhelming because they are pretty outdated. It is also a bit odd because the optimizer works for optimum models when I run it.
As a general rule, a user should be able to plug in:
The user should not need to worry about adding data.
The list and the error message are from OnnxRuntime's transformer optimizer.
Models supported by the transformer optimizer are listed at https://github.com/microsoft/onnxruntime/blob/f4d62eeb2e058e2c7b5de0eaa9599368d32b23d5/onnxruntime/python/tools/transformers/optimizer.py#L50
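For context, a minimal sketch of calling that optimizer directly (paths and model dimensions are placeholders); it only accepts the model types from the list linked above, which is why unsupported architectures fail in the OrtTransformerOptimizer pass:

# illustrative sketch: ORT's transformer optimizer invoked outside of Olive
from onnxruntime.transformers import optimizer

opt_model = optimizer.optimize_model(
    "model.onnx",       # placeholder path
    model_type="bert",  # must be one of the supported types linked above
    num_heads=12,       # bert-base values; placeholders
    hidden_size=768,
)
opt_model.save_model_to_file("model_opt.onnx")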
olive/cli/auto_opt.py
Outdated
# output options
output_group = sub_parser.add_argument_group("output options")
output_group.add_argument(
Not related to this PR, but we could move this into base.py as well.
Good call. But I saw the argument attributes differ: some are required, some are not, and they have different default values.
Can we update it in a follow-up PR?
device = (
    "gpu"
    if self.args.providers
    and any(p[: -(len("ExecutionProvider"))] in ["CUDA", "Tensorrt", "Dml"] for p in self.args.providers)
system_group.add_argument(
"--providers",
type=str,
nargs="*",
choices=["CPU", "CUDA", "Tensorrt", "Dml", "VitisAI", "Qnn"],
help="List of execution providers to use for optimization",
)
The providers here are passed without the ExecutionProvider suffix?
Should device be set to cpu if VitisAI and Qnn are provided here?
Should device be set to cpu if VitisAI and Qnn are provided here?
I think yes. For VitisAI/QNN, we can run quantization on cpu, then run inference on the model with the corresponding EP.
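A small sketch of the device-selection rule this thread converges on (the names here are illustrative, not the PR's code): CUDA/Tensorrt/Dml imply a gpu target, while CPU, VitisAI, and Qnn keep the optimization passes on cpu and only use their EP at inference time.

# illustrative sketch, not the PR's implementation
GPU_PROVIDERS = {"CUDA", "Tensorrt", "Dml"}

def infer_device(providers):
    # VitisAI/Qnn fall through to "cpu": quantization runs on cpu and the
    # model is later executed with the corresponding EP
    return "gpu" if any(p in GPU_PROVIDERS for p in providers or []) else "cpu"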
    and any(p[: -(len("ExecutionProvider"))] in ["CUDA", "Tensorrt", "Dml"] for p in self.args.providers)
    else "cpu"
)
providers = self.args.providers or ["CPUExecutionProvider"] if device == "cpu" else ["CUDAExecutionProvider"]
Same as above
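One note on how Python parses the one-liner above: or binds tighter than the conditional expression, so the line is equivalent to the first form below; if user-specified providers should also survive on gpu, the second form (an assumption about intent, not the PR's code) makes that explicit.

# what the one-liner actually evaluates to
providers = (
    (self.args.providers or ["CPUExecutionProvider"])
    if device == "cpu"
    else ["CUDAExecutionProvider"]
)
# explicit alternative that keeps user-specified providers on gpu too
providers = self.args.providers or (
    ["CPUExecutionProvider"] if device == "cpu" else ["CUDAExecutionProvider"]
)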
Describe your changes
Auto-opt CLI.
e.g., for a BERT model, we can optimize a model from Hugging Face and produce an ONNX model with:
olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification
olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification --precision int4 --providers CPU
olive auto-opt --model Intel/bert-base-uncased-mrpc --data_config_path data_config.json --task text-classification --precision fp16 --providers CUDA
# use model builder
olive auto-opt --model microsoft/Phi-3-mini-4k-instruct --precision fp16 --providers CUDA --use_model_builder
Checklist before requesting a review
- Run lintrunner -a
(Optional) Issue link