For an operator to be eligible for fusion, it must meet the following conditions:
- It has only one input, excluding `Constant` and `initializer` type tensors.
- It has only one output.
- The first dimension of both input and output shapes is annotated with "batch_size".
Therefore, we must first perform more accurate shape inference, i.e., symbolic shape inference. Run the following command:

```shell
python ./tools/symbolic_shape_infer.py --input [input model path] --output [output model path]
```
- Download the onnxruntime project from https://github.com/microsoft/onnxruntime and build it from source. First, clone the repository and apply the patch:

```shell
git clone https://github.com/microsoft/onnxruntime.git
cd onnxruntime
git apply ./runtime/ort/changes.patches
```
- Install the Python package:

```shell
pip install -e .
```
We have currently implemented custom CPU ops (`Merge` and `Route`) for onnxruntime.
In the `./example/micro` directory, follow these instructions to run the microbenchmark:

```shell
cd example/micro
python generate.py
./convert.sh
python fuse.py --num 2
python fuse.py
python test_runtime.py
```
In the `./example/transformer` directory, follow these instructions to test the functionality. We use two decoder layers of the LLaMA model and its LoRA variant as our test models:

```shell
cd example/transformer
python generate.py
./convert.sh
python fuse.py
python test_runtime.py
```
- Generalize input assumptions to handle multiple inputs.
- Refactor the single Route Op into multiple specialized Route Ops.
- Fix height = 256 and width = 256 to observe the effect.