Deeploy is an ONNX-to-C compiler that generates low-level optimized C Code for multi-cluster, heterogeneous SoCs. Its goal is to enable configurable deployment flows from a bottom-up compiler perspective, modeling target hardware in a fine-grained and modular manner.
Deeploy is developed as part of the PULP project, a joint effort between ETH Zurich and the University of Bologna.
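To give a flavor of what "bottom-up, template-based code generation" means, here is a toy sketch (this is NOT Deeploy's actual API; all names are illustrative): each operator type maps to a C template, and lowering a node renders that template with the node's shapes and buffer names.

```python
# Toy sketch of template-based ONNX-to-C lowering (illustrative only,
# not Deeploy's real API). Each op type owns a C code template; a real
# compiler would pick templates per target platform and data type.
C_TEMPLATES = {
    "Add": "for (int i = 0; i < {size}; i++) {{ {out}[i] = {a}[i] + {b}[i]; }}",
}

def lower_node(op_type: str, **kwargs) -> str:
    """Render the C template for one ONNX-style node."""
    return C_TEMPLATES[op_type].format(**kwargs)

# Lower a hypothetical elementwise Add over 64 elements.
code = lower_node("Add", size=64, out="y", a="x0", b="x1")
```

In the real compiler, the template choice is where hardware modeling comes in: the same ONNX node can lower to scalar C, SIMD intrinsics, or an accelerator call depending on the target.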
You can find the documentation at the following links:
A DeepWiki-generated documentation is available here.
Download the repository and its submodules:
git clone https://github.com/pulp-platform/Deeploy.git && cd Deeploy
git submodule update --init --recursive
Installing Deeploy is as simple as running:
pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com
However, to run the code generated by Deeploy on a certain target, you need the toolchains and simulators associated with that platform.
We provide a Docker container where Deeploy works out of the box (i.e., with all dependencies pre-installed). To pull the Docker image, run:
docker pull ghcr.io/pulp-platform/deeploy:main
Then you can create and start the container in interactive mode with:
docker run -it --name deeploy_main -v $(pwd):/app/Deeploy ghcr.io/pulp-platform/deeploy:main
Install Deeploy inside the container in editable mode:
cd Deeploy
pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com
Congratulations, you installed Deeploy and its dependencies! Now, to test your installation, let's run a simple test on each platform with the following commands:
cd DeeployTest
python testRunner_generic.py -t Tests/Adder
python testRunner_cortexm.py -t Tests/Adder
python testRunner_mempool.py -t Tests/Adder
python testRunner_snitch.py -t Tests/Adder
python testRunner_siracusa.py -t Tests/Adder --cores=8
python testRunner_snitch.py -t Tests/Adder --cores=9
python testRunner_softhier.py -t Tests/Adder --toolchain=GCC
python testRunner_chimera.py -t Tests/Adder
To restart and connect to the container, run:
docker start -i deeploy_main
cd Deeploy
You can find the ONNX file in DeeployTest/Tests/Adder; to visualize it, you can use Netron. You can also find the generated code for platform X in TEST_X in DeeployTest, and you should notice that the generated code for the Adder test is very simple. However, code generation gets more complex once tiling is involved. Let's generate the code for a single layer, this time using tiling:
python testRunner_tiled_siracusa.py -t Tests/testMatMul --cores=8 --l1=16000
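The --l1 flag caps the L1 scratchpad budget (here 16 kB), which is what forces tiling: a MatMul whose operands do not fit must be split into tiles that do. As a rough illustration (this is not Deeploy's tiler, which solves a constraint problem per operator; all names and the double-buffering assumption are ours), one can compute the largest row-tile of the output that fits the budget:

```python
def max_row_tile(m: int, n: int, k: int, elem_bytes: int,
                 l1_bytes: int, double_buffer: bool = True) -> int:
    """Largest number of output rows per tile such that the A-tile (rows x k),
    the full B (k x n), and the C-tile (rows x n) fit in the L1 budget.
    Illustrative only; a real tiler also accounts for alignment, DMA
    granularity, and per-kernel constraints."""
    buffers = 2 if double_buffer else 1  # double buffering doubles the footprint
    for rows in range(m, 0, -1):
        need = buffers * elem_bytes * (rows * k + k * n + rows * n)
        if need <= l1_bytes:
            return rows
    return 0  # nothing fits; the operator must be split along another axis

# A 64x64x64 int8 MatMul under a 16 kB L1 budget fits 30 output rows per tile.
rows = max_row_tile(64, 64, 64, elem_bytes=1, l1_bytes=16000)
```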
Now you can open the generated code in DeeployTest/TEST_SIRACUSA/Tests/testMatMul/Network.c and see how we executed a tiled layer.
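The generated Network.c is platform-specific, but the pattern it implements can be sketched in plain C: copy an input tile into a small L1 buffer, compute on it, and copy the result tile back to L2. In the sketch below, memcpy stands in for the cluster's DMA transfers, and all names, sizes, and the tiling scheme are illustrative, not Deeploy's actual generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define M 8
#define N 8
#define K 8
#define TILE_M 2  /* rows of A/C per tile, sized to fit the L1 budget */

/* Stand-in for a DMA transfer; real generated code calls platform drivers. */
static void dma_copy(void *dst, const void *src, size_t n) {
  memcpy(dst, src, n);
}

/* Compute `rows` output rows of C = A * B (int8 inputs, int32 accumulators). */
static void matmul_tile(const int8_t *a, const int8_t *b, int32_t *c, int rows) {
  for (int i = 0; i < rows; ++i)
    for (int j = 0; j < N; ++j) {
      int32_t acc = 0;
      for (int k = 0; k < K; ++k) acc += a[i * K + k] * b[k * N + j];
      c[i * N + j] = acc;
    }
}

/* Tiled execution: stream row-tiles of A through an L1-sized scratchpad. */
static void tiled_matmul(const int8_t *A, const int8_t *B, int32_t *C) {
  static int8_t l1_a[TILE_M * K];  /* stand-in for the L1 scratchpad */
  static int32_t l1_c[TILE_M * N];
  for (int t = 0; t < M; t += TILE_M) {
    dma_copy(l1_a, &A[t * K], sizeof l1_a);  /* DMA-in the input tile   */
    matmul_tile(l1_a, B, l1_c, TILE_M);      /* compute on the tile     */
    dma_copy(&C[t * N], l1_c, sizeof l1_c);  /* DMA-out the result tile */
  }
}
```

Real generated code additionally double-buffers the tiles, so the DMA of tile t+1 overlaps with the computation on tile t.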
| Platform | Hardware | Simulator |
|---|---|---|
| Generic CPU | Your laptop CPU :) | Host |
| CortexM Processors | Documentation | QEMU |
| MemPool + ITA | MemPool paper, ITA paper | Banshee |
| Siracusa | Siracusa paper | GVSoC |
| Snitch Cluster | Snitch paper | GVSoC |
| SoftHier | Repo | GVSoC |
| Chimera | Repo | GVSoC |
If you use Deeploy in your work or research, you can cite us with:
ESWEEK 2024: Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
@article{schererDeeployEnablingEnergyEfficient2024,
title = {Deeploy: {{Enabling Energy-Efficient Deployment}} of {{Small Language Models}} on {{Heterogeneous Microcontrollers}}},
shorttitle = {Deeploy},
author = {Scherer, Moritz and Macan, Luka and Jung, Victor J. B. and Wiese, Philip and Bompani, Luca and Burrello, Alessio and Conti, Francesco and Benini, Luca},
year = {2024},
month = nov,
journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
volume = {43},
number = {11},
pages = {4009--4020},
issn = {1937-4151},
doi = {10.1109/TCAD.2024.3443718},
}
The preprint is available on arXiv @ arXiv:2408.04413.
IEEE Design & Test: Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow
@article{WieseTowardAttentionBasedTinyML2025,
author={Wiese, Philip and İslamoğlu, Gamze and Scherer, Moritz and Macan, Luka and Jung, Victor J.B. and Burrello, Alessio and Conti, Francesco and Benini, Luca},
journal={IEEE Design & Test},
title={Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow},
year={2025},
pages={1-1},
keywords={Tiny machine learning;Transformers;Memory management;Hardware acceleration;Bandwidth;Registers;Software;Engines;Energy efficiency;Computational modeling;Neural Networks;TinyML;Deployment;Transformers;Accelerators},
doi={10.1109/MDAT.2025.3527371}}
The preprint is available on arXiv @ arXiv:2408.02473.
Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All software sources and tool scripts are licensed under Apache 2.0, except for files contained in the scripts directory, which are licensed under the MIT license, and files contained in the DeeployTest/Tests directory, which are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International license (CC BY-ND 4.0).