Merge pull request #25 from Xilinx/dev

finn-examples 0.0.4
Xilinx · Nov 5, 2021 · b04ff19
2 parents 9fb0f74 + 8bef7b2
Showing 33 changed files with 2,537 additions and 353 deletions.
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
@@ -3,9 +3,7 @@
 
 name: Upload Python Package
 
-on:
-  release:
-    types: [created]
+on: workflow_dispatch
 
 jobs:
   deploy:

diff --git a/.gitignore b/.gitignore
@@ -13,4 +13,6 @@ build/finn
 *.link_summary
 *.onnx
 *.pyc
+output_*/
+release/
 *.egg-info
diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
-## <img src="docs/img/finn-logo.png" width=128/> Dataflow Accelerator Examples
+## <img src=https://raw.githubusercontent.com/Xilinx/finn/github-pages/docs/img/finn-logo.png width=128/> Dataflow Accelerator Examples
 *for PYNQ on Zynq and Alveo*
 
 <img align="left" src="docs/img/finn-example.png" alt="drawing" style="margin-right: 20px" width="250"/>
@@ -42,7 +42,7 @@ Retrieve the example Jupyter notebooks using the PYNQ get-notebooks command:
 
 ```shell
 # on PYNQ boards, first cd /home/xilinx/jupyter_notebooks
-pynq get-notebooks --from-package finn-examples -p .
+pynq get-notebooks --from-package finn-examples -p . --force
 ```
 
 You can now navigate the provided Jupyter notebook examples, or just use the
@@ -67,6 +67,11 @@ dummy_out = accel.execute(dummy_in)
 | <img src="docs/img/mnist.jpg" width="150"/><br/><br>MNIST       | 3-layer fully-connected | several variants:<br>1/2-bit weights/activations           | all              |
 | <img src="docs/img/imagenet.jpg" width="150"/><br/><br>ImageNet | MobileNet-v1            | 4-bit weights and activations<br>8-bit first layer weights | Alveo U250<br>ZCU104       |
 | <img src="docs/img/imagenet.jpg" width="150"/><br/><br>ImageNet | ResNet-50            | 1-bit weights 2-bit activations<br>4-bit residuals<br>8-bit first/last layer weights | Alveo U250       |
+| <img src="docs/img/radioml.png" width="150"/><br/><br>RadioML 2018 | 1D CNN (VGG10)     |  4-bit weights and activations | ZCU104  |
+| <img src="docs/img/maskedfacenet.jpg" width="150"/><br/><br>MaskedFace-Net | [BinaryCoP](https://arxiv.org/pdf/2102.03456)<br/>*Contributed by TU Munich+BMW*  | 1-bit weights and activations | Pynq-Z1       |
+| <img src="docs/img/keyword-spotting.png" width="150"/><br/><br>Google Speech Commands v2 | 3-layer fully-connected  | 3-bit weights and activations | Pynq-Z1       |
+
+We welcome community contributions to add more examples to this repo!
 
 ## Supported Boards
 

diff --git a/build/get-finn.sh b/build/get-finn.sh
@@ -30,7 +30,7 @@
 # URL for git repo to be cloned
 REPO_URL=https://github.com/Xilinx/finn
 # commit hash for repo
-REPO_COMMIT=c8be5048a7f1647f7c72be7c7cd158e851d47a86
+REPO_COMMIT=d1cc9cf94f1c33354cc169c5a6517314d0e94e3b
 # directory (under the same folder as this script) to clone to
 REPO_DIR=finn
 

diff --git a/build/kws/README.md b/build/kws/README.md
@@ -0,0 +1,24 @@
+# The KWS example
+
+The KWS example includes an MLP for the Google SpeechCommandsV2 dataset.
+
+## Build bitfiles for the KWS example
+
+The build is currently configured for the PYNQ-Z1 board and a throughput of 200k FPS at a clock frequency of 100 MHz.
+
+1. Download the pretrained MLP ONNX models and pre-processed validation data using the `get-kws-data-model.sh` script.
+
+2. Launch the build as follows:
+```shell
+# update this according to where you cloned this repo:
+FINN_EXAMPLES=/path/to/finn-examples
+# cd into finn submodule
+cd $FINN_EXAMPLES/build/finn
+# launch the build on the kws folder
+bash run-docker.sh build_custom $FINN_EXAMPLES/build/kws
+```
+
+3. The generated outputs will be under `kws/<timestamp>_output_<onnx_file_name>_<platform>`. 
+You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
+The folder will additionally include the quantized inputs for verification (`all_validation_KWS_data_inputs_len_10102.npy`) and the expected outputs (`all_validation_KWS_data_outputs_len_10102.npy`).
+When running the network on hardware, validation should achieve an accuracy of 89.78%, with 9070 of the 10102 samples classified correctly.
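Reviewer note: the accuracy quoted at the end of this README follows directly from the two sample counts. The sketch below reproduces that arithmetic in plain Python. The label lists are synthetic stand-ins built to match the stated counts, not the real validation data, and `top1_accuracy` is a hypothetical helper name.

```python
def top1_accuracy(predicted, expected):
    """Return (number of matching labels, accuracy in percent)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct, 100.0 * correct / len(expected)


# Synthetic stand-in data: 10102 labels, of which the first 9070 match.
expected = [0] * 10102
predicted = [0] * 9070 + [1] * (10102 - 9070)

correct, accuracy = top1_accuracy(predicted, expected)
print(f"{correct} of {len(expected)} correct, accuracy {accuracy:.2f}%")
# prints: 9070 of 10102 correct, accuracy 89.78%
```

The same comparison could be run on the class indices returned by the accelerator against the exported label file, assuming both are flattened to one label per sample.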
diff --git a/build/kws/build.py b/build/kws/build.py
@@ -0,0 +1,159 @@
+# Copyright (c) 2021, Xilinx
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+# * Redistributions of source code must retain the above copyright notice, this
+#   list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright notice,
+#   this list of conditions and the following disclaimer in the documentation
+#   and/or other materials provided with the distribution.
+#
+# * Neither the name of FINN nor the names of its
+#   contributors may be used to endorse or promote products derived from
+#   this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import finn.builder.build_dataflow as build
+import finn.builder.build_dataflow_config as build_cfg
+
+from finn.core.modelwrapper import ModelWrapper
+from finn.builder.build_dataflow_config import DataflowBuildConfig
+from finn.transformation.insert_topk import InsertTopK
+from finn.builder.build_dataflow_steps import build_dataflow_step_lookup
+import time
+import finn.core.onnx_exec as oxe
+import numpy as np
+import datetime
+from glob import glob
+
+
+# Inject the preprocessing step into FINN to enable json serialization later on
+def step_preprocess(model: ModelWrapper, cfg: DataflowBuildConfig):
+    model = model.transform(InsertTopK(k=1))
+    return model
+
+
+build_dataflow_step_lookup["step_preprocess_InsertTopK"] = step_preprocess
+
+estimate_steps = ["step_preprocess_InsertTopK"] + build_cfg.estimate_only_dataflow_steps
+estimate_outputs = [build_cfg.DataflowOutputType.ESTIMATE_REPORTS]
+build_steps = ["step_preprocess_InsertTopK"] + build_cfg.default_build_dataflow_steps
+build_outputs = [
+    build_cfg.DataflowOutputType.ESTIMATE_REPORTS,
+    build_cfg.DataflowOutputType.STITCHED_IP,
+    build_cfg.DataflowOutputType.PYNQ_DRIVER,
+    build_cfg.DataflowOutputType.BITFILE,
+    build_cfg.DataflowOutputType.DEPLOYMENT_PACKAGE,
+]
+verification_steps = [
+    build_cfg.VerificationStepType.QONNX_TO_FINN_PYTHON,
+    build_cfg.VerificationStepType.TIDY_UP_PYTHON,
+    build_cfg.VerificationStepType.STREAMLINED_PYTHON,
+    build_cfg.VerificationStepType.FOLDED_HLS_CPPSIM,
+]
+
+model_name = (
+    "MLP_W3A3_python_speech_features_pre-processing_QONNX"
+)
+model_file = model_name + ".onnx"
+
+# Change the ONNX opset from version 9 to 11, which adds support for the TopK node
+from finn.core.modelwrapper import ModelWrapper
+model = ModelWrapper(model_file)
+model.model.opset_import[0].version = 11
+model_file = model_file.replace(".onnx", "_opset-11.onnx")
+model.save(model_file)
+
+platform_name = "Pynq-Z1"
+output_dir = f"{time.time():.2f}_output_{model_name.replace('/','_')}_{platform_name}"
+
+# Configure build
+cfg = build_cfg.DataflowBuildConfig(
+    # steps=estimate_steps, generate_outputs=estimate_outputs,
+    verify_steps=verification_steps,
+    steps=build_steps,
+    generate_outputs=build_outputs,
+    output_dir=output_dir,
+    target_fps=200000,
+    synth_clk_period_ns=10.0,
+    board=platform_name,
+    shell_flow_type=build_cfg.ShellFlowType.VIVADO_ZYNQ,
+    save_intermediate_models=True,
+    stitched_ip_gen_dcp=True,
+    verify_save_full_context=True,
+)
+# Build the model
+build.build_dataflow_cfg(model_file, cfg)
+
+# Save Build config
+config_json_path = f"{output_dir}/DataflowBuildConfig.json"
+with open(config_json_path, "w") as f:
+    f.write(cfg.to_json())
+print(f"Saved DataflowBuildConfig to: {config_json_path}")
+
+# Export quantized inputs
+print("Quantizing validation dataset.")
+parent_model = ModelWrapper(output_dir + "/intermediate_models/dataflow_parent.onnx")
+input_shape = (1, 1, 10, 49)
+last_node = parent_model.graph.node[-2]
+
+for f_name in glob("*.npz"):
+    print(f"Processing file: {f_name}")
+
+    with open(f_name, "rb") as f:
+        np_f = np.load(f)
+        data_arr = np_f["data_arr"]
+        label_arr = np_f["label_arr"]
+
+    pre_processed_inputs = []
+    start_time = time.time()
+    for i in range(len(data_arr)):
+        input_tensor_finn = data_arr[i].reshape(input_shape)
+
+        # Execute with FINN-ONNX
+        input_dict = {parent_model.graph.input[0].name: input_tensor_finn}
+        output_dict = oxe.execute_onnx(
+            parent_model,
+            input_dict,
+            True,
+            end_node=last_node,
+        )
+        finn_output = output_dict[last_node.output[0]]
+        pre_processed_inputs.append(finn_output)
+
+        diff_time = time.time() - start_time
+        time_per_sample = diff_time / (i + 1)
+        time_left = (len(data_arr) - (i + 1)) * time_per_sample
+        time_left = datetime.timedelta(seconds=time_left)
+        print(
+            f"Processed: {100*(i+1)/len(data_arr):.1f} [%], "
+            f"time left: {str(time_left)}",
+            end="\r",
+        )
+    print()
+
+    # Make compatible with FINN driver
+    pre_processed_inputs = np.asarray(pre_processed_inputs)
+    pre_processed_inputs = np.squeeze(pre_processed_inputs)
+    pre_processed_inputs = pre_processed_inputs.astype(np.int8)
+
+    # Save data
+    export_path = output_dir + "/" + f_name.replace(".npz", "_{}_len_{}.npy")
+    print(f"Saving data to: {export_path}")
+    np.save(
+        export_path.format("inputs", len(pre_processed_inputs)), pre_processed_inputs
+    )
+    np.save(export_path.format("outputs", len(label_arr)), label_arr)
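Reviewer note: the export step at the end of `build.py` builds one path template per `.npz` file and fills it twice, once for the inputs and once for the labels. A minimal sketch of that naming scheme, using the same `replace` and `format` calls as the script. The directory `out`, the file name `validation.npz`, and the helper `export_paths` are hypothetical and chosen only for illustration.

```python
def export_paths(output_dir, npz_name, n_inputs, n_labels):
    """Derive the input and output .npy paths from an .npz file name."""
    # Same template construction as in build.py: swap the .npz suffix
    # for a two-slot placeholder, then fill in kind and sample count.
    template = output_dir + "/" + npz_name.replace(".npz", "_{}_len_{}.npy")
    return (
        template.format("inputs", n_inputs),
        template.format("outputs", n_labels),
    )


inputs_path, outputs_path = export_paths("out", "validation.npz", 10102, 10102)
print(inputs_path)   # out/validation_inputs_len_10102.npy
print(outputs_path)  # out/validation_outputs_len_10102.npy
```

Because the template is filled with `str.format`, an output directory or file name containing literal braces would need escaping. That case does not arise with the names used in this build.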
diff --git a/build/kws/expected_output.npy b/build/kws/expected_output.npy
diff --git a/build/kws/get-kws-data-model.sh b/build/kws/get-kws-data-model.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+# Copyright (c) 2020, Xilinx
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+# * Redistributions of source code must retain the above copyright notice, this
+#   list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright notice,
+#   this list of conditions and the following disclaimer in the documentation
+#   and/or other materials provided with the distribution.
+#
+# * Neither the name of FINN nor the names of its
+#   contributors may be used to endorse or promote products derived from
+#   this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# Download validation data and model
+wget https://github.com/Xilinx/finn-examples/releases/download/kws/python_speech_preprocessing_all_validation_KWS_data.npz
+wget https://github.com/Xilinx/finn-examples/releases/download/kws/MLP_W3A3_python_speech_features_pre-processing_QONNX.onnx
diff --git a/build/kws/input.npy b/build/kws/input.npy
diff --git a/build/mobilenet-v1/custom_steps.py b/build/mobilenet-v1/custom_steps.py
@@ -36,7 +36,7 @@
 import finn.transformation.streamline.reorder as reorder
 from finn.transformation.infer_data_layouts import InferDataLayouts
 from finn.transformation.streamline.collapse_repeated import CollapseRepeatedMul
-from finn.transformation.streamline.remove import RemoveIdentityOps
+from finn.transformation.remove import RemoveIdentityOps
 from finn.transformation.streamline.round_thresholds import RoundAndClipThresholds
 from finn.transformation.lower_convs_to_matmul import LowerConvsToMatMul
 from finn.transformation.general import (
@@ -78,6 +78,7 @@ def step_mobilenet_streamline(model: ModelWrapper, cfg: DataflowBuildConfig):
 def step_mobilenet_lower_convs(model: ModelWrapper, cfg: DataflowBuildConfig):
     model = model.transform(LowerConvsToMatMul())
     model = model.transform(absorb.AbsorbTransposeIntoMultiThreshold())
+    model = model.transform(absorb.AbsorbConsecutiveTransposes())
     model = model.transform(GiveUniqueNodeNames())
     model = model.transform(GiveReadableTensorNames())
     model = model.transform(InferDataTypes())

diff --git a/build/resnet50/custom_steps.py b/build/resnet50/custom_steps.py
@@ -52,6 +52,7 @@
     FactorOutMulSignMagnitude,
     Absorb1BitMulIntoMatMul,
     Absorb1BitMulIntoConv,
+    AbsorbConsecutiveTransposes,
 )
 
 from finn.transformation.streamline.collapse_repeated import (
@@ -80,7 +81,7 @@
     )
 
 from finn.transformation.double_to_single_float import DoubleToSingleFloat   
-from finn.transformation.streamline.remove import RemoveIdentityOps
+from finn.transformation.remove import RemoveIdentityOps
 from finn.core.datatype import DataType
 
 from finn.transformation.infer_shapes import InferShapes
@@ -178,6 +179,7 @@ def step_resnet50_streamline_linear(model: ModelWrapper, cfg: DataflowBuildConfi
         AbsorbMulIntoMultiThreshold(),
         Absorb1BitMulIntoMatMul(),
         Absorb1BitMulIntoConv(),
+        RoundAndClipThresholds(),
     ]
     for trn in streamline_transformations:
         model = model.transform(trn)
@@ -213,7 +215,7 @@ def step_resnet50_streamline(model: ModelWrapper, cfg: DataflowBuildConfig):
 
 
 def step_resnet50_convert_to_hls(model: ModelWrapper, cfg: DataflowBuildConfig):
-    model.set_tensor_datatype(model.graph.input[0].name, DataType.UINT8)
+    model.set_tensor_datatype(model.graph.input[0].name, DataType["UINT8"])
     model = model.transform(InferDataLayouts())
 
     try:
@@ -239,8 +241,7 @@ def step_resnet50_convert_to_hls(model: ModelWrapper, cfg: DataflowBuildConfig):
         AbsorbConsecutiveTransposes,
         to_hls.InferConvInpGen,
         to_hls.InferDuplicateStreamsLayer,
-        to_hls.InferLabelSelectLayer,
-
+        to_hls.InferLabelSelectLayer
     ]
     for trn in to_hls_transformations:
         model = model.transform(trn())
@@ -307,6 +308,13 @@ def step_resnet50_set_fifo_depths(model: ModelWrapper, cfg: DataflowBuildConfig)
         model, cfg.output_dir + "/final_hw_config.json", hw_attrs
     )
 
+    # after FIFOs are ready to go, call PrepareIP and HLSSynthIP again
+    # this will only run for the new nodes (e.g. FIFOs and DWCs)
+    model = model.transform(
+        PrepareIP(cfg._resolve_fpga_part(), cfg._resolve_hls_clk_period())
+    )
+    model = model.transform(HLSSynthIP())
+    model = model.transform(ReplaceVerilogRelPaths())
     return model
 
 

diff --git a/build/vgg10-radioml/README.md b/build/vgg10-radioml/README.md
@@ -0,0 +1,33 @@
+# VGG10
+
+This 1-dimensional CNN was [introduced](https://arxiv.org/pdf/1712.04578.pdf) by DeepSig alongside their RadioML 2018 dataset for RF modulation classification.
+It consists of 7 1D convolution + maxpooling layers, followed by 2 hidden dense layers and the final dense classification layer. ReLU activations and Batchnorm are applied throughout the network. The input is a frame of 1024 I/Q samples (i.e. shape [1024,2]); the classifier distinguishes 24 classes (i.e. modulation types).
+
+Here, we use a reduced-precision implementation trained on [RadioML 2018.01A](https://www.deepsig.ai/datasets) with Brevitas. The weights and activations are quantized to 4 bits. The number of filters in the convolution layers has been reduced from 64 to 32. The pre-trained model reaches 55.9% overall accuracy and 87.9% at the highest SNR (30 dB). At 250 MHz, the accelerator reaches ~230k frames/s (236M samples/s) with the supplied folding configuration.
+
+## Build bitfiles for VGG10
+
+Due to the 1-dimensional topology in VGG10 we use a specialized build script that adds a few custom build steps to the standard steps in FINN.
+**We currently provide bitstreams and the corresponding folding configuration only for the ZCU104, but plan to extend to other boards in the future.**
+
+0. Ensure you have performed the *Setup* steps in the top-level README for setting up the FINN requirements and environment variables.
+
+1. Run the `download_vgg10.sh` script under the `models` directory to download the pretrained VGG10 ONNX model. You should have e.g. `vgg10-radioml/models/radioml_w4a4_small_tidy.onnx` as a result.
+
+2. Launch the build as follows:
+```shell
+# update this according to where you cloned this repo:
+FINN_EXAMPLES=/path/to/finn-examples
+# cd into finn submodule
+cd $FINN_EXAMPLES/build/finn
+# launch the build on the vgg10-radioml folder
+./run-docker.sh build_custom $FINN_EXAMPLES/build/vgg10-radioml
+```
+
+3. The generated outputs will be under `vgg10-radioml/output_<topology>_<board>`. You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
+
+## Where did the ONNX model files come from?
+
+The quantized VGG10 is based on the baseline topology for our problem statement in the ITU AI/ML in 5G Challenge. You can find it in our [sandbox repository](https://github.com/Xilinx/brevitas-radioml-challenge-21).
+
+In addition, the ONNX model has been tidied up by removing the input quantization, which we do in software for this example, and by adding a top-k (k=1) node at the output. Thus, the accelerator returns the top-1 class index instead of logits.
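Reviewer note: the last paragraph says the accelerator returns the top-1 class index rather than logits. The sketch below shows in plain Python what a top-k (k=1) reduction does to a logit vector. It illustrates the idea only and is not the FINN or ONNX implementation. `top1_index` and the example values are hypothetical, and the tie-breaking rule is a property of this sketch alone.

```python
def top1_index(logits):
    """Return the index of the largest logit (lowest index wins ties here)."""
    if not logits:
        raise ValueError("logits must not be empty")
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical logits for a 4-class toy problem
# (the VGG10 classifier distinguishes 24 classes).
logits = [0.1, 2.5, -1.0, 2.4]
print(top1_index(logits))  # 1
```

With this reduction in the graph, software on the host only has to compare one integer per frame against the label instead of post-processing a full logit vector.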