Merge branch 'main' into qonnx-1p0
jmitrevs authored Oct 24, 2024
Parents: 3ec6c5a + 39d0e91. Commit: 10eb161
Showing 56 changed files with 1,176 additions and 336 deletions.
.pre-commit-config.yaml (10 changes: 5 additions & 5 deletions)
@@ -2,15 +2,15 @@ exclude: (^hls4ml\/templates\/(vivado|quartus)\/(ap_types|ac_types)\/|^test/pyte

 repos:
 - repo: https://github.com/psf/black
-  rev: 24.8.0
+  rev: 24.10.0
   hooks:
   - id: black
     language_version: python3
     args: ['--line-length=125',
            '--skip-string-normalization']

 - repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v4.6.0
+  rev: v5.0.0
   hooks:
   - id: check-added-large-files
   - id: check-case-conflict
@@ -30,13 +30,13 @@ repos:
     args: ["--profile", "black", --line-length=125]

 - repo: https://github.com/asottile/pyupgrade
-  rev: v3.17.0
+  rev: v3.18.0
   hooks:
   - id: pyupgrade
     args: ["--py36-plus"]

 - repo: https://github.com/asottile/setup-cfg-fmt
-  rev: v2.5.0
+  rev: v2.7.0
   hooks:
   - id: setup-cfg-fmt

@@ -50,7 +50,7 @@ repos:
            '--extend-ignore=E203,T201'] # E203 is not PEP8 compliant

 - repo: https://github.com/mgedmin/check-manifest
-  rev: "0.49"
+  rev: "0.50"
   hooks:
   - id: check-manifest
     stages: [manual]
docs/advanced/model_optimization.rst (18 changes: 9 additions & 9 deletions)
@@ -13,11 +13,11 @@ The code block below showcases three use cases of the hls4ml Optimization API -
     from tensorflow.keras.optimizers import Adam
     from tensorflow.keras.metrics import CategoricalAccuracy
     from tensorflow.keras.losses import CategoricalCrossentropy
-    from hls4ml.optimization.keras import optimize_model
-    from hls4ml.optimization.keras.utils import get_model_sparsity
-    from hls4ml.optimization.attributes import get_attributes_from_keras_model
-    from hls4ml.optimization.objectives import ParameterEstimator
-    from hls4ml.optimization.scheduler import PolynomialScheduler
+    from hls4ml.optimization.dsp_aware_pruning.keras import optimize_model
+    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_sparsity
+    from hls4ml.optimization.dsp_aware_pruning.attributes import get_attributes_from_keras_model
+    from hls4ml.optimization.dsp_aware_pruning.objectives import ParameterEstimator
+    from hls4ml.optimization.dsp_aware_pruning.scheduler import PolynomialScheduler
     # Define baseline model and load data
     # X_train, y_train = ...
     # X_val, y_val = ...
@@ -75,7 +75,7 @@ To optimize GPU FLOPs, the code is similar to above:

 .. code-block:: Python

-    from hls4ml.optimization.objectives.gpu_objectives import GPUFLOPEstimator
+    from hls4ml.optimization.dsp_aware_pruning.objectives.gpu_objectives import GPUFLOPEstimator

     # Optimize model
     # Note the change from ParameterEstimator to GPUFLOPEstimator
@@ -98,7 +98,7 @@ Finally, optimizing Vivado DSPs is possible, given an hls4ml config:
 .. code-block:: Python

     from hls4ml.utils.config import config_from_keras_model
-    from hls4ml.optimization.objectives.vivado_objectives import VivadoDSPEstimator
+    from hls4ml.optimization.dsp_aware_pruning.objectives.vivado_objectives import VivadoDSPEstimator

     # Note the change from optimize_model to optimize_keras_model_for_hls4ml
     # The function optimize_keras_model_for_hls4ml acts as a wrapper for the function, parsing hls4ml config to model attributes
@@ -130,5 +130,5 @@ Note, to ensure DSPs are optimized, "unrolled" Dense multiplication must be used
 .. code-block:: Python

     hls_config = config_from_keras_model(optimized_model)
-    hls_config['Model']['DenseResourceImplementation'] = 'Unrolled'
-    # Any additional hls4ml config, such as strategy, reuse factor etc...
+    hls_config['Model']['Strategy'] = 'Unrolled'
+    # Any additional hls4ml config, reuse factor etc...
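
For orientation, the relocated modules compose roughly as follows. This is a minimal sketch based on the surrounding documentation rather than on this diff: the baseline model and training data are placeholders, and the exact optimize_model argument names and values are assumptions.

    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.losses import CategoricalCrossentropy

    from hls4ml.optimization.dsp_aware_pruning.keras import optimize_model
    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_sparsity
    from hls4ml.optimization.dsp_aware_pruning.attributes import get_attributes_from_keras_model
    from hls4ml.optimization.dsp_aware_pruning.objectives import ParameterEstimator
    from hls4ml.optimization.dsp_aware_pruning.scheduler import PolynomialScheduler

    # baseline_model, X_train, y_train, X_val, y_val are assumed to exist
    model_attributes = get_attributes_from_keras_model(baseline_model)
    scheduler = PolynomialScheduler(5, final_sparsity=0.75)  # hypothetical pruning schedule

    optimized_model = optimize_model(
        baseline_model, model_attributes, ParameterEstimator, scheduler,
        X_train, y_train, X_val, y_val,
        batch_size=128, epochs=10,
        optimizer=Adam(), loss_fn=CategoricalCrossentropy(from_logits=True),
        validation_metric='accuracy', increasing=True, rtol=0.99,
    )

    sparsity, per_layer_sparsity = get_model_sparsity(optimized_model)  # overall and per-layer ratios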
docs/api/configuration.rst (5 changes: 4 additions & 1 deletion)
@@ -135,7 +135,10 @@ For Vivado backend the options are:
 * **IOType**\ : your options are ``io_parallel`` or ``io_stream`` which defines the type of data structure used for inputs, intermediate activations between layers, and outputs. For ``io_parallel``, arrays are used that, in principle, can be fully unrolled and are typically implemented in RAMs. For ``io_stream``, HLS streams are used, which are a more efficient/scalable mechanism to represent data that are produced and consumed in a sequential manner. Typically, HLS streams are implemented with FIFOs instead of RAMs. For more information see `here <https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/pragma-HLS-stream>`__.
 * **HLSConfig**\ : the detailed configuration of precision and parallelism, including:

   * **ReuseFactor**\ : in the case that you are pipelining, this defines the pipeline interval or initiation interval
-  * **Strategy**\ : Optimization strategy on FPGA, either "Latency" or "Resource". If none is supplied then hls4ml uses "Latency" as default. Note that a reuse factor larger than 1 should be specified when using "resource" strategy. An example of using a larger reuse factor can be found `here <https://github.com/fastmachinelearning/models/tree/master/keras/KERAS_dense>`__.
+  * **ParallelizationFactor**\ : The number of output "pixels" to compute in parallel in convolutional layers. Increasing this parameter results in a significant increase in the resources required on the FPGA.
+  * **Strategy**\ : Optimization strategy on FPGA, either "Latency", "Resource" or "Unrolled". If none is supplied then hls4ml uses "Latency" as default. Note that a reuse factor larger than 1 should be specified when using the "resource" or "unrolled" strategy. An example of using a larger reuse factor can be found `here <https://github.com/fastmachinelearning/models/tree/master/keras/KERAS_dense>`__.
+  * **PipelineStyle**\ : Set the top-level pipeline style. Valid options are "auto", "pipeline" and "dataflow". If unspecified, it defaults to "auto".
+  * **PipelineInterval**\ : Optionally override the desired initiation interval of the design. Only valid in combination with the "pipeline" style. If unspecified, it is left to the compiler to decide, ideally matching the largest reuse factor of the network.
 * **Precision**\ : this defines the precision of your inputs, outputs, weights and biases. It is denoted by ``ap_fixed<X,Y>``\ , where ``Y`` is the number of bits representing the signed number above the binary point (i.e. the integer part), and ``X`` is the total number of bits.
   Additionally, integers in fixed precision data type (\ ``ap_int<N>``\ , where ``N`` is a bit-size from 1 to 1024) can also be used. You have a chance to further configure this more finely with per-layer configuration described below.

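Taken together, the documented keys slot into a Python-side configuration along these lines. A hedged sketch: keras_model is a placeholder and the values are purely illustrative; only the option names come from the documentation above.

    from hls4ml.utils.config import config_from_keras_model

    hls_config = config_from_keras_model(keras_model, granularity='model')

    hls_config['Model']['Strategy'] = 'Unrolled'        # "Latency" (default), "Resource" or "Unrolled"
    hls_config['Model']['ReuseFactor'] = 4              # larger than 1, as expected for "Resource"/"Unrolled"
    hls_config['Model']['PipelineStyle'] = 'pipeline'   # "auto" (default), "pipeline" or "dataflow"
    hls_config['Model']['PipelineInterval'] = 4         # only valid with the "pipeline" style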
hls4ml/backends/fpga/fpga_backend.py (4 changes: 3 additions & 1 deletion)
@@ -254,10 +254,12 @@ def get_closest_reuse_factor(self, valid_rf, chosen_rf):
         else:
             return before

-    def set_closest_reuse_factor(self, layer, n_in, n_out, attribute='reuse_factor'):
+    def set_closest_reuse_factor(self, layer, n_in, n_out, attribute='reuse_factor', include_max_rf=True):
         assert attribute is not None, 'Reuse factor attribute cannot be None'

         valid_rf = self.get_valid_reuse_factors(n_in, n_out)
+        if not include_max_rf:
+            valid_rf.pop()
         chosen_rf = layer.get_attr(attribute)
         if chosen_rf not in valid_rf:
             closest_rf = self.get_closest_reuse_factor(valid_rf, chosen_rf)
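The effect of the new flag in isolation, as a standalone sketch (not hls4ml code); the list stands in for the ascending output of get_valid_reuse_factors:

    # The largest valid reuse factor is the last entry of the ascending list;
    # dropping it forces the closest-match search to settle on a smaller one.
    valid_rf = [1, 2, 4, 8, 16]  # illustrative values
    include_max_rf = False
    if not include_max_rf:
        valid_rf.pop()  # valid_rf is now [1, 2, 4, 8]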
hls4ml/backends/vitis/passes/feature_check.py (23 changes: 20 additions & 3 deletions)
@@ -14,7 +14,7 @@ def transform(self, model, node):
         node.set_attr('implementation', 'linebuffer')


-class ValidateStrategy(OptimizerPass):
+class ValidateResourceStrategy(OptimizerPass):
     _resource_layer_cls = ['Conv1D', 'Conv2D', 'Dense']

     def match(self, node):
@@ -29,6 +29,23 @@ def transform(self, model, node):
         if rf > n_in and rf % n_in > 0:
             print(
                 f'WARNING: "Resource" strategy in "{node.name}" ({node.class_name}) may have suboptimal QoR in Vitis '
-                'backend due to use of "urem" cores.\n'
-                'Consider using a different ReuseFactor or switching to "Latency" strategy.'
+                'backend due to use of "urem" cores in Vitis HLS <= 2022.1.\n'
+                'Consider using a different ReuseFactor or switching to "Latency" strategy if using older versions '
+                'of Vitis HLS.'
             )
+
+
+class ValidateResourceUnrolledStrategy(OptimizerPass):
+    _unrolled_layer_cls = ['Conv1D', 'Conv2D', 'Dense', 'GRU', 'LSTM']
+
+    def match(self, node):
+        is_unrolled_layer = len([layer_cls for layer_cls in self._unrolled_layer_cls if layer_cls in node.class_name]) > 0
+        is_unrolled_strategy = node.get_attr('strategy', 'latency').lower() == 'resource_unrolled'
+
+        return is_unrolled_layer and is_unrolled_strategy
+
+    def transform(self, model, node):
+        print(
+            f'WARNING: "ResourceUnrolled" strategy in "{node.name}" ({node.class_name}) may have unexpected II in '
+            'Vitis backend.\nVerify that the final design satisfies the latency/II constraints.'
+        )
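
How the new matcher behaves, as a self-contained sketch with a stub node (hls4ml's real node classes differ). Note the substring test: any class name containing 'Dense', 'Conv2D', etc. qualifies, and the strategy comparison is case-insensitive.

    class StubNode:
        """Stand-in for an hls4ml graph node; illustration only."""
        class_name = 'Dense'

        def get_attr(self, key, default=None):
            # 'resource_unrolled' is the strategy string the pass looks for
            return {'strategy': 'resource_unrolled'}.get(key, default)

    _unrolled_layer_cls = ['Conv1D', 'Conv2D', 'Dense', 'GRU', 'LSTM']
    node = StubNode()
    is_unrolled_layer = any(cls in node.class_name for cls in _unrolled_layer_cls)
    is_unrolled_strategy = node.get_attr('strategy', 'latency').lower() == 'resource_unrolled'
    print(is_unrolled_layer and is_unrolled_strategy)  # True -> transform() would print the warning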
hls4ml/backends/vitis/vitis_backend.py (3 changes: 2 additions & 1 deletion)
@@ -15,7 +15,8 @@ def __init__(self):
     def _register_flows(self):
         validation_passes = [
             'vitis:validate_conv_implementation',
-            'vitis:validate_strategy',
+            'vitis:validate_resource_strategy',
+            'vitis:validate_resource_unrolled_strategy',
         ]
         validation_flow = register_flow('validation', validation_passes, requires=['vivado:init_layers'], backend=self.name)

hls4ml/backends/vivado/passes/convolution_templates.py (44 changes: 44 additions & 0 deletions)
@@ -22,6 +22,8 @@
     typedef {accum_t.name} accum_t;
     typedef {bias_t.name} bias_t;
     typedef {weight_t.name} weight_t;
+    template<class data_T, class res_T, class CONFIG_T>
+    using kernel = nnet::{dense_function}<data_T, res_T, CONFIG_T>;
     template<class x_T, class y_T>
     using product = nnet::product::{product_type}<x_T, y_T>;
 }};\n"""
@@ -100,6 +102,18 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('weight').type.precision
         )

+        if node.get_attr('strategy').lower() == 'latency':
+            mult_params['dense_function'] = 'DenseLatency'
+        elif node.get_attr('strategy').lower() == 'resource':
+            if int(mult_params['reuse_factor']) <= int(mult_params['n_in']):
+                mult_params['dense_function'] = 'DenseResource_rf_leq_nin'
+            else:
+                mult_params['dense_function'] = 'DenseResource_rf_gt_nin_rem0'
+            # The 3rd case is never used
+        elif node.get_attr('strategy').lower() == 'resource_unrolled':
+            mult_params['dense_function'] = f'dense_resource_unrolled_{node.index}'
+
         mult_config = self.mult_template.format(**mult_params)

         return mult_config + '\n' + conv_config
@@ -213,6 +227,18 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('weight').type.precision
         )

+        if node.get_attr('strategy').lower() == 'latency':
+            mult_params['dense_function'] = 'DenseLatency'
+        elif node.get_attr('strategy').lower() == 'resource':
+            if int(mult_params['reuse_factor']) <= int(mult_params['n_in']):
+                mult_params['dense_function'] = 'DenseResource_rf_leq_nin'
+            else:
+                mult_params['dense_function'] = 'DenseResource_rf_gt_nin_rem0'
+            # The 3rd case is never used
+        elif node.get_attr('strategy').lower() == 'resource_unrolled':
+            mult_params['dense_function'] = f'dense_resource_unrolled_{node.index}'
+
         mult_config = self.mult_template.format(**mult_params)

         return mult_config + '\n' + conv_config
@@ -297,6 +323,8 @@ def format(self, node):
             params['scale_index_type'] = 'scale_index_regular'

         params['config_t'] = f'config{node.index}_depthwise_mult'
+        # TODO - Extend unrolled Dense Resource
+        params['unrolled_function'] = 'DenseResourceUnrolled'
         depthwise_config = self.depthwise_template.format(**params)

         # Depthwise mult config
@@ -309,6 +337,9 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('depthwise').type.precision
         )
+        # TODO - Extend unrolled Dense Resource to depthwise Conv1D
+        mult_params['unrolled_function'] = 'DenseResourceUnrolled'
+
         depthwise_mult_config = self.depthwise_mult_template.format(**mult_params)

         # Pointwise config
@@ -338,6 +369,8 @@ def format(self, node):
             params['scale_index_type'] = 'scale_index_regular'

         params['config_t'] = f'config{node.index}_pointwise_mult'
+        # TODO - Extend unrolled Dense Resource
+        params['unrolled_function'] = 'DenseResourceUnrolled'
         pointwise_config = self.pointwise_template.format(**params)

         # Pointwise mult config
@@ -350,6 +383,9 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('pointwise').type.precision
         )
+        # TODO - Extend unrolled Dense Resource to separable Conv1D
+        mult_params['unrolled_function'] = 'DenseResourceUnrolled'
+
         pointwise_mult_config = self.pointwise_mult_template.format(**mult_params)

         return (
@@ -425,6 +461,8 @@ def format(self, node):
             params['scale_index_width_type'] = 'scale_index_regular'

         params['config_t'] = f'config{node.index}_depthwise_mult'
+        # TODO - Extend unrolled Dense Resource
+        params['unrolled_function'] = 'DenseResourceUnrolled'
         depthwise_config = self.depthwise_template.format(**params)

         # Depthwise mult config
@@ -437,6 +475,8 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('depthwise').type.precision
         )
+        # TODO - Extend unrolled Dense Resource to depthwise Conv2D
+        mult_params['unrolled_function'] = 'DenseResourceUnrolled'
         depthwise_mult_config = self.depthwise_mult_template.format(**mult_params)

         # Pointwise config
@@ -474,6 +514,8 @@ def format(self, node):
         else:
             params['scale_index_width_type'] = 'scale_index_regular'
         params['config_t'] = f'config{node.index}_pointwise_mult'
+        # TODO - Extend unrolled Dense Resource
+        params['unrolled_function'] = 'DenseResourceUnrolled'
         pointwise_config = self.pointwise_template.format(**params)

         # Pointwise mult config
@@ -486,6 +528,8 @@ def format(self, node):
         mult_params['product_type'] = get_backend('vivado').product_type(
             node.get_input_variable().type.precision, node.get_weights('pointwise').type.precision
         )
+        # TODO - Extend unrolled Dense Resource to separable Conv2D
+        mult_params['unrolled_function'] = 'DenseResourceUnrolled'
         pointwise_mult_config = self.pointwise_mult_template.format(**mult_params)

         return (
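The branch repeated across these hunks reduces to a single mapping from strategy to kernel name. A standalone restatement for readability, not code from this commit; the returned strings are exactly those substituted for {dense_function} in the templates above:

    def select_dense_function(strategy: str, reuse_factor: int, n_in: int, node_index: int) -> str:
        """Pick the nnet:: kernel name substituted for {dense_function}."""
        strategy = strategy.lower()
        if strategy == 'latency':
            return 'DenseLatency'
        if strategy == 'resource':
            # Two resource variants, chosen by comparing the reuse factor with
            # the number of inputs; the third HLS variant (rf > n_in with a
            # remainder) is never selected here.
            if reuse_factor <= n_in:
                return 'DenseResource_rf_leq_nin'
            return 'DenseResource_rf_gt_nin_rem0'
        if strategy == 'resource_unrolled':
            # Code-generated kernel, unique per layer via the node index
            return f'dense_resource_unrolled_{node_index}'
        raise ValueError(f'Unsupported strategy: {strategy}')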
hls4ml/backends/vivado/passes/core_templates.py (13 changes: 13 additions & 0 deletions)
@@ -19,6 +19,8 @@
     typedef {bias_t.name} bias_t;
     typedef {weight_t.name} weight_t;
     typedef {index_t.name} index_t;
+    template<class data_T, class res_T, class CONFIG_T>
+    using kernel = nnet::{dense_function}<data_T, res_T, CONFIG_T>;
     template<class x_T, class y_T>
     using product = nnet::product::{product_type}<x_T, y_T>;
 }};\n"""
@@ -41,6 +43,17 @@ def format(self, node):
             node.get_input_variable().type.precision, node.get_weights('weight').type.precision
         )

+        if node.get_attr('strategy').lower() == 'latency':
+            params['dense_function'] = 'DenseLatency'
+        elif node.get_attr('strategy').lower() == 'resource':
+            if int(params['reuse_factor']) <= int(params['n_in']):
+                params['dense_function'] = 'DenseResource_rf_leq_nin'
+            else:
+                params['dense_function'] = 'DenseResource_rf_gt_nin_rem0'
+            # The 3rd case is never used
+        elif node.get_attr('strategy').lower() == 'resource_unrolled':
+            params['dense_function'] = f'dense_resource_unrolled_{node.index}'
+
         return self.template.format(**params)
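
To see what the new parameter produces once the template is rendered, a toy example: the two template lines are verbatim from this diff, while the surrounding struct is a trimmed stand-in for the full config template, which this view does not show.

    dense_config_stub = """struct config{index} : nnet::dense_config {{
        template<class data_T, class res_T, class CONFIG_T>
        using kernel = nnet::{dense_function}<data_T, res_T, CONFIG_T>;
    }};
    """

    # A Dense node with index 2 and "Latency" strategy renders as:
    print(dense_config_stub.format(index=2, dense_function='DenseLatency'))
    # struct config2 : nnet::dense_config {
    #     template<class data_T, class res_T, class CONFIG_T>
    #     using kernel = nnet::DenseLatency<data_T, res_T, CONFIG_T>;
    # };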


Expand Down
(Diffs for the remaining 48 changed files are not rendered in this view.)
