[Backend][LLVM] Runtime support for any bitwidth integer numpy input #493

Open
wants to merge 15 commits into
base: main

zzzDavid
Collaborator

@zzzDavid zzzDavid commented Mar 15, 2023

Summary

Until this PR, integer arguments of the top-level function (inputs and outputs) were always typed as 64-bit integers, and casting to the declared bitwidth was done inside the function body. This was because numpy only provides 8-, 16-, 32-, and 64-bit integer types.

This PR extends hcl.Array and the LLVM runtime to support arbitrary-bitwidth input arguments passed as numpy arrays.

Methods

Byte-as-field numpy array

To store arbitrary-width integer data in numpy arrays, we use struct-type numpy arrays, with each byte as a field. Each integer scalar is therefore represented as a struct of bytes, and the bytes are contiguous in memory.
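As a rough illustration of the byte-as-field layout described above, the sketch below builds a structured dtype with one unsigned-byte field per byte. The helper name `byte_struct_dtype`, the field names `f0`, `f1`, ..., and the little-endian byte order are assumptions made for this example; they are not taken from the PR's `make_anybitwidth_numpy_array`.

```python
import numpy as np

def byte_struct_dtype(bitwidth):
    """Hypothetical helper: structured dtype with one unsigned byte per field."""
    n_bytes = (bitwidth + 7) // 8  # round up to whole bytes
    return np.dtype([(f"f{i}", "u1") for i in range(n_bytes)])

# A 20-bit integer needs 3 bytes, so each element is a 3-field struct.
dtype = byte_struct_dtype(20)
arr = np.zeros((2, 2), dtype=dtype)

# Store 0x0ABCDE in element [0, 1], least significant byte first
# (the byte order is an assumption of this sketch).
arr[0, 1] = (0xDE, 0xBC, 0x0A)

# The fields of one element sit next to each other in memory.
raw = arr.tobytes()
```

Because the dtype is unaligned by default, `dtype.itemsize` equals the number of byte fields, so the array buffer is a plain run of bytes with no padding between elements.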

Arbitrary data representation

When input data is wider than 64 bits, it cannot be represented as a numpy scalar type. Instead, we use multidimensional lists of Python integers to represent input tensors, because Python integers can have arbitrary bitwidth.
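A minimal sketch of how nested lists of Python integers could be turned into fixed-width byte strings and back. The function names, the little-endian byte order, and the two's-complement decoding step are assumptions for illustration, not the PR's actual code.

```python
def int_to_bytes(value, bitwidth):
    """Truncate value to bitwidth bits and return its little-endian bytes."""
    n_bytes = (bitwidth + 7) // 8
    mask = (1 << bitwidth) - 1
    # Masking maps negative values to their two's-complement bit pattern.
    return (value & mask).to_bytes(n_bytes, "little")

def encode_nested(data, bitwidth):
    """Recursively encode a multidimensional list of Python integers."""
    if isinstance(data, list):
        return [encode_nested(x, bitwidth) for x in data]
    return int_to_bytes(data, bitwidth)

def decode(raw, bitwidth, signed):
    """Read unsigned bytes back; sign-extend manually if the type is signed."""
    value = int.from_bytes(raw, "little")
    if signed and value >> (bitwidth - 1):
        value -= 1 << bitwidth
    return value

# A 2x2 tensor of 128-bit integers, wider than any numpy scalar type.
tensor = [[1 << 100, -1], [0, 12345]]
encoded = encode_nested(tensor, 128)
roundtrip = [[decode(b, 128, signed=True) for b in row] for row in encoded]
```

Storing every byte as unsigned and applying the sign only when decoding keeps the encoding identical for signed and unsigned types.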

MLIR arbitrary bitwidth integer alignment

When passing data from numpy to MLIR's ExecutionEngine as input arguments, we create C structs from numpy ndarrays with Python's ctypes module. Through a series of experiments, I found that the required alignment of such a C struct is not byte-level; instead, it depends on the integer bitwidth:

Integer bitwidth (bits)    Required alignment (bits)
(0, 8]                     8
(8, 16]                    16
(16, 32]                   32
(32, 64]                   64
(64, 128]                  128
(128, 256]                 256
(256, 512]                 512
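
The table follows a simple pattern: the alignment is the bitwidth rounded up to the next power of two, with a minimum of 8. A small sketch of that rule, where `alignment_bits` and `padded_bytes` are hypothetical names and widths above 512 are rejected because the table does not cover them:

```python
def alignment_bits(bitwidth):
    """Alignment in bits for an integer type, per the table above."""
    if not 0 < bitwidth <= 512:
        raise ValueError("table only covers bitwidths in (0, 512]")
    align = 8
    while align < bitwidth:
        align *= 2  # next power of two
    return align

def padded_bytes(bitwidth):
    """Bytes occupied by one element once padded to its alignment."""
    return alignment_bits(bitwidth) // 8

# Example: a 20-bit integer holds 3 bytes of data but is aligned to 32 bits.
align_20 = alignment_bits(20)
size_20 = padded_bytes(20)
```

Under this rule, a byte-as-field element would need padding bytes whenever its bitwidth does not already fill the full alignment.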

Changes

  • make_anybitwidth_numpy_array is moved from ir_builder.py to utils.py
  • All field formats in the struct numpy array are set to unsigned. This makes sign extension at runtime easier to implement, and it does not affect the creation of DenseAttr in the constant tensor op's IRBuilder function.
  • hcl.Array.np_array is refactored and extended to support any bitwidth data

Limitations

  • This PR only upgrades the Int and UInt types. Fixed/UFixed types are not covered, because the fixed-to-integer pass needs to be updated in the IR first. Support for fixed-point types will be added in another PR.

Member

@chhzh123 chhzh123 left a comment

LGTM. Do you have other things to add?

@zzzDavid
Collaborator Author

ExecutionEngine randomly produces wrong results for bitwidths 513-1024; I'm still debugging this issue. Will update you once it's solved.
