From f92ba1b93c2958c9ae9de156031bbbdc9a8c1c45 Mon Sep 17 00:00:00 2001 From: Tal Zussman <32444106+tzussman@users.noreply.github.com> Date: Fri, 20 Sep 2024 11:24:30 -0400 Subject: [PATCH 1/8] [SYSVABI64] Fix formatting of Change History table This change renders the last row of the table correctly --- sysvabi64/sysvabi64.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sysvabi64/sysvabi64.rst b/sysvabi64/sysvabi64.rst index bf1315f..f4eaa74 100644 --- a/sysvabi64/sysvabi64.rst +++ b/sysvabi64/sysvabi64.rst @@ -209,7 +209,7 @@ Change History | | | - In `Dynamic Section Tags`_, reserve tags | | | | used by `PAuthABIELF64`_ and | | | | `MemTagABIELF64`_. | - +---------------+--------------------+--------------------------------------------------------------+ + +------------+------------------------------+-------------------------------------------------------+ References ---------- From aa8733f11076d81889d3c8adcebbcfd833b2030f Mon Sep 17 00:00:00 2001 From: Tal Zussman <32444106+tzussman@users.noreply.github.com> Date: Fri, 20 Sep 2024 11:28:45 -0400 Subject: [PATCH 2/8] Add sysvabi64 to README --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 973d82a..2ed5055 100644 --- a/README.md +++ b/README.md @@ -72,6 +72,7 @@ DWARF for the Arm 64-bit Architecture | [aadwarf64] C++ ABI for the Arm 64-bit Architecture | [cppabi64](cppabi64/cppabi64.rst) | [2020Q2](legacy-documents/cppabi64/ihi0059_E/IHI0059E_2020Q2_cppabi64.pdf) Vector Function ABI for the Arm 64-bit Architecture | [vfabia64](vfabia64/vfabia64.rst) | [2019Q2](legacy-documents/vfabia64/101129_1920/101129_1920_01_en.pdf) C/C++ Atomics ABI for the Arm 64-bit Architecture | [atomicsabi64](atomicsabi64/atomicsabi64.rst) | n/a +System V ABI for the Arm 64-bit Architecture | [sysvabi64](sysvabi64/sysvabi64.rst) | n/a ### ABI for the Arm 64-bit Architecture with SVE support From 853286c7ab66048e4b819682ce17f567b77a0291 Mon Sep 17 00:00:00 2001 From: Ties 
Stuij Date: Wed, 25 Sep 2024 10:47:58 +0100 Subject: [PATCH 3/8] Bumping github actions versions - update checkout to v4, v2 is using a node that will soon be deprecated - upload-artifact to v4, v2 has become deprecated - update setup-python to v5, just to stay up-to-date --- .github/workflows/ci.yml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 3b327c5..4c5381e 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -7,9 +7,9 @@ jobs: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v2 + - uses: actions/checkout@v4 - name: setup python - uses: actions/setup-python@v2 + uses: actions/setup-python@v5 with: python-version: '3.x' - name: install packages @@ -18,7 +18,7 @@ jobs: run: tools/common/check-rst-syntax.sh - name: build PDFs run: tools/rst2pdf/generate-pdfs.sh PDFs - - uses: actions/upload-artifact@v2 + - uses: actions/upload-artifact@v4 with: name: PDFs path: PDFs From b7a53fd71a2940273dd25987c376f38f996bfc75 Mon Sep 17 00:00:00 2001 From: Matthias Rosenfelder Date: Wed, 18 Sep 2024 00:18:29 +0200 Subject: [PATCH 4/8] [AAPCS64] Fix typos The table layout for Table 7 will be fixed in the next commit. Signed-off-by: Matthias Rosenfelder --- aapcs64/aapcs64.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index b5bd3fd..1e299d2 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -905,7 +905,7 @@ Floating-Point register v0. z0-z7 are used to pass scalable vector arguments to a subroutine, and to return scalable vector results from a function. If a subroutine takes at least one argument in scalable vector registers or scalable -predicate registers, or returns results in such regisers, the +predicate registers, or returns results in such registers, the subroutine must ensure that the entire contents of z8-z23 are preserved across the call. 
In other cases it need only preserve the low 64 bits of z8-z15, as described in `SIMD and Floating-Point @@ -2852,7 +2852,7 @@ setjmp and longjmp The C subroutines ``setjmp`` and ``longjmp`` have a `private-ZA`_ `non-streaming interface`_. In addition to the standard requirements for such an interface, there is an additional requirement that applies -specificially to ``setjmp`` and ``longjmp``: +specifically to ``setjmp`` and ``longjmp``: * ZA must be in the “off” state when ``setjmp`` returns to its caller via a ``longjmp``. @@ -3016,9 +3016,9 @@ The header file ``arm_neon.h`` also defines a number of intrinsic functions that +-----------------+-------------------+--------------------------+-----------+ | __Poly64x2\_t | poly64x2\_t | unsigned double-word | 2 | +-----------------+-------------------+--------------------------+-----------+ - | __Bfloat16x4\_t | bfloat16x4\_t | half-precison Brain float| 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Bfloat16x8\_t | bfloat16x8\_t | half-precison Brain float| 8 | + | __Bfloat16x4\_t | bfloat16x4\_t | half-precision Brain float| 4 | + +-----------------+-------------------+---------------------------+-----------+ + | __Bfloat16x8\_t | bfloat16x8\_t | half-precision Brain float| 8 | +-----------------+-------------------+--------------------------+-----------+ APPENDIX Support for Scalable vectors From a22eb564532b75b221cb737286cfb687300ea8ac Mon Sep 17 00:00:00 2001 From: Matthias Rosenfelder Date: Wed, 18 Sep 2024 00:21:34 +0200 Subject: [PATCH 5/8] [AAPCS64] Fix table layout Fix table layout that has been corrupted by fixing typos in a previous commit. I tried to avoid enlarging the total width of the table. Not sure if that matters - maybe for some PDF export later. No content change. 
Signed-off-by: Matthias Rosenfelder --- aapcs64/aapcs64.rst | 110 ++++++++++++++++++++++---------------------- 1 file changed, 55 insertions(+), 55 deletions(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 1e299d2..ca3a879 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -2965,61 +2965,61 @@ The header file ``arm_neon.h`` also defines a number of intrinsic functions that .. table:: Table 7: Short vector extended types - +-----------------+-------------------+--------------------------+-----------+ - | Internal type | arm\_neon.h type | Base Type | Elements | - +=================+===================+==========================+===========+ - | __Int8x8\_t | int8x8\_t | signed byte | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Int16x4\_t | int16x4\_t | signed half-word | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Int32x2\_t | int32x2\_t | signed word | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint8x8\_t | uint8x8\_t | unsigned byte | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint16x4\_t | uint16x4\_t | unsigned half-word | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint32x2\_t | uint32x2\_t | unsigned word | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Float16x4\_t | float16x4\_t | half-precision float | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Float32x2\_t | float32x2\_t | single-precision float | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Poly8x8\_t | poly8x8\_t | unsigned byte | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Poly16x4\_t | poly16x4\_t | unsigned half-word | 4 | - 
+-----------------+-------------------+--------------------------+-----------+ - | __Int8x16\_t | int8x16\_t | signed byte | 16 | - +-----------------+-------------------+--------------------------+-----------+ - | __Int16x8\_t | int16x8\_t | signed half-word | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Int32x4\_t | int32x4\_t | signed word | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Int64x2\_t | int64x2\_t | signed double-word | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint8x16\_t | uint8x16\_t | unsigned byte | 16 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint16x8\_t | uint16x8\_t | unsigned half-word | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint32x4\_t | uint32x4\_t | unsigned word | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Uint64x2\_t | uint64x2\_t | unsigned double-word | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Float16x8\_t | float16x8\_t | half-precision float | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Float32x4\_t | float32x4\_t | single-precision float | 4 | - +-----------------+-------------------+--------------------------+-----------+ - | __Float64x2\_t | float64x2\_t | double-precision float | 2 | - +-----------------+-------------------+--------------------------+-----------+ - | __Poly8x16\_t | poly8x16\_t | unsigned byte | 16 | - +-----------------+-------------------+--------------------------+-----------+ - | __Poly16x8\_t | poly16x8\_t | unsigned half-word | 8 | - +-----------------+-------------------+--------------------------+-----------+ - | __Poly64x2\_t | poly64x2\_t | unsigned double-word | 2 | - 
+-----------------+-------------------+--------------------------+-----------+ - | __Bfloat16x4\_t | bfloat16x4\_t | half-precision Brain float| 4 | - +-----------------+-------------------+---------------------------+-----------+ - | __Bfloat16x8\_t | bfloat16x8\_t | half-precision Brain float| 8 | - +-----------------+-------------------+--------------------------+-----------+ + +-----------------+-------------------+---------------------------+----------+ + | Internal type | arm\_neon.h type | Base Type | Elements | + +=================+===================+===========================+==========+ + | __Int8x8\_t | int8x8\_t | signed byte | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Int16x4\_t | int16x4\_t | signed half-word | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Int32x2\_t | int32x2\_t | signed word | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint8x8\_t | uint8x8\_t | unsigned byte | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint16x4\_t | uint16x4\_t | unsigned half-word | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint32x2\_t | uint32x2\_t | unsigned word | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Float16x4\_t | float16x4\_t | half-precision float | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Float32x2\_t | float32x2\_t | single-precision float | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Poly8x8\_t | poly8x8\_t | unsigned byte | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Poly16x4\_t | poly16x4\_t | unsigned half-word | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Int8x16\_t | 
int8x16\_t | signed byte | 16 | + +-----------------+-------------------+---------------------------+----------+ + | __Int16x8\_t | int16x8\_t | signed half-word | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Int32x4\_t | int32x4\_t | signed word | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Int64x2\_t | int64x2\_t | signed double-word | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint8x16\_t | uint8x16\_t | unsigned byte | 16 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint16x8\_t | uint16x8\_t | unsigned half-word | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint32x4\_t | uint32x4\_t | unsigned word | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Uint64x2\_t | uint64x2\_t | unsigned double-word | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Float16x8\_t | float16x8\_t | half-precision float | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Float32x4\_t | float32x4\_t | single-precision float | 4 | + +-----------------+-------------------+---------------------------+----------+ + | __Float64x2\_t | float64x2\_t | double-precision float | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Poly8x16\_t | poly8x16\_t | unsigned byte | 16 | + +-----------------+-------------------+---------------------------+----------+ + | __Poly16x8\_t | poly16x8\_t | unsigned half-word | 8 | + +-----------------+-------------------+---------------------------+----------+ + | __Poly64x2\_t | poly64x2\_t | unsigned double-word | 2 | + +-----------------+-------------------+---------------------------+----------+ + | __Bfloat16x4\_t | bfloat16x4\_t | half-precision Brain float| 4 | + 
+-----------------+-------------------+---------------------------+----------+ + | __Bfloat16x8\_t | bfloat16x8\_t | half-precision Brain float| 8 | + +-----------------+-------------------+---------------------------+----------+ APPENDIX Support for Scalable vectors ===================================== From 190e725f0f0dc09e212d4c8c75e7ab05a98cd14c Mon Sep 17 00:00:00 2001 From: Ties Stuij Date: Wed, 4 Sep 2024 12:14:04 +0100 Subject: [PATCH 6/8] slightly tweak the release links generator script - removed references to developer.arm.com, as we have moved all the legacy docs here now - moved some docs into the 64-bit section --- tools/common/generate-release-links.sh | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/tools/common/generate-release-links.sh b/tools/common/generate-release-links.sh index 2774e78..e4721f3 100755 --- a/tools/common/generate-release-links.sh +++ b/tools/common/generate-release-links.sh @@ -57,10 +57,10 @@ cat < Date: Fri, 9 Sep 2022 17:20:54 +0100 Subject: [PATCH 7/8] [CONTRIBUTING] Add section on extension documents Some changes to the ABI will require experimentation before the content stabilises. Adding information to the main ABI is premature in this case. However, it is useful to document this information as an extension to the ABI to enable further collaboration. This section outlines the process for submitting an extension, what criteria Arm use for accepting a contribution and how an extension can move to the main ABI. --- CONTRIBUTING.md | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 3db4550..9fa92fd 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -43,6 +43,41 @@ changes. If you want to make ABI changes that for some reason can't be discussed in public, you can send an email to arm.eabi@arm.com. +### Extension documents +While the majority of new proposals can be added to existing +documents, 
proposals that extend the ABI but are not yet stable are +placed in an extension document. An example of an extension document +is the PAuth Extension to ELF for Arm 64-bit Architecture. Extension +documents have the following requirements: + +1. The document status must be Alpha. +2. The document has an owner recorded in the table below. The owner + need not be from Arm. +3. The document must not clash with other ABI extension documents, or + both extensions must be marked as being incompatible. + + +The Arm approval process for accepting the extension is as follows: + +1. At least one person within Arm has reviewed and accepted the pull + request. +2. There is a consensus within Arm that the extension can be added to + the ABI. + +Extension documents can move into the main ABI when the following conditions hold: + +1. The information in the document is stable. +2. There is an implementation of the extension. +3. The boundaries of when the extension applies are clear. + +An extension document that moves into the main ABI will add the +necessary information to the main documents. In addition, any design +and rationale in the extension document will be moved to a new +document in the design-documents folder. + +When the extension has either moved into the main ABI or has been +withdrawn, it will be moved to an archive folder. + ## Manual checking of the PDF documents and Continuous Integration To check the outcome of your changes, run the `tools/rst2pdf/generate-pdfs.sh` From 76d56124610302e645b66ac4e491be0c1a90ee11 Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Mon, 28 Oct 2024 14:13:22 +0000 Subject: [PATCH 8/8] [aapcs64] Describe the FPMR register and the FP8 types (#273) - Add a description of the FPMR register. The FPMR register is used to control operations with FP8 operands. It is expected that this register may need to be set more than once per function, hence it is considered a temporary register (clobbered by function calls). 
- Add descriptions of the modal 8-bit floating point types: scalar, short-vector, and scalable vector. The respective parameter passing rules effectively state that values of these types are passed to and returned from functions in FP/vector registers, which reflects the typical and anticipated place of such values in instruction operands. --- aapcs64/aapcs64.rst | 211 +++++++++++++++++++++++++------------------- 1 file changed, 122 insertions(+), 89 deletions(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index ca3a879..fa90f85 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -257,6 +257,11 @@ changes to the content of the document for that release. | | | - Add the __arm_get_current_vg SME support routine. | | | | - Clarify use of `it` when preserving z and p registers. | +------------+--------------------+------------------------------------------------------------------+ +| | | - Add descriptions of the modal 8-bit floating point types | +| | | - Add a description of the FPMR register | +| | | - Update argument passing rules to include FP8 types | +| | | | ++------------+--------------------+------------------------------------------------------------------+ References ^^^^^^^^^^ @@ -538,7 +543,9 @@ Fundamental Data Types | +---------------------------------------+------------+---------------------------+ | | | Signed quad-word | 16 | 16 | | +------------------------+---------------------------------------+------------+---------------------------+-----------------------------------------------+ - | Floating Point | Half precision | 2 | 2 | See `Half-precision Floating Point`_ | + | Floating Point | 8-bit precision | 1 | 1 | See `Modal 8-bit floating-point`_ | + | +---------------------------------------+------------+---------------------------+-----------------------------------------------+ + | | Half precision | 2 | 2 | See `Half-precision Floating Point`_ | | 
+---------------------------------------+------------+---------------------------+-----------------------------------------------+ | | Single precision | 4 | 4 | IEEE 754-2008 | | +---------------------------------------+------------+---------------------------+ | @@ -576,6 +583,18 @@ Fundamental Data Types +------------------------+---------------------------------------+------------+---------------------------+-----------------------------------------------+ +Modal 8-bit floating-point +------------------------------------ + +The architecture provides hardware support for modal 8-bit floating-point types. +Two formats are supported: + +1. E4M3, 4-bit exponent and 3-bit significand, with no representation for + infinities and only a single bit-pattern in the significand for NaNs. + +2. E5M2, 5-bit exponent and 2-bit significand, following IEEE 754 conventions + for representation of special values. + Half-precision Floating Point ----------------------------- @@ -892,6 +911,11 @@ thread-local storage on platforms where multi-threaded code is supported. The exact location of such information is platform specific. +**(Alpha)** + +The FPMR is a system register that controls behaviors of the instructions +operating on modal 8-bit floating-point values. It is a temporary register. + Scalable vector registers ^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -1884,10 +1908,10 @@ For a caller, sufficient stack space to hold stacked argument values is assumed | When an argument is assigned to a register any unused bits in the register have unspecified value. When an | | argument is assigned to a stack slot any unused padding bytes have unspecified value. 
| +-----------------------+----------------------------------------------------------------------------------------+ - | | If the argument is a Half-, Single-, Double- or Quad- precision Floating-point or | - | | short vector type and the NSRN is less than 8, then the argument is allocated to the | - | C.1 | least significant bits of register v[NSRN]. The NSRN is incremented by one. The | - | | argument has now been allocated. | + | | If the argument is an 8-bit, Half-, Single-, Double- or Quad- precision Floating-point | + | C.1 | or short vector type and the NSRN is less than 8, then the argument is allocated to | + | | the least significant bits of register v[NSRN]. The NSRN is incremented by one. | + | | The argument has now been allocated. | +-----------------------+----------------------------------------------------------------------------------------+ | | If the argument is an HFA or an HVA and there are sufficient unallocated SIMD and | | | Floating-point registers (NSRN + number of members ≤ 8), then the argument is | @@ -2570,6 +2594,9 @@ The mapping of C arithmetic types to Fundamental Data Types is shown in `Table 3 | | | significant bits of the type in a big-endian view. Non-significant | | | | bits within the last quad-word are unspecified. | +--------------------------------+-----------------------------------------+------------------------------------------------------------------------+ + | **(Alpha)** ``__mfp8`` | 8-bit floating point | Arm extension. Values are interpreted as either E5M2 or E4M3, | + | | | depending on processor mode. | + +--------------------------------+-----------------------------------------+------------------------------------------------------------------------+ A platform ABI may specify a different combination of primitive variants but we discourage this. @@ -2965,61 +2992,65 @@ The header file ``arm_neon.h`` also defines a number of intrinsic functions that .. 
table:: Table 7: Short vector extended types - +-----------------+-------------------+---------------------------+----------+ - | Internal type | arm\_neon.h type | Base Type | Elements | - +=================+===================+===========================+==========+ - | __Int8x8\_t | int8x8\_t | signed byte | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Int16x4\_t | int16x4\_t | signed half-word | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Int32x2\_t | int32x2\_t | signed word | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint8x8\_t | uint8x8\_t | unsigned byte | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint16x4\_t | uint16x4\_t | unsigned half-word | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint32x2\_t | uint32x2\_t | unsigned word | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Float16x4\_t | float16x4\_t | half-precision float | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Float32x2\_t | float32x2\_t | single-precision float | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Poly8x8\_t | poly8x8\_t | unsigned byte | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Poly16x4\_t | poly16x4\_t | unsigned half-word | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Int8x16\_t | int8x16\_t | signed byte | 16 | - +-----------------+-------------------+---------------------------+----------+ - | __Int16x8\_t | int16x8\_t | signed half-word | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Int32x4\_t | int32x4\_t | signed word | 4 | - 
+-----------------+-------------------+---------------------------+----------+ - | __Int64x2\_t | int64x2\_t | signed double-word | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint8x16\_t | uint8x16\_t | unsigned byte | 16 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint16x8\_t | uint16x8\_t | unsigned half-word | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint32x4\_t | uint32x4\_t | unsigned word | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Uint64x2\_t | uint64x2\_t | unsigned double-word | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Float16x8\_t | float16x8\_t | half-precision float | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Float32x4\_t | float32x4\_t | single-precision float | 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Float64x2\_t | float64x2\_t | double-precision float | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Poly8x16\_t | poly8x16\_t | unsigned byte | 16 | - +-----------------+-------------------+---------------------------+----------+ - | __Poly16x8\_t | poly16x8\_t | unsigned half-word | 8 | - +-----------------+-------------------+---------------------------+----------+ - | __Poly64x2\_t | poly64x2\_t | unsigned double-word | 2 | - +-----------------+-------------------+---------------------------+----------+ - | __Bfloat16x4\_t | bfloat16x4\_t | half-precision Brain float| 4 | - +-----------------+-------------------+---------------------------+----------+ - | __Bfloat16x8\_t | bfloat16x8\_t | half-precision Brain float| 8 | - +-----------------+-------------------+---------------------------+----------+ + 
+-----------------------------+-------------------+--------------------------+-----------+ + | Internal type | arm\_neon.h type | Base Type | Elements | + +=============================+===================+==========================+===========+ + | __Int8x8\_t | int8x8\_t | signed byte | 8 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Int16x4\_t | int16x4\_t | signed half-word | 4 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Int32x2\_t | int32x2\_t | signed word | 2 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Uint8x8\_t | uint8x8\_t | unsigned byte | 8 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Uint16x4\_t | uint16x4\_t | unsigned half-word | 4 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Uint32x2\_t | uint32x2\_t | unsigned word | 2 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Float16x4\_t | float16x4\_t | half-precision float | 4 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Float32x2\_t | float32x2\_t | single-precision float | 2 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Poly8x8\_t | poly8x8\_t | unsigned byte | 8 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Poly16x4\_t | poly16x4\_t | unsigned half-word | 4 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Int8x16\_t | int8x16\_t | signed byte | 16 | + +-----------------------------+-------------------+--------------------------+-----------+ + | __Int16x8\_t | int16x8\_t | signed half-word | 8 | + 
  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int32x4\_t                | int32x4\_t        | signed word              | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Int64x2\_t                | int64x2\_t        | signed double-word       | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint8x16\_t               | uint8x16\_t       | unsigned byte            | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint16x8\_t               | uint16x8\_t       | unsigned half-word       | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint32x4\_t               | uint32x4\_t       | unsigned word            | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Uint64x2\_t               | uint64x2\_t       | unsigned double-word     | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float16x8\_t              | float16x8\_t      | half-precision float     | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float32x4\_t              | float32x4\_t      | single-precision float   | 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Float64x2\_t              | float64x2\_t      | double-precision float   | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly8x16\_t               | poly8x16\_t       | unsigned byte            | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly16x8\_t               | poly16x8\_t       | unsigned half-word       | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Poly64x2\_t               | poly64x2\_t       | unsigned double-word     | 2         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Bfloat16x4\_t             | bfloat16x4\_t     |half-precision Brain float| 4         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | __Bfloat16x8\_t             | bfloat16x8\_t     |half-precision Brain float| 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | **(Alpha)** __Mfloat8x8\_t  | mfloat8x8\_t      | modal 8-bit float        | 8         |
+  +-----------------------------+-------------------+--------------------------+-----------+
+  | **(Alpha)** __Mfloat8x16\_t | mfloat8x16\_t     | modal 8-bit float        | 16        |
+  +-----------------------------+-------------------+--------------------------+-----------+
 
 APPENDIX Support for Scalable vectors
 =====================================
@@ -3054,35 +3085,37 @@ document.
 
 .. table:: Table 8: Scalable Vector Types and Scalable Predicate Types
 
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | Internal type       | ``arm_sve.h`` type    | Base type                                 | Elements       |
-  +=====================+=======================+===========================================+================+
-  | ``__SVInt8_t``      | ``svint8_t``          | signed byte                               | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint8_t``     | ``svuint8_t``         | unsigned byte                             | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt16_t``     | ``svint16_t``         | signed half-word                          | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint16_t``    | ``svuint16_t``        | unsigned half-word                        | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat16_t``   | ``svfloat16_t``       | half-precision float                      | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVBfloat16_t``  | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt32_t``     | ``svint32_t``         | signed word                               | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint32_t``    | ``svuint32_t``        | unsigned word                             | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat32_t``   | ``svfloat32_t``       | single-precision float                    | VG×2           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVInt64_t``     | ``svint64_t``         | signed double-word                        | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVUint64_t``    | ``svuint64_t``        | unsigned double-word                      | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVFloat64_t``   | ``svfloat64_t``       | double-precision float                    | VG             |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
-  | ``__SVBool_t``      | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
-  +---------------------+-----------------------+-------------------------------------------+----------------+
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | Internal type                  | ``arm_sve.h`` type    | Base type                                 | Elements       |
+  +================================+=======================+===========================================+================+
+  | ``__SVInt8_t``                 | ``svint8_t``          | signed byte                               | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint8_t``                | ``svuint8_t``         | unsigned byte                             | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt16_t``                | ``svint16_t``         | signed half-word                          | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint16_t``               | ``svuint16_t``        | unsigned half-word                        | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat16_t``              | ``svfloat16_t``       | half-precision float                      | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVBfloat16_t``             | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt32_t``                | ``svint32_t``         | signed word                               | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint32_t``               | ``svuint32_t``        | unsigned word                             | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat32_t``              | ``svfloat32_t``       | single-precision float                    | VG×2           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVInt64_t``                | ``svint64_t``         | signed double-word                        | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVUint64_t``               | ``svuint64_t``        | unsigned double-word                      | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVFloat64_t``              | ``svfloat64_t``       | double-precision float                    | VG             |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | ``__SVBool_t``                 | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
+  | **(Alpha)** ``__SVMfloat8_t``  | ``svmfloat8_t``       | modal 8-bit float                         | VG×8           |
+  +--------------------------------+-----------------------+-------------------------------------------+----------------+
 
 APPENDIX C++ mangling