From 1b59fd8800e4d16c68820e78ccae303346a946ea Mon Sep 17 00:00:00 2001
From: fairywreath
Date: Thu, 30 Jan 2025 21:35:23 -0500
Subject: [PATCH] update proposal

---
 ...8-floating-point-packed-data-intrinsics.md |  83 --------------
 docs/proposals/018-packed-data-intrinsics.md  | 103 ++++++++++++++++++
 2 files changed, 103 insertions(+), 83 deletions(-)
 delete mode 100644 docs/proposals/018-floating-point-packed-data-intrinsics.md
 create mode 100644 docs/proposals/018-packed-data-intrinsics.md

diff --git a/docs/proposals/018-floating-point-packed-data-intrinsics.md b/docs/proposals/018-floating-point-packed-data-intrinsics.md
deleted file mode 100644
index a61343f839..0000000000
--- a/docs/proposals/018-floating-point-packed-data-intrinsics.md
+++ /dev/null
@@ -1,83 +0,0 @@
-SP #018: Add floating-point pack/unpack intrinsics
-=================
-
-Adds intrinsics for converting vector floating-point data to and from packed unsigned integer values.
-
-## Status
-
-Status: Design Review.
-
-Implementation: N/A
-
-Author: Darren Wihandi and Slang community
-
-Reviewer:
-
-## Background
-
-Floating-point pack and unpack functions provide great utility and exist as built-in intrinsics on GLSL, SPIRV, Metal and WGSL but not on HLSL.
-Since Slang's core mudle is derived from HLSL, floating-point pack/unpack intrinsics are not defined and there is no way to access the intrinsics
-provided by the other shader language targets.
-
-## Proposed Approach
-
-Slang's core module, which derives from HLSL, already defines integer pack/unpack intrinsics which were introduced in SM 6.6. Some examples are
-```
-uint32_t4 unpack_u8u32(uint8_t4_packed packedVal);
-uint8_t4_packed pack_u8(uint32_t4 unpackedVal);
-```
-
-We propose to add more intrinsics, with similar syntax, to cover floating-points. Different variants exist to cover conversion between unorm, snorm, and
-unormalized/standard IEEE 754 floats and vectors of 16-bit and 32-bit floats.
-
-A set of unpack intrinsics are added to decompose a 32-bit integer of packed 8-bit or 16-bit float chunks and reinterprets them
-as a vector of unorm, snorm or standard floats and halfs. Each 8-bit or 16-bit chunk is converted to either a normalized float
-through a conversion rule or to a standard IEEE-754 floating-point.
-```
-float4 unpack_unorm_f8f32(uint packedVal);
-half4 unpack_unorm_f8f16(uint packedVal);
-
-float4 unpack_snorm_f8f32(uint packedVal);
-half4 unpack_snorm_f8f16(uint packedVal);
-
-float2 unpack_unorm_f16f32(uint packedVal);
-half2 unpack_unorm_f16f16(uint packedVal);
-
-float2 unpack_snorm_f16f32(uint packedVal);
-half2 unpack_snorm_f16f16(uint packedVal);
-
-float2 unpack_half_f16f32(uint packedVal);
-half4 unpack_half_f16f16(uint packedVal);
-```
-
-A set of pack intrinsics are added to pack a vector of unorm, snorm or standard floats and halfs to a 32-bit integer of packed 8-bit or 16-bit float chunks.
-Each vector element is converted to an 8-bit or 16-bit integer chunk through conversion rules, then packed into one 32-bit integer value.
-```
-uint pack_unorm_f8(float4 unpackedVal);
-uint pack_unorm_f8(half4 unpackedVal);
-
-uint pack_snorm_f8(float4 unpackedVal);
-uint pack_snorm_f8(half4 unpackedVal);
-
-uint pack_unorm_f16(float2 unpackedVal);
-uint pack_unorm_f16(half2 unpackedVal);
-
-uint pack_snorm_f16(float2 unpackedVal);
-uint pack_snorm_f16(half2 unpackedVal);
-
-uint pack_half_f16(float2 unpackedVal);
-uint pack_half_f16(half2 unpackedVal);
-```
-
-### Normalized float conversion rules
-Normalized float conversion rules are standard across GLSL/SPIRV, Metal and WGSL. Slang follows these standards. Details of the conversion rules for each target can be found in:
-- Section 8.4, `Floating-Point Pack and Unpack Functions`, of the GLSL language specification, which is also used by the SPIR-V extended instrucsion for GLSL.
-- Section 7.7.1, `Conversion Rules for Normalized Integer Pixel Data Types`, of the Metal Shading language specification.
-- Sections [16.9 Data Packing Built-in Functions](https://www.w3.org/TR/WGSL/#pack-builtin-functions) and [16.10 Data Unpacking Built-in Functions](https://www.w3.org/TR/WGSL/#unpack-builtin-functions) of the WebGPU Shading language specification.
-
-### Built-in packed Datatypes
-Unlike HLSL's implementation with introduces new packed datatypes, `uint8_t4_packed` and `int8_t4_packed`, unsigned 32-bit integers are used directly
-and no new pakced datatypes are introduced.
-
-### Targets without built-in intrinsics
-For targets without built-in intrinsics, the implementation is done manually through code with a combination of arithmetic and bitwise operations.
diff --git a/docs/proposals/018-packed-data-intrinsics.md b/docs/proposals/018-packed-data-intrinsics.md
new file mode 100644
index 0000000000..0b22672862
--- /dev/null
+++ b/docs/proposals/018-packed-data-intrinsics.md
@@ -0,0 +1,103 @@
+SP #018: Data pack/unpack intrinsics
+====================================
+
+Adds intrinsics for converting numeric vector data to and from packed unsigned integer values.
+
+## Status
+
+Status: Design Review.
+
+Implementation: N/A
+
+Author: Darren Wihandi and Slang community.
+
+Reviewer: Yong He.
+
+## Background
+
+Data packing/unpacking intrinsics provide great utility. Slang's core module, which derives from HLSL, already defines integer pack/unpack intrinsics which were
+introduced in SM 6.6. Floating-point variants, however, are undefined. Floating-point pack/unpack intrinsics exist as built-in intrinsics in GLSL, SPIRV, Metal
+and WGSL but not in HLSL or Slang, and there is no way to access the intrinsics provided by the other shader language targets.
+
+## Proposed Approach
+
+We propose to add new packed-data intrinsics to cover both floating-point and integer data.
Although the HLSL integer intrinsics are already implemented, integer variants are
+also added to obtain independence from the HLSL specs and syntax. For floating-point processing, different variants exist that handle conversion between unorm, snorm, and
+unnormalized/standard IEEE 754 floats and vectors of 16-bit and 32-bit floats.
+
+A set of unpack intrinsics are added to decompose a 32-bit integer of packed 8-bit or 16-bit float chunks and reinterpret them
+as a vector of unorm, snorm or standard floats and halfs. Each 8-bit or 16-bit chunk is converted to either a normalized float
+through a conversion rule or to a standard IEEE-754 floating-point value.
+```
+float4 unpackUnorm4x8ToFloat(uint packedVal);
+half4 unpackUnorm4x8ToHalf(uint packedVal);
+
+float4 unpackSnorm4x8ToFloat(uint packedVal);
+half4 unpackSnorm4x8ToHalf(uint packedVal);
+
+float2 unpackUnorm2x16ToFloat(uint packedVal);
+half2 unpackUnorm2x16ToHalf(uint packedVal);
+
+float2 unpackSnorm2x16ToFloat(uint packedVal);
+half2 unpackSnorm2x16ToHalf(uint packedVal);
+
+float2 unpackHalf2x16ToFloat(uint packedVal);
+half2 unpackHalf2x16ToHalf(uint packedVal);
+```
+A set of pack intrinsics are added to pack a vector of unorm, snorm or standard floats and halfs to a 32-bit integer of packed 8-bit or 16-bit float chunks.
+Each vector element is converted to an 8-bit or 16-bit integer chunk through conversion rules, then packed into one 32-bit integer value.
+```
+uint packUnorm4x8(float4 unpackedVal);
+uint packUnorm4x8(half4 unpackedVal);
+
+uint packSnorm4x8(float4 unpackedVal);
+uint packSnorm4x8(half4 unpackedVal);
+
+uint packUnorm2x16(float2 unpackedVal);
+uint packUnorm2x16(half2 unpackedVal);
+
+uint packSnorm2x16(float2 unpackedVal);
+uint packSnorm2x16(half2 unpackedVal);
+
+uint packHalf2x16(float2 unpackedVal);
+uint packHalf2x16(half2 unpackedVal);
+```
+
+A set of unpack intrinsics are added to decompose a 32-bit integer containing four packed 8-bit signed or unsigned integer values and reinterpret them as vectors of 16-bit or 32-bit integers.
+These intrinsics support sign extension for signed integers and zero extension for unsigned integers.
+```
+uint32_t4 unpackUint4x8ToUint32(uint packedVal);
+uint16_t4 unpackUint4x8ToUint16(uint packedVal);
+
+int32_t4 unpackInt4x8ToInt32(uint packedVal);
+int16_t4 unpackInt4x8ToInt16(uint packedVal);
+```
+
+A set of pack intrinsics are added to convert a vector of 16-bit or 32-bit signed or unsigned integers into a 32-bit packed representation,
+storing only the lower 8 bits of each value. Clamped variants clamp each 8-bit value to `[0, 255]` for unsigned values and `[-128, 127]` for signed values.
+```
+uint packUint4x8(uint32_t4 unpackedVal);
+uint packUint4x8(uint16_t4 unpackedVal);
+
+uint packInt4x8(int32_t4 unpackedVal);
+uint packInt4x8(int16_t4 unpackedVal);
+
+uint packUint4x8Clamp(int32_t4 unpackedVal);
+uint packUint4x8Clamp(int16_t4 unpackedVal);
+
+uint packInt4x8Clamp(int32_t4 unpackedVal);
+uint packInt4x8Clamp(int16_t4 unpackedVal);
+```
+
+### Normalized float conversion rules
+Normalized float conversion rules are standard across GLSL/SPIRV, Metal and WGSL. Slang follows these standards. Details of the conversion rules for each target can be found in:
+- Section 8.4, `Floating-Point Pack and Unpack Functions`, of the GLSL language specification, which is also used by the SPIR-V extended instruction set for GLSL.
+- Section 7.7.1, `Conversion Rules for Normalized Integer Pixel Data Types`, of the Metal Shading language specification.
+- Sections [16.9 Data Packing Built-in Functions](https://www.w3.org/TR/WGSL/#pack-builtin-functions) and [16.10 Data Unpacking Built-in Functions](https://www.w3.org/TR/WGSL/#unpack-builtin-functions) of the WebGPU Shading language specification.
+
+### Built-in packed Datatypes
+Unlike HLSL's implementation, which introduces new packed datatypes, `uint8_t4_packed` and `int8_t4_packed`, unsigned 32-bit integers are used directly
+and no new packed datatypes are introduced.
+
+### Targets without built-in intrinsics
+For targets without built-in intrinsics, the implementation is done manually through code with a combination of arithmetic and bitwise operations.