Releases: intel/llvm
DPC++ daily 2022-03-08
[CODEOWNERS] Update with new names (#5743) * Added Tianfei for Release Notes * Added ESIMD reviewers for SYCLLowerIR/CMakeLists.txt
oneAPI DPC++ Compiler 2021-12
New features
SYCL Compiler
- Added support for `-fgpu-inline-threshold`, which allows controlling the inline threshold of the SYCL device code [5f7b607]
- Added experimental support for CUDA backend on Windows [8aa3513]
- Added support for experimental option `-fsycl-max-parallel-link-jobs=<N>`, which can be used to specify how many processes the compiler can use for linking the device code [c2221f0]
SYCL Library
- Added support for default context extension on Linux [315593d]
- Added experimental support for group sorting algorithm [932ae56]
- Added support for sub-group mask extension [78a3e77]
- Added `sycl::ext::intel::experimental::esimd::simd_mask` as a replacement for `sycl::ext::intel::experimental::esimd::mask_type_t` to represent Gen predicates [01351f1]
- Added stripped PDBs for SYCL libraries [6e5dd48]
- Added support for ESIMD emulator backend [f4ad3c1]
- Added support for FPGA DSP control extension [790aa8b]
- Implemented discard_events extension [9542e28]
- Extended XPTI notifications with information about SYCL memory management [a068b15] [8f9d0d2]
- Added support for `sycl::device::info::ext_intel_gpu_hw_threads_per_eu` [2e798df]
- Added `sycl::ext::oneapi::no_offset` property for `sycl::ext::oneapi::accessor_property_list` [308e5ad]
- Implemented SYCL 2020 property traits [3b17da6]
- Implemented MAX_WORK_GROUP_QUERY extension [2fdf940]
- Added function wrapper for function-level loop fusion attribute (`[[intel::loop_fuse(N)]]` and `[[intel::loop_fuse_independent(N)]]`) for FPGA [e8ac5a0]
- Added querying number of registers for kernel with `ext_codeplay_num_regs` kernel info query [97d33b7]
- Added backend macros `SYCL_BACKEND_OPENCL` and `SYCL_EXT_ONEAPI_*` for CUDA, Level Zero, ESIMD emulator, HIP [2b0ebab]
- Added support for `sycl::ext::intel::experimental::esimd_ballot` function [0bbb091]
- Added initial support for Tensorcore matrix extension [711ba58]
Documentation
- Added device global extension specification [d3e70d4]
- Added property list extension specification [a7da8b4]
- Added extension specification for discard queue events [23ca24b]
- Added KernelProperties extension [64f5e70]
Improvements
SYCL Compiler
- Added diagnostics on attempt to pass an incorrect value to `-fsycl-device-code-split` [631fd69]
- Improved output of `-fsycl-help` [2404d02]
- Allowed `::printf` builtin for CUDA backend only [0c55d3a]
- Implemented `nextafter` for `sycl::half` on CUDA backend [53c3268]
- Added atomics with scopes and memory orders for CUDA backend [2ebde5f] [00f43b3]
- Added support for missing mathematical builtins in CUDA backend [789ec8b] [f074774] [390e105]
- Added diagnostic for non-forward declarable kernel name types [653bae9]
- Added `group_ballot` intrinsic for CUDA backend [0680e5c]
- Added support for device side `assert` for CUDA backend [5a87b8c]
- Turned on `-fsycl-dead-args-optimization` by default [5983dfd]
- Improved compilation time by removing free function queries calls detection [e4791d1]
- Reduced memory consumption of device code linking [6266820]
- Improved UX of `sycl::ext::oneapi::experimental::printf` by allowing format string to reside in a non-constant address space [2d62e51]
- Improved barrier and sync instructions to use full mask when targeting NVPTX [5ce99b8]
- Added match for default SPIR device architecture with host architecture, i.e. `x86_64` matches `spir64` and `i686` matches `spir` [f4d01cd]
- Set default device code split mode to off for FPGA [bea72e6]
- Improved diagnostic for invalid SYCL kernel names [455dce8] [df1ff7a]
- Made `-Xsycl-target-frontend=` accept device triple aliases [7fa0569]
- Improved diagnostic messages for `-fsycl-libspirv-path` [c54c605]
- Made implied default device force emulation for FPGA builds [074944e]
- Added support for `sycl::ext::oneapi::sub_group::get_local_id` for HIP backend [7a9335d]
- Added a diagnostic of indirect implicit capture of `this` for kernel lambda [dce4c6a]
SYCL Library
- Updated joint matrix queries to report if unsigned int variants of mad matrix instruction are supported [dd7ebce]
- Reduced overhead of device code assert implementation [b94f23a] [58ac74e]
- Added a diagnostic on attempt to call `sycl::get_kernel_id` with an invalid kernel [9dd1ea3]
- Reduced overhead on kernel submission for CUDA backend [b79ae69]
- Reduced overhead on kernel submission in backend independent part of runtime [e292aa5]
- Aligned Level-Zero Interoperability API with SYCL 2020 specification [dd7f82c] [e662166]
- Made `sycl::half` default constructor constexpr [d32a444]
- Changed CUDA and HIP backends to report each device in a separate platform [8dddb11]
- Added initial support for SYCL 2020 exceptions [15e0ab1]
- Added SYCL 2020 `sycl::target::device` enumeration value [f710886]
- Added a diagnostic on attempt to print `std::byte` using `sycl::stream` [dd5e094]
- Added possibility to specify ownership of `ze_module_handle_t` when creating a `sycl::kernel_bundle` from it [e3c9c92]
- Improved performance of `sycl::nd_item::get_group_range()` [0cd7b7e]
- Deprecated `sycl::target::global_buffer`
- Made `device_num` which can be passed to `SYCL_DEVICE_FILTER` unique [7aa5be0]
- Added a diagnostic on using mutually exclusive `sycl::handler` methods [6f620a4]
- Added support for `std::byte` to `sycl::vec` class [8fa04fe]
- Added `sycl::make_kernel` interoperability support for Level-Zero backend [98896fd]
- Optimized work with events in the Level Zero backend [973aee9]
- Added support for `sycl::ext::oneapi::experimental::matrix::wi_slice` and `sycl::ext::oneapi::experimental::matrix::joint_matrix_fill` [97127eb] [cbad428]
- Enabled code location information when `NDEBUG` is not defined in XPTI notifications [e9f2d64] [9ca7cea]
- Added a diagnostic on attempt to pass a command group function object to `sycl::queue::single_task` [2614d4d]
- Enlarged the maximum batch size to 64 for Level Zero backend to improve performance [596f693]
- Reduced kernel submission overhead for CUDA backend [35729a7]
- Improved translation of Level Zero error codes [6699a5d], [5d9a04b]
- Added support for an arbitrary number of elements to `sycl::ext::intel::experimental::esimd::simd::copy_from/to` methods [2bdc4c4]
- Added HIP support to `sycl::ext::oneapi::filter_selector` [7224cb2], [b7cee06]
- Added support for batching copy commands for Level Zero backend [4c3e699]
- Reduced `sycl::queue::submit` overhead by enabling post-enqueue execution graph cleanup [6fd6098]
- Added support for classes implicitly converted from `sycl::item` in `sycl::handler::parallel_for` parameter to align with the SYCL 2020 specification [34b93bf]
- Removed direct initialization constructor from `sycl::ext::intel::experimental::bfloat16` class [81154ec]
- Added `sycl::vec` and `sycl::marray` support to `sycl::known_identity` type trait [8fefb25]
- Added minimal support for the generic address space to match `sycl::atomic_ref` class definition in specification [e99f298]
- Improved cache of command-lists in the context to be per-device for Level Zero backend [ca457d9]
- Extended group algorithms to support broadened types [3205368]
- Added support for alignment flags in `sycl::ext::intel::experimental::esimd::simd::copy_from/copy_to` operations [27f5c12]
- Made `sycl::ext::oneapi::atomic_ref` available in `sycl` namespace [2cdcbed]
- Renamed `cuda` and `hip` enum values and namespaces to `ext_oneapi_cuda` and `ext_oneapi_hip` to align with SYCL 2020 specification [97f916e]
- Improved performance of kernel submission process [535ad1e]
- Eliminated build of unwanted kernels when creating one with `make_kernel` [53ea8b9]
- Removed duplicate devices on submission to `kernel_bundle` API functions [c222497]
- Deprecated `sycl::aspects::int64_base_atomics` and `sycl::aspects::int64_extended_atomics` [554b79c]
- Made backend specific headers be included implicitly [bc8a00a]
- Removed program class and related API [e7cc7b0]
- Excluded current working directory from DLL search path when looking for runtime dependencies [0a65cb4]
- Enabled persistent device code cache for kernel bundles [810d67a]
- R...
DPC++ daily 2022-03-07
[BuildBot] Uplift GPU RT version for Linux to 22.09.22577 (#5742) Signed-off-by: bb-sycl <[email protected]>
DPC++ daily 2022-03-05
sycl-nightly/20220305 [ESIMD] Disable non-critical messages from VC backend (disable-finali…
DPC++ daily 2022-03-04
[SYCL][ESIMD] Disallow use of accessor::operator[] in ESIMD code (#5706) This operator is not supported in ESIMD context. Signed-off-by: Sergey Dmitriev <[email protected]>
DPC++ daily 2022-03-03
sycl-nightly/20220303 [SYCL][FPGA] Expose value_type and min_capacity from SYCL pipes exten…
DPC++ daily 2022-03-02
[CI] Temporarily disable OCL FPGA Emulator (#5707) FPGA emulator causes crashes on SYCL runtime initialization.
DPC++ daily 2022-03-01
[SYCL] Fix host device local accessor alignment (#5554) Local kernel arguments must be aligned to the type size, simply using `std::vector<char>` doesn't always provide the correct alignment. So this patch adds extra padding to the vector and ensures that the pointer returned for the accessor is actually aligned to the type size. This issue was exposed by: https://github.com/intel/llvm-test-suite/pull/608, which was a follow up to fixing local accessor alignment for the CUDA plugin.
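The padding approach described in this commit message can be illustrated with a minimal standalone sketch. It is not the runtime's actual code: `AlignedLocalBuffer` and `is_aligned` are hypothetical names introduced here, and the sketch assumes the required alignment is a power of two. It over-allocates a `std::vector<char>` by one alignment unit and uses `std::align` to find an aligned pointer inside it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical helper, for illustration only: byte storage for a host-side
// local argument, padded so that an aligned pointer always fits inside it.
class AlignedLocalBuffer {
public:
  // bytes: usable size requested.
  // alignment: required alignment; assumed to be a power of two.
  AlignedLocalBuffer(std::size_t bytes, std::size_t alignment)
      : storage_(bytes + alignment) { // extra padding leaves room to shift
    void *p = storage_.data();
    std::size_t space = storage_.size();
    // std::align moves p forward to the next multiple of `alignment` and
    // returns nullptr if `bytes` would no longer fit in the remaining space.
    aligned_ = std::align(alignment, bytes, p, space);
    assert(aligned_ != nullptr && "padding should always be sufficient");
  }

  // The stored pointer refers into storage_, so copying is disabled.
  AlignedLocalBuffer(const AlignedLocalBuffer &) = delete;
  AlignedLocalBuffer &operator=(const AlignedLocalBuffer &) = delete;

  // Pointer to the aligned region of at least `bytes` bytes.
  void *data() const { return aligned_; }

private:
  std::vector<char> storage_;
  void *aligned_ = nullptr;
};

// Returns true if p is a multiple of `alignment` (non-zero).
inline bool is_aligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Constructing `AlignedLocalBuffer buf(64, 16)` and passing `buf.data()` where a 16-byte aligned pointer is expected shows the idea: the vector alone gives no such guarantee, while the padded buffer does.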
DPC++ daily 2022-02-28
sycl-nightly/20220228 [SYCL][NFC] Fix warnings after compile-time properties implementation…
DPC++ daily 2022-02-27
sycl-nightly/20220227 [SPIR-V] Add SPV_INTEL_long_composites (#2848)