Releases: microsoft/antares
Antares v0.2.3
New updates:
- Detect Windows HLSL line endings (`\r\n`) in the c-hlsl_win64/c-hlsl_xbox backends;
- Refine extra overhead in JIT plugin computation for PyTorch;
- Add IPU/IPU2 evaluator for c-ipu backend;
- Enhance evaluation for the c-sycl_cuda backend;
Thanks for contributions from mzmssg, Michoumichmich.
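
As an illustration of the line-ending item in the v0.2.3 notes above, here is a minimal sketch of how Windows-style `\r\n` endings in shader source text could be detected and normalized. This is not Antares code: the function names `uses_crlf` and `normalize_newlines` are hypothetical, and the actual handling inside the c-hlsl_win64/c-hlsl_xbox backends may differ.

```python
# Hypothetical helpers, not part of Antares: a sketch of CRLF detection
# and normalization for generated shader source text.

def uses_crlf(source: str) -> bool:
    # True if the text contains at least one Windows-style CR+LF pair.
    return "\r\n" in source


def normalize_newlines(source: str) -> str:
    # Convert CR+LF pairs first, then any remaining lone CR, to LF.
    return source.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    # Made-up two-line snippet with Windows line endings.
    sample = "float4 a;\r\nfloat4 b;\r\n"
    print(uses_crlf(sample))
    print(normalize_newlines(sample).splitlines())
```

Normalizing before any line-based parsing keeps later steps independent of the platform that produced the source.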
Antares v0.2.2
New updates:
- HLSL: support native `erf` & `pow` operators.
- Add new backend `c-mcpu_android` for aarch64 CPUs.
- TF/PyTorch JIT Plugin: support extending AVX512 & SYCL kernels.
- Collective library: MPI Support for TF-Intel; NCCL/RCCL Support for TF-CUDA/TF-ROCM.
Antares v0.2.1
New updates:
- Change of Auto-scheduling: Enhanced auto-tuning search space for CPU (c-scpu/c-mcpu/c-mcpu_avx512) & IPU (renamed from c-gc to c-ipu).
- Change of JIT Tuning: The PyTorch/TensorFlow JIT Plugin now tunes locally, configured by setting `ANTARES_ROOT`. (Tuning over rest-server is no longer supported.)
- Change of Installation: Allow non-root users to install Antares components.
- Many other fixes.
Antares v0.2.0
New features:
- Supporting More Backends: e.g. SYCL for CPU, ROCm for Windows, OCL for Android, etc.
- Enhanced Tuning Mechanism OpEvo-2: A faster and more effective tuner than legacy Ansor.
- AB Backend interface for all hardware (e.g. ab::init, ab::launchKernel, ..)
- Enhanced Antares HLSL library using DXC-6.0; tuning efficiency is much improved.
- Initial support for inter-op tuning (small graphs only in this version; large-graph tuning will be supported in following releases).
Antares v0.1.0
This version is frozen to preserve legacy usage of Antares (intra-op optimizations, v0.1 API for DirectX 12).