@@ -429,7 +429,7 @@ use mixed or double precision for |Gromacs|. There is no need to
 compile FFTW with threading or MPI support, but it does no harm. On
 x86 hardware, compile with all of ``--enable-sse2``, ``--enable-avx``,
 and ``--enable-avx2`` flags. On Intel processors supporting
-512-wide AVX, including KNL, add ``--enable-avx512`` too.
+512-wide AVX, add ``--enable-avx512`` too.
 FFTW will create a fat library with codelets for all different instruction sets,
 and pick the fastest supported one at runtime.
 On ARM architectures with SIMD support use ``--enable-neon`` flag;
@@ -833,11 +833,10 @@ lead to performance loss, e.g. on Intel Skylake-X/SP and AMD Zen (first generati
 On AMD it is beneficial to use starting with Zen4.
 Additionally, with GPU accelerated runs ``AVX2_256`` can also be
 faster on high-end Skylake CPUs with both 512-bit FMA units enabled.
-9. ``AVX_512_KNL`` Knights Landing Xeon Phi processors.
-10. ``IBM_VSX`` Power7, Power8, Power9 and later have this.
-11. ``ARM_NEON_ASIMD`` 64-bit ARMv8 and later. For maximum performance on NVIDIA
+9. ``IBM_VSX`` Power7, Power8, Power9 and later have this.
+10. ``ARM_NEON_ASIMD`` 64-bit ARMv8 and later. For maximum performance on NVIDIA
 Grace (ARMv9), we strongly suggest at least GNU >= 13, LLVM >= 16.
-12. ``ARM_SVE`` 64-bit ARMv8 and later with the Scalable Vector Extensions (SVE).
+11. ``ARM_SVE`` 64-bit ARMv8 and later with the Scalable Vector Extensions (SVE).
 The SVE vector length is fixed at CMake configure time. The default vector
 length is automatically detected, and this can be changed via the
 ``GMX_SIMD_ARM_SVE_LENGTH`` CMake variable. If compiling for a different
@@ -1730,26 +1729,6 @@ The ARM ThunderX2 Cray XC50 machines differ only in that the recommended
 compiler is the ARM HPC Compiler (``armclang``).


-Intel Xeon Phi
-^^^^^^^^^^^^^^
-
-
-Xeon Phi processors, hosted or self-hosted, are supported.
-The Knights Landing-based Xeon Phi processors behave like standard x86 nodes,
-but support a special SIMD instruction set. When cross-compiling for such nodes,
-use the ``AVX_512_KNL`` SIMD flavor.
-Knights Landing processors support so-called "clustering modes" which
-allow reconfiguring the memory subsystem for lower latency. |Gromacs| can
-benefit from the quadrant or SNC clustering modes.
-Care needs to be taken to correctly pin threads. In particular, threads of
-an MPI rank should not cross cluster and NUMA boundaries.
-In addition to the main DRAM memory, Knights Landing has a high-bandwidth
-stacked memory called MCDRAM. Using it offers performance benefits if
-it is ensured that ``mdrun`` runs entirely from this memory; to do so
-it is recommended that MCDRAM is configured in "Flat mode" and ``mdrun`` is
-bound to the appropriate NUMA node (use e.g. ``numactl --membind 1`` with
-quadrant clustering mode).
-
 NVIDIA Grace
 ^^^^^^^^^^^^
