Commit d6e0d49

Merge pull request #172 from sterrettm2/mmx_fix
Fix for MMX instructions being generated without emms
2 parents 9ab7d47 + 5c63eec commit d6e0d49

2 files changed: +10 -10 lines changed

README.md (-10)

@@ -80,16 +80,6 @@ benchmark](https://github.com/google/benchmark) frameworks respectively. You
 can configure meson to build them both by using `-Dbuild_tests=true` and
 `-Dbuild_benchmarks=true`.
 
-### Note about building with avx512 by g++ v9 and v10
-
-There is a risk when compile with avx512 by g++ v9 and v10,
-as some `MMX Technology` instructions is used by g++ v9/v10
-without clearing fpu state.
-Check [issue 154](https://github.com/intel/x86-simd-sort/issues/154)
-for more details.
-
-Adding `g++` option `-mno-mmx`, which disables `MMX Technology` instructions, is a possible workaround.
-
 ## Example usage
 
 #### Sort an array of floats

src/xss-common-argsort.h (+10)

@@ -575,6 +575,11 @@ X86_SIMD_SORT_INLINE void xss_argsort(T *arr,
 
         if (descending) { std::reverse(arg, arg + arrsize); }
     }
+
+#ifdef __MMX__
+    // Workaround for compiler bug generating MMX instructions without emms
+    _mm_empty();
+#endif
 }
 
 template <typename T>
@@ -632,6 +637,11 @@ X86_SIMD_SORT_INLINE void xss_argselect(T *arr,
         argselect_<vectype, argtype>(
                 arr, arg, k, 0, arrsize - 1, 2 * (arrsize_t)log2(arrsize));
     }
+
+#ifdef __MMX__
+    // Workaround for compiler bug generating MMX instructions without emms
+    _mm_empty();
+#endif
 }
 
 template <typename T>
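
For readers unfamiliar with the pattern, the following is a minimal, self-contained sketch of what both hunks do: after the sorting work finishes, and only when the compiler defines `__MMX__`, call `_mm_empty()` (the intrinsic for the `emms` instruction) so that later x87 floating-point code sees a clean register state. The function `argsort_sketch` is a hypothetical stand-in built on `std::sort`; it is not the library's `xss_argsort` and uses no SIMD code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>
#ifdef __MMX__
#include <mmintrin.h> // declares _mm_empty(), which emits emms
#endif

// Hypothetical illustration only: a scalar argsort that ends with the same
// guarded _mm_empty() call the commit adds to xss_argsort and xss_argselect.
template <typename T>
std::vector<std::size_t> argsort_sketch(const std::vector<T> &arr)
{
    // Start with the identity permutation 0, 1, ..., n-1.
    std::vector<std::size_t> arg(arr.size());
    std::iota(arg.begin(), arg.end(), std::size_t {0});

    // Order the indices by the values they refer to.
    std::sort(arg.begin(), arg.end(), [&arr](std::size_t a, std::size_t b) {
        return arr[a] < arr[b];
    });

#ifdef __MMX__
    // MMX registers alias the x87 register stack. If the compiler emitted
    // MMX instructions above without a trailing emms, this resets the state
    // before the caller runs any x87 floating-point code.
    _mm_empty();
#endif
    return arg;
}
```

Because the call is wrapped in `#ifdef __MMX__`, the sketch should still build when MMX is disabled (for example with `-mno-mmx`, the workaround the removed README note described); in that case the guarded block simply compiles to nothing.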
