libphobos: Require ucontext library on x86* if cet is enabled #23

Open
wants to merge 2 commits into
base: ci/mainline

@the-horo the-horo commented Dec 1, 2023

This makes DRUNTIME_LIBRARIES_UCONTEXT better match the D module, which checks whether CET is enabled on x86* when deciding whether there is an assembly implementation of fiber_switchContext.

libphobos/ChangeLog:

	* configure: Regenerate.
	* m4/druntime/libraries.m4 (DRUNTIME_LIBRARIES_UCONTEXT): Treat
	the case where CET is enabled.

ibuclaw and others added 2 commits December 1, 2023 02:22
This makes DRUNTIME_LIBRARIES_UCONTEXT better match the D module,
which checks whether CET is enabled on x86* when deciding whether there
is an assembly implementation of fiber_switchContext.

Signed-off-by: Andrei Horodniceanu <[email protected]>
@jpf91 jpf91 force-pushed the ci/mainline branch 7 times, most recently from b8250e6 to 713d6bc Compare December 8, 2023 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 8 times, most recently from 5ebd6cc to e6068f0 Compare December 16, 2023 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 7 times, most recently from fe8cc76 to 9e0d194 Compare December 23, 2023 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 6 times, most recently from 948da0c to 6b08bed Compare December 29, 2023 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 8 times, most recently from dc9a893 to b96eeb5 Compare October 23, 2024 00:22
@jpf91 jpf91 force-pushed the ci/mainline branch 7 times, most recently from dbe15da to d7ac963 Compare October 30, 2024 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 5 times, most recently from b4b8cc0 to ee2b430 Compare November 4, 2024 01:23
jpf91 pushed a commit that referenced this pull request Nov 5, 2024
We can make use of the integrated rotate step of the XAR instruction
to implement most vector integer rotates, as long as we zero out one
of the input registers for it.  This allows for a lower-latency sequence
than the fallback SHL+USRA, especially when we can hoist the zeroing operation
away from loops and hot parts.  This should be safe to do for 64-bit vectors
as well even though the XAR instructions operate on 128-bit values, as the
bottom 64-bit result is later accessed through the right subregs.

This strategy is used whenever we have XAR instructions.  The logic
in aarch64_emit_opt_vec_rotate is adjusted to resort to
expand_rotate_as_vec_perm only when it's expected to generate a single REV*
instruction or when XAR instructions are not present.
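
As a rough illustration of why zeroing one input turns XAR into a plain
rotate, here is a scalar sketch of a single lane.  It assumes XAR behaves
per lane as "XOR the two inputs, then rotate right by the immediate"; the
helper names (xar_lane32, xar_lane8, g1_lane, g2_lane, check_rotate_model)
are hypothetical and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of XAR:
   XOR the inputs, then rotate the result right by imm.  */
static uint32_t xar_lane32(uint32_t a, uint32_t b, unsigned imm)
{
    uint32_t x = a ^ b;
    imm %= 32;
    return imm ? (x >> imm) | (x << (32 - imm)) : x;
}

/* Same model for one 8-bit lane.  */
static uint8_t xar_lane8(uint8_t a, uint8_t b, unsigned imm)
{
    uint8_t x = a ^ b;
    imm %= 8;
    return imm ? (uint8_t)((x >> imm) | (x << (8 - imm))) : x;
}

/* Per-lane versions of the G1 and G2 rotates from the commit message.  */
static uint32_t g1_lane(uint32_t r) { return (r >> 23) | (r << 9); }
static uint8_t g2_lane(uint8_t r) { return (uint8_t)((r << 3) | (r >> 5)); }

/* With the second input zeroed, the XOR is a no-op and only the
   rotate remains, so the model agrees with the shift/or forms.  */
void check_rotate_model(void)
{
    for (uint32_t r = 0; r < 256; r++) {
        uint32_t wide = r * 0x01020305u + 0x80000001u;
        assert(xar_lane32(wide, 0, 23) == g1_lane(wide));
        assert(xar_lane8((uint8_t)r, 0, 5) == g2_lane((uint8_t)r));
    }
}
```

Calling check_rotate_model() exercises both lane widths with the same
immediates (#23 and #5) that appear in the generated assembly.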

With this patch we can generate for the input:
v4si
G1 (v4si r)
{
    return (r >> 23) | (r << 9);
}

v8qi
G2 (v8qi r)
{
  return (r << 3) | (r >> 5);
}
the assembly for +sve2:
G1:
        movi    v31.4s, 0
        xar     z0.s, z0.s, z31.s, #23
        ret

G2:
        movi    v31.4s, 0
        xar     z0.b, z0.b, z31.b, #5
        ret

instead of the current:
G1:
        shl     v31.4s, v0.4s, 9
        usra    v31.4s, v0.4s, 23
        mov     v0.16b, v31.16b
        ret
G2:
        shl     v31.8b, v0.8b, 3
        usra    v31.8b, v0.8b, 5
        mov     v0.8b, v31.8b
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <[email protected]>

gcc/

	* config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Add
	generation of XAR sequences when possible.

gcc/testsuite/

	* gcc.target/aarch64/rotate_xar_1.c: New test.
@jpf91 jpf91 force-pushed the ci/mainline branch 7 times, most recently from cf5439e to e63b645 Compare November 11, 2024 01:22
@jpf91 jpf91 force-pushed the ci/mainline branch 2 times, most recently from bfb7386 to c5e8a39 Compare November 13, 2024 01:23