-
Notifications
You must be signed in to change notification settings - Fork 274
[RVP] add p extension spec 0.94 support #258
base: riscv-gcc-experiment
Are you sure you want to change the base?
[RVP] add p extension spec 0.94 support #258
Conversation
|
Hi @kito-cheng , thanks for the suggestions! I will do that later on. Also, I wonder can we have a new experimental branch for p ext, since it might be still huge even if we split it into small pieces. |
2bc5611
to
29d62e2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For [Hook] Add TARGET_VECTOR_MODE_SUPPORTED_P implementation
Since some instructions like smul16
in rv32p need vector register with 64-bit, I think you should also consider MODE_VECTOR_INT
.
83e60d0
to
608b5be
Compare
@fakepaper56 Thanks, fixed in new commit. |
Could you rebase and fix the conflict? |
608b5be
to
8ded055
Compare
No conflict now. |
Oops, I still saw |
Oh, but it could be merged by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes are so huge, I guess I need take long time to review that, and I'll submit my review when I review part of patch.
gcc/config/riscv/rvp.md
Outdated
(define_code_attr ph [(plus "") (minus "")]) | ||
|
||
;; bit count, clz and clrs | ||
(define_code_iterator unop [clrsb clz]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like there is no bitc ount here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in the new commit eb5b39a
gcc/config/riscv/rvp_intrinsic.h
Outdated
typedef unsigned int uint32x2_t __attribute__((vector_size(8))); | ||
typedef unsigned int uint32x4_t __attribute__((vector_size(16))); | ||
|
||
#define __riscv__kabsw(a) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only saw __rv__kabsw
in the spec? it seems redundant ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it only helps intrinsic header more readable, e.g.
#if defined(__riscv_zpsf) && __riscv_xlen == 64
#define __rv__v_smar64 __riscv__v_smar64
#define __rv__v_smsr64 __riscv__v_smsr64
...
instead of several lines for one intrinsic. I can remove them if it is not appropriate.
gcc/config/riscv/predicates.md
Outdated
@@ -212,3 +212,21 @@ | |||
{ | |||
return riscv_gpr_save_operation_p (op); | |||
}) | |||
|
|||
(define_predicate "register_even_operand" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it necessary even riscv_hard_regno_mode_ok has guarded that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not necessary, and it was removed by commit ecd8198
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you squash your patch together? I mean merge the fixup patch into the original one, since I am review the commit one-by-one, and it would be easier review when you submit patch to upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Kito, I have sorted them out.
"TARGET_ZPN && !TARGET_64BIT" | ||
"ave\t%0, %1, %2" | ||
[(set_attr "type" "dsp") | ||
(set_attr "mode" "SI")]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest merge riscv_ave
and ave
, riscv_ave
seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in the new commit eb5b39a
DONE; | ||
}) | ||
|
||
(define_insn "avedi" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here.
gcc/config/riscv/riscv.md
Outdated
@@ -56,6 +56,10 @@ | |||
UNSPEC_UKSUBW | |||
UNSPEC_UKADDH | |||
UNSPEC_UKSUBH | |||
UNSPEC_BITREV | |||
UNSPEC_CLROV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to define UNSPEC_CLROV
in unspecv
rather than unspec
if you are use that in unspec_volatile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in the new commit eb5b39a
gcc/config/riscv/rvp.md
Outdated
|
||
(define_insn "riscv_<rvp_optab>si2" | ||
[(set (match_operand:SI 0 "register_operand" "=r") | ||
(unop:SI (match_operand:SI 1 "register_operand" "r")))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest using standard pattern name
[1] for those pattern, I mean without the riscv_ prefix, using standard pattern name
can teach GCC to generate that instruction when user using __builtin_clz
and __builtin_clrsb
[2]
[1] https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html#index-clrsbm2-instruction-pattern
[2] https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fclz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your info! Fixed in commit 715c9ca. Also, there are many other patterns that can be renamed (remove riscv_ prefix), such as smin, smax, etc. Maybe I can just remove the riscv_
prefix in all the names?
gcc/config/riscv/rvp.md
Outdated
[(set (match_operand:VQIHI 0 "register_operand" "=r") | ||
(unspec:VQIHI [(eq:VQIHI (match_operand:VQIHI 1 "register_operand" " r") | ||
(match_operand:VQIHI 2 "register_operand" " r"))] | ||
UNSPEC_CMPEQ))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you reference vcondu<mode><v_cmp_mixed>
@ aarch64/aarch64-simd.md and try to remove the unspec here, I think it should be able to represent without unspec?
f149f17
to
82035e7
Compare
[(set_attr "type" "dsp64") | ||
(set_attr "mode" "DI")]) | ||
|
||
;; uk|k add|sub w|h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you simplify those pattern by define_int_iterator
and define_int_attr
?
gcc/config/riscv/rvp.md
Outdated
|
||
;; KDMBB, KDMBT, KDMTT (32 (*) (xlen, xlen) for vector mode & 32 (*) (32, 32) for non-vector mode) | ||
(define_insn "kdmbb<mode>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kdm*<mode>
seems can be merge with kdm*
and v64_kdm*
pattern?
(define_insn "ksll" | ||
[(set (match_operand:SI 0 "register_operand" "= r, r") | ||
(ss_ashift:SI (match_operand:SI 1 "register_operand" " r, r") | ||
(match_operand:SI 2 "rimm5u_operand" " Iu05, r")))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u05
is enough here.
(set_attr "mode" "SI")]) | ||
|
||
;; MULR64, MULSR64 | ||
(define_insn "rvp_umulsidi3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
umulsidi3
is a standard pattern name, so I would suggest remove rvp_
.
[(set_attr "type" "dsp") | ||
(set_attr "mode" "DI")]) | ||
|
||
(define_insn "rvp_mulsidi3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mulsidi3
is a standard pattern name, so I would suggest remove rvp_
here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions.
The mulsidi3
and umulsidi3
are defined in riscv.md in 32 bit, so I put
if (TARGET_ZPN)
emit_insn (gen_rvp_<u>mulsidi3 (operands[0], operands[1], operands[2]));
else
...
under define_expand "<u>mulsidi3"
and define <u>mulsidi3
in rvp.md
for 64 bit. Does it sound a good solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that seems right solution :)
1c4e324
to
d3b8d1c
Compare
9e4e687
to
57a3b83
Compare
Hi Kito, the feedback problems above are addressed, except removing the unspec of vector compare pattern, which I will investigate later. |
57a3b83
to
a24b56a
Compare
gcc/config/riscv/rvp.md
Outdated
(unspec:GPR [(match_operand:GPR 1 "register_operand" "0") | ||
(match_operand:GPR 2 "register_operand" "r") | ||
(match_operand:GPR 3 "register_operand" "r")] UNSPEC_PBSADA))] | ||
"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"TARGET_ZPN"
?
@@ -525,6 +525,7 @@ pru-*-*) | |||
;; | |||
riscv*) | |||
cpu_type=riscv | |||
extra_headers="rvp_intrinsic.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest rvpintrin.h
similar to rvkintrin.h
gcc/config/riscv/constraints.md
Outdated
(match_test "ival < (1 << 6) && ival >= 0"))) | ||
|
||
(define_constraint "C00" | ||
"Constant value 1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0?
@@ -68,6 +68,8 @@ riscv_implied_info_t riscv_implied_info[] = | |||
{"zks", "zksh"}, | |||
{"zks", "zkg"}, | |||
{"zks", "zkb"}, | |||
{"p", "zpn"}, | |||
{"p", "zpsf"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zprv?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mapping here means the p
flag in --with-arch=rv{xlen}gcp
or -march=rv{xlen}gcp
can be expanded to rv{xlen}gc_zpn_zpsf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Still I think that zprv should appear on xlen=64.
When I run riscv64-unknown-elf-gcc -march=rv64imcbp -mabi=lp64 .....
on compiler with this PR, I get
rv64up.c: warning: implicit declaration of function '__rv__kadd32' [-Wimplicit-function-declaration]
| if (0x0000000000000000 != __rv__kadd32(0x0000000000000000, 0x0000000000000000)) return 536;
| ^~~~~~~~~~~~
This is so because rv64imcbp
expands to rv64imcb_zpn_zpsf
whilest it should expand to rv64imcb_zpn_zpsf_zprv
because zprv
is required on xlen=64
by spec (I simplify a bit: I didn't expand b to its zeds here because it is not the issue).
If I do not treat warnings as errors, it obviously fails on linking with:
objs/rv64imcb@pcsh-bm1_rv64up/rv64up.o: in function `.L0 ': undefined reference to `__rv__kadd32'
I faced a more general problem with this PR: when I compile my code with riscv64-unknown-elf-gcc -march=rv64imcb -mabi=lp64 .....
(i.e. without p-ext) it fails on linking with this error:
objs/rv64imcb@pcsh-bm1_rv64up/rv64up.o: in function `.L0 ': undefined reference to `__builtin_riscv_kadd16'
I have my own p-ext intrinsic file (which is generally
#ifdef __riscv_zpn
static inline __rv__insn(uintXLEN_t rs1, uintXLEN_t rs2) { uintXLEN_t rd; __asm__("insn %0, %1, %2" : "=r"(rd) : "r"(rs1), "r"(rs2)); return rd; }
#else
static inline __rv__insn(uintXLEN_t rs1, uintXLEN_t rs2) { SOME_GENERIC_CODE; return rd; }
#endif
with different variations of types, number of arguments, clobbers and other stuff). I did so because I am not a pro at machine description language and also it seems to be a common way to implement intrinsics (e.g. https://github.com/rvkrypto/rvkrypto-fips/blob/main/rvkintrin.h).
So my intrinsics produce an assembler code with the specified insn if p-ext is enabled and make a call to (or inline -- depends on optimization levels) some generic-code functions otherwise.
When I found this PR, I expected it works the same way and in addition allows to optimize some generic code with emitting of p-ext insns.
As you mentioned in another topic, "generic code" should use very specific types defined in the same file where intrinsics are -- so we barely expect that such insns will appear on some code with common scalar types as b-ext insns do. It is ok for me: I understand that b-ext machine descriptions implement some simple stuff (e.g. bit set) and this PR operates with packed types so this is either very hard or impossible to map a set of scalar type insns to p-ext insn in most cases. Still I don't understand why intrinsics do not work (I mean do not provide generic code) without enabled p-ext and what should be done if I want to compile some file with the compiler which includes this intrinsic file without p-ext insns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @marcfedorow,
This is so because rv64imcbp expands to rv64imcb_zpn_zpsf whilest it should expand to rv64imcb_zpn_zpsf_zprv because zprv is required on xlen=64 by spec (I simplify a bit: I didn't expand b to its zeds here because it is not the issue).
Thank you for pointing it out. I added a fix to support a different expansion on p flag (rv32p will be expanded into rv32_zpn_zpsf, and rv64p will be rv64_zpn_zprv_zpsf).
with different variations of types, number of arguments, clobbers and other stuff). I did so because I am not a pro at machine description language and also it seems to be a common way to implement intrinsics (e.g. https://github.com/rvkrypto/rvkrypto-fips/blob/main/rvkintrin.h).
So my intrinsics produce an assembler code with the specified insn if p-ext is enabled and make a call to (or inline -- depends on optimization levels) some generic-code functions otherwise.
I think we do not have such a plan so far, since it requires a great amount of workload. I think it is a good idea though, so I will forward your message to other PLCT members in the further meetup.
As you mentioned in another topic, "generic code" should use very specific types defined in the same file where intrinsics are -- so we barely expect that such insns will appear on some code with common scalar types as b-ext insns do. It is ok for me: I understand that b-ext machine descriptions implement some simple stuff (e.g. bit set) and this PR operates with packed types so this is either very hard or impossible to map a set of scalar type insns to p-ext insn in most cases. Still I don't understand why intrinsics do not work (I mean do not provide generic code) without enabled p-ext and what should be done if I want to compile some file with the compiler which includes this intrinsic file without p-ext insns.
Vector type int16x4_t
doesn't need to be from p-ext header, since it is native support in GCC (https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html). To use intrinsic might be the only solution if you try to use scalar type instead of vector type (e.g. put int16x4_t data onto int64_t field), but p-ext insn is possible to be generated from scalar code through auto-vectorization. e.g.
#include <rvp_intrinsic.h>
#include <stdint.h>
typedef short v4hi __attribute__((vector_size (8)));
v4hi v_sadd16_spn (v4hi ra, v4hi rb)
{
return ra + rb;
}
int16_t *v_sadd16_arr (int16_t *ra, int16_t *rb, int len)
{
for (int i = 0; i < len; i++)
ra[i] += rb[i];
return ra;
}
riscv64-unknown-elf-gcc -S -O3 add16.c
.file "add16.c"
.option nopic
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c0p0_p2p0_zpn2p0_zprv2p0_zpsf2p0"
.attribute unaligned_access, 0
.attribute stack_align, 16
.text
.align 1
.globl v_sadd16_spn
.type v_sadd16_spn, @function
v_sadd16_spn:
add16 a0, a0, a1
ret
.size v_sadd16_spn, .-v_sadd16_spn
.align 1
.globl v_sadd16_arr
.type v_sadd16_arr, @function
v_sadd16_arr:
ble a2,zero,.L4
addi a5,a1,2
addiw a3,a2,-1
sub a5,a0,a5
sext.w a4,a3
sltiu a5,a5,5
li a6,5
xori a5,a5,1
sgtu a4,a4,a6
and a5,a4,a5
sext.w t1,a2
beq a5,zero,.L5
or a5,a1,a0
andi a5,a5,7
bne a5,zero,.L5
srliw a7,t1,2
slli a7,a7,3
mv a5,a0
mv a3,a1
add a7,a7,a0
.L6:
ld a4,0(a5)
ld a6,0(a3)
addi a5,a5,8
addi a3,a3,8
add16 a4, a4, a6
sd a4,-8(a5)
bne a5,a7,.L6
andi a5,t1,-4
mv a4,a5
beq t1,a5,.L4
slli a5,a5,32
srli a5,a5,31
add a7,a0,a5
add a3,a1,a5
lhu a6,0(a3)
lhu t1,0(a7)
addiw a3,a4,1
addw a6,a6,t1
sh a6,0(a7)
bge a3,a2,.L4
addi a3,a5,2
add a6,a0,a3
add a3,a1,a3
lhu a3,0(a3)
lhu a7,0(a6)
addiw a4,a4,2
addw a3,a3,a7
sh a3,0(a6)
bge a4,a2,.L4
addi a5,a5,4
add a4,a0,a5
add a1,a1,a5
lhu a5,0(a1)
lhu a3,0(a4)
addw a5,a5,a3
sh a5,0(a4)
ret
.L5:
slli a5,a3,32
srli a3,a5,31
addi a2,a0,2
mv a5,a0
add a2,a2,a3
.L8:
lhu a4,0(a5)
lhu a3,0(a1)
addi a5,a5,2
addi a1,a1,2
addw a4,a4,a3
sh a4,-2(a5)
bne a5,a2,.L8
.L4:
ret
.size v_sadd16_arr, .-v_sadd16_arr
.ident "GCC: (GNU) 10.2.0"
I hope I have answered all your questions. Thanks again.
gcc/config/riscv/rvp_intrinsic.h
Outdated
(__builtin_riscv_ukmsr64 ((r), (a), (b))) | ||
#define __rv__maddr32(a, b) \ | ||
(__builtin_riscv_maddr32 ((a), (b))) | ||
#define __rv__msubr32(a, b) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r, a, b
gcc/config/riscv/rvp_intrinsic.h
Outdated
(__builtin_riscv_ukmar64 ((r), (a), (b))) | ||
#define __rv__ukmsr64(r, a, b) \ | ||
(__builtin_riscv_ukmsr64 ((r), (a), (b))) | ||
#define __rv__maddr32(a, b) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r, a, b
I am unable to emit p-ext insns from generic code. #include <rvp_intrinsic.h>
#include <stdint.h>
#include <stdlib.h>
#define ADD_SUB(u, n, BODY) \
uint64_t rd = 0; \
for (int i = 0; i < __riscv_xlen; i += n) { \
u##int##n##_t rs1s = rs1 >> i, rs2s = rs2 >> i; \
rd |= (uint64_t)((BODY) & UINT##n##_MAX) << i; \
} \
return rd
static inline uint64_t add16(uint64_t rs1, uint64_t rs2)
{
ADD_SUB( , 16, rs1s + rs2s );
}
volatile long unsigned int foo(){
return rand();
}
int main()
{
uint64_t a = foo(), b = foo();
return __rv__add16(a, b) != add16(b, a);
} compiles to <main>:
1141 addi sp,sp,-16
e406 sd ra,8(sp)
e022 sd s0,0(sp)
66e000ef jal ra,800015c8 <rand>
842a mv s0,a0
668000ef jal ra,800015c8 <rand>
02055693 srli a3,a0,0x20
02045613 srli a2,s0,0x20
872a mv a4,a0
4105559b sraiw a1,a0,0x10
4104579b sraiw a5,s0,0x10
0106939b slliw t2,a3,0x10
0106189b slliw a7,a2,0x10
6541 lui a0,0x10
fff50293 addi t0,a0,-1 # ffff
00b7833b addw t1,a5,a1
4103d81b sraiw a6,t2,0x10
4108de1b sraiw t3,a7,0x10
00537eb3 and t4,t1,t0
01c80f3b addw t5,a6,t3
43075f93 srai t6,a4,0x30
43045593 srai a1,s0,0x30
0104161b slliw a2,s0,0x10
0107139b slliw t2,a4,0x10
005f76b3 and a3,t5,t0
01f5853b addw a0,a1,t6
000e879b sext.w a5,t4
4106531b sraiw t1,a2,0x10
4103d81b sraiw a6,t2,0x10
00557eb3 and t4,a0,t0
01079893 slli a7,a5,0x10
02069e13 slli t3,a3,0x20
01030f3b addw t5,t1,a6
01c8efb3 or t6,a7,t3
030e9593 slli a1,t4,0x30
005f72b3 and t0,t5,t0
00bfe7b3 or a5,t6,a1
0002869b sext.w a3,t0
00d7e533 or a0,a5,a3
40870477 add16 s0,a4,s0
60a2 ld ra,8(sp)
40850733 sub a4,a0,s0
6402 ld s0,0(sp)
00e03533 snez a0,a4
0141 addi sp,sp,16
8082 ret and #include <rvp_intrinsic.h>
#include <stdint.h>
#include <stdlib.h>
typedef union {
uint64_t packed;
uint16_t arr[4];
} pack;
static inline uint64_t add16(uint64_t rs1, uint64_t rs2)
{
pack a, b, c;
a.packed = rs1, b.packed = rs2;
for (short i = 0; i < 4; ++i){
c.arr[i] = a.arr[i] + b.arr[i];
}
return c.packed;
}
volatile long unsigned int foo(){
return rand();
}
int main()
{
uint64_t a = foo(), b = foo();
return __rv__add16(a, b) != add16(b, a);
} compiles to <main>:
1141 addi sp,sp,-16
e406 sd ra,8(sp)
e022 sd s0,0(sp)
63a000ef jal ra,80001594 <rand>
842a mv s0,a0
634000ef jal ra,80001594 <rand>
01045693 srli a3,s0,0x10
01055793 srli a5,a0,0x10
00f682bb addw t0,a3,a5
02045713 srli a4,s0,0x20
02055613 srli a2,a0,0x20
00a4033b addw t1,s0,a0
00c703bb addw t2,a4,a2
03055813 srli a6,a0,0x30
03045593 srli a1,s0,0x30
0802c8bb zext.h a7,t0
01089e13 slli t3,a7,0x10
00b80ebb addw t4,a6,a1
08034f3b zext.h t5,t1
0803cfbb zext.h t6,t2
01cf67b3 or a5,t5,t3
020f9293 slli t0,t6,0x20
030e9693 slli a3,t4,0x30
0057e733 or a4,a5,t0
00d76633 or a2,a4,a3
40850577 add16 a0,a0,s0
40c50433 sub s0,a0,a2
60a2 ld ra,8(sp)
00803533 snez a0,s0
6402 ld s0,0(sp)
0141 addi sp,sp,16
8082 ret I also tried without long unsigned int foo(){
static int bar = 12345;
return ^= 54321;
} Results are pretty same -- generic code does not emit as an instruction. Also I tried this: int main()
{
uint64_t a = 12345, b = 54321;
return __rv__add16(a, b) != add16(b, a);
} It compiled to this: c0000797 auipc a5,0xc0000
8e878793 addi a5,a5,-1816 # 40000838 <__TMC_END__+0x10>
c0000297 auipc t0,0xc0000
8e828293 addi t0,t0,-1816 # 40000840 <__TMC_END__+0x18>
6388 ld a0,0(a5)
0002b303 ld t1,0(t0)
406503f7 add16 t2,a0,t1
b9638593 addi a1,t2,-1130
00b03533 snez a0,a1
8082 ret And I tried that: int main()
{
uint64_t a = 0x12345, b = 0x54321;
return __rv__add16(a, b) != add16(b, a);
} And it was compiled out ( |
…KMADA32, KMAXDA32, KMDA32, KMXDA32
5fc9529
to
da54bac
Compare
gcc/config/riscv/rvp.md
Outdated
(set_attr "mode" "DI")]) | ||
|
||
;; uk|k add|sub w|h | ||
(define_int_attr karth_insn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
U?K(sub|add)[HW] could be described similar to U?K(sub|add) + sign extension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
U?K(sub|add)[HW] could be described similar to U?K(sub|add) + sign extension
Hi @marcfedorow , this is added in the latest update. Thanks for your feedback.
0. update impl according to the change in 0.9.11 1. fix add/sub crash in rv32g 2. use standard name for saturation insn, e.g. kaddw and kaddh
932a619
to
7eeadf6
Compare
Assembler messages: |
I think you need also use Binutils that supported P-ext, We had update p-ext toolchain support with p-spec v0.9.11, here is the latest repo: https://github.com/plctlab/riscv-gcc/tree/riscv-gcc-p-ext You can find some build record refer in this issue: |
Thank you for helping, now it's working fine |
Hi I am getting this error while doing compilation process |
Can you support the source file for me to reproduce this problem,thanks. |
Hi when I am running the p extension testcase I am getting this error how to solve it? /home/manasa/PSIMD/riscv-gnu-toolchain/lib/gcc/riscv32-unknown-elf/10.2.0/include/rvp_intrinsic.h:311:1: error: incompatible types when returning type 'int' but 'uint8xN_t' {aka 'uint8x4_t'} was expected |
0.930.94 with GCC 10.2Add commit 0b7b471 and 4b81528 for fixing configure option parsing error