Feature like SAXPY but with divide #36

mbelda · 2022-03-15T10:37:32Z

Hello,

I am currently optimizing some code to run on Hwacha and I have this scenario:

for (int i = 0; i < n; i++) {
    out[i] = vec1[i] * const_float / vec2[i] + const_float2;
}

So I know I can do the vector division on Hwacha, but it would be nice to have an operation that performs the multiplication by a scalar, and an operation like SAXPY but with divide. For example, SADPY, so that I could do the following:

mul_scalar_vec_hwacha(out, vec1, const_float)
sadpy(out, vec2, const_float2)

Is there any instruction in the ISA that performs this? I have been looking for one but I can't find any.

Thanks in advance!

a0u · 2022-03-15T22:31:24Z

Single-precision scalar-vector multiply:

vfmul.s.vs vv1, vv0, vs0

Most Hwacha vector compute instructions support using shared (scalar) registers for the rs1/rs2/rs3 operands.
The .vs suffix shown above is optional; the assembler can select the correct variant based on the register names.

Page 15 of the Hwacha ISA manual explains the instruction encoding:

When the d flag at bit 63 is set, register rd is interpreted as a vector register (vd). When it is cleared, register rd is interpreted as a shared register (vs). Similarly, the s1 flag (bit 62), the s2 flag (bit 61), and the s3 flag (bit 60) indicates whether rs1, rs2, and rs3 refers to a vector register or a shared register respectively.

There are currently no instructions for fused floating-point divide and add.
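
Since there is no fused divide-and-add, the loop from the question would presumably be split into separate passes: a scalar-vector multiply, an element-wise divide, and an add of the second constant. The plain C sketch below only illustrates that decomposition; it is not Hwacha code. The function names `scale_div_add_ref` and `scale_div_add_split` are hypothetical, and the mapping of the divide and add passes to specific Hwacha instructions with a shared-register operand is an assumption based on the statement above that most compute instructions accept one.

```c
#include <assert.h>
#include <stddef.h>

/* Reference: the original single loop from the question. */
static void scale_div_add_ref(float *out, const float *vec1, const float *vec2,
                              float c1, float c2, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = vec1[i] * c1 / vec2[i] + c2;
    }
}

/* Same computation split into three element-wise passes, one per
 * vector operation. Hypothetical helper, for illustration only. */
static void scale_div_add_split(float *out, const float *vec1, const float *vec2,
                                float c1, float c2, size_t n)
{
    assert(out != NULL && vec1 != NULL && vec2 != NULL);

    /* Pass 1: scalar-vector multiply (the role of vfmul.s with a shared operand). */
    for (size_t i = 0; i < n; i++) {
        out[i] = vec1[i] * c1;
    }
    /* Pass 2: element-wise vector divide. */
    for (size_t i = 0; i < n; i++) {
        out[i] = out[i] / vec2[i];
    }
    /* Pass 3: add the second scalar to every element. */
    for (size_t i = 0; i < n; i++) {
        out[i] = out[i] + c2;
    }
}
```

The split version performs the same three operations per element in the same order as the reference loop, so on a target where each float operation is rounded to single precision the two should agree. In a real vector kernel the intermediate results would normally stay in vector registers rather than being written back to out between passes.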
