[JIT] Enable ccmp in X86 emitter backend. #110881

Draft
wants to merge 77 commits into
base: main
77 commits
1820567
Ruihan: POC with REX2
Ruihan-Yin Mar 25, 2024
d1afc68
resolve comments
Ruihan-Yin May 17, 2024
2335aa3
refactor register encoding for REX2
Ruihan-Yin May 20, 2024
6578c58
merge REX2 path to legacy path
Ruihan-Yin May 21, 2024
01eeb80
Enable REX2 in more instructions.
Ruihan-Yin May 30, 2024
690aee3
Avoid repeatedly estimate the size of REX2 prefix
Ruihan-Yin Jun 3, 2024
31d7fb4
Enable REX2 encoding on RI and SV path
Ruihan-Yin Jun 5, 2024
a995878
Add rex2 support to rotate and shift.
Ruihan-Yin Jun 6, 2024
74aacf6
CR session.
Ruihan-Yin Jun 7, 2024
c330927
Testing infra updates: assert REX2 is enabled.
Ruihan-Yin Jun 11, 2024
fbf20d1
revert rcl_N and rcr_N, tp and latency data for these instructions is…
Ruihan-Yin Jun 11, 2024
ea02e70
partially enable REX2 on emitOutputAM, case covered: R_AR and AR_R.
Ruihan-Yin Jun 12, 2024
c74b801
Adding unit tests.
Ruihan-Yin Jun 13, 2024
34980b4
push, pop, inc, dec, neg, not, xadd, shld, shrd, cmpxchg, setcc, bswap.
Ruihan-Yin Jun 26, 2024
2ffdbeb
bug fix for bswap
Ruihan-Yin Jun 27, 2024
3a729bb
bt
Ruihan-Yin Jun 28, 2024
d943b03
xchg, idiv
Ruihan-Yin Jul 1, 2024
c8fee9c
Make sure add REX2 prefix if register encoding for EGPRs are being ca…
Ruihan-Yin Jul 2, 2024
6ec0e97
Ensure code size is correctly computed in R_R_I path.
Ruihan-Yin Jul 8, 2024
1d01003
clean up
Ruihan-Yin Jul 9, 2024
1acc219
Change all AddSimdPrefix to AddX86Prefix
Ruihan-Yin Jul 15, 2024
87ad443
div, mulEAX
Ruihan-Yin Jul 16, 2024
bb9905a
filter out test from REX2 encoding when using ACC form.
Ruihan-Yin Jul 19, 2024
86083b2
Make sure REX prefix will not be added when emitting with REX2.
Ruihan-Yin Jul 24, 2024
dfe8760
resolve comments.
Ruihan-Yin Aug 5, 2024
64761cd
make sure the APX debug knob is only available under debug build.
Ruihan-Yin Oct 24, 2024
f1aba62
clean up some out-dated code.
Ruihan-Yin Nov 12, 2024
f5cc5a8
enable movsxd
Ruihan-Yin Nov 12, 2024
7ca8433
Enable "Call"
Ruihan-Yin Nov 13, 2024
bc4d225
Enable "JMP"
Ruihan-Yin Nov 15, 2024
deb3814
resolve merge errors
Ruihan-Yin Nov 18, 2024
0d63230
formatting
Ruihan-Yin Nov 18, 2024
13b8076
remote coredistools.dll for internal tests only
Ruihan-Yin Nov 18, 2024
42c6cfc
bug fix
Ruihan-Yin Nov 19, 2024
b1a9617
SUB reg, reg, reg
Ruihan-Yin Aug 8, 2024
ec5d5ca
enable NDD on genCodeForBinary
Ruihan-Yin Aug 28, 2024
ebeaf04
consolidate TakesLegacyPromotedEvexPrefix logics.
Ruihan-Yin Aug 30, 2024
547f01d
ensure register encoding is correct under legacy-promoted-evex encoding.
Ruihan-Yin Aug 30, 2024
3566464
Make sure the overflow check is correctly emitted.
Ruihan-Yin Sep 4, 2024
f8e9c4d
simplify the compiler setup logics.
Ruihan-Yin Sep 4, 2024
6bfd050
emitInsNddBinary
Ruihan-Yin Sep 6, 2024
4b0085d
make sure REX will not be added when EVEX presents.
Ruihan-Yin Sep 7, 2024
5701b1c
resolve comment and clean up.
Ruihan-Yin Sep 11, 2024
6d30388
enable more NDD instructions.
Ruihan-Yin Sep 13, 2024
5d3768c
bug fixes
Ruihan-Yin Sep 13, 2024
a5619e4
enable imul
Ruihan-Yin Sep 13, 2024
c71ace6
add emitter unit tests, and fix encoding error for CMOVcc
Ruihan-Yin Sep 16, 2024
ca92da9
bug fixes:
Ruihan-Yin Sep 18, 2024
5d10aef
refactor emitInsBinary
Ruihan-Yin Sep 19, 2024
5f288a6
clean up
Ruihan-Yin Sep 19, 2024
f4e96b0
clean up and refactor some code
Ruihan-Yin Sep 20, 2024
637c413
make sure the code size estimation is correct for some apx promoted i…
Ruihan-Yin Sep 25, 2024
a203a4d
add tuning knob to EVEX.ND feature.
Ruihan-Yin Sep 30, 2024
a99705a
flip the Evex.nd knob.
Ruihan-Yin Oct 1, 2024
b5fa5bf
put NDD control knob to the correct place.
Ruihan-Yin Oct 3, 2024
b69d01e
resolve merge errors
Ruihan-Yin Nov 20, 2024
52539c3
Make sure APX related knobs are defined properly across platforms
Ruihan-Yin Nov 20, 2024
25d66bf
Add Evex.nf to instrDesc
Ruihan-Yin Oct 2, 2024
a19da9e
{nf} add reg, reg
Ruihan-Yin Oct 8, 2024
2e8d714
Enable EVEX.NF in more instructions
Ruihan-Yin Oct 9, 2024
df59342
more instructions
Ruihan-Yin Oct 10, 2024
226fabb
comments.
Ruihan-Yin Oct 10, 2024
36c6631
lzcnt, tzcnt, popcnt
Ruihan-Yin Oct 10, 2024
5f8a01d
Exclude ACC form from EVEX promotion.
Ruihan-Yin Oct 15, 2024
0453630
BMI instructions.
Ruihan-Yin Oct 15, 2024
07868bc
bug fixes
Ruihan-Yin Oct 16, 2024
69f7e8b
Tweak the code size calculation to make sure REX2 and APX-EVEX are pr…
Ruihan-Yin Oct 18, 2024
1c1a894
bug fixes for stress mode
Ruihan-Yin Oct 29, 2024
1be4b12
Add idEvexNoPromotion to emitter to exclude the APX-EVEX promotion fr…
Ruihan-Yin Nov 4, 2024
bfb06c7
resolve merge error
Ruihan-Yin Nov 20, 2024
9541a99
fix merge error
Ruihan-Yin Nov 21, 2024
543d949
Revert "Add idEvexNoPromotion to emitter to exclude the APX-EVEX prom…
Ruihan-Yin Nov 21, 2024
a879019
bug fix
Ruihan-Yin Nov 22, 2024
55cbda6
introduce _no_evex suffix for some instructions for cases when LOCK w…
Ruihan-Yin Nov 22, 2024
a9a3d5c
Merge remote-tracking branch 'origin/main' into apx-evex-nf-nov
Ruihan-Yin Dec 17, 2024
69c7a29
fix merge errors.
Ruihan-Yin Dec 17, 2024
a625f59
Adds `ccmp` logic into emitter backend.
anthonycanino Dec 19, 2024
1 change: 1 addition & 0 deletions src/coreclr/jit/codegen.h
@@ -649,6 +649,7 @@ class CodeGen final : public CodeGenInterface
#if defined(TARGET_AMD64)
void genAmd64EmitterUnitTestsSse2();
void genAmd64EmitterUnitTestsApx();
void genAmd64EmitterUnitTestsCCMP();
#endif

#endif // defined(DEBUG)
4 changes: 4 additions & 0 deletions src/coreclr/jit/codegenlinear.cpp
@@ -2702,6 +2702,10 @@ void CodeGen::genEmitterUnitTests()
{
genAmd64EmitterUnitTestsApx();
}
if (unitTestSectionAll || (strstr(unitTestSection, "ccmp") != nullptr))
{
genAmd64EmitterUnitTestsCCMP();
}

#elif defined(TARGET_ARM64)
if (unitTestSectionAll || (strstr(unitTestSection, "general") != nullptr))
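Note on the section check above: each unit-test group is gated by a substring search over the requested section string. A minimal sketch of that selection logic follows. `ShouldRunSection` is a hypothetical helper, and treating the literal "all" as the run-everything value is an assumption, since the diff does not show how `unitTestSectionAll` is computed.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper mirroring the gate in genEmitterUnitTests():
// a section runs when everything was requested (assumed here to be the
// literal string "all") or when its name occurs anywhere in the request.
bool ShouldRunSection(const char* requested, const char* sectionName)
{
    assert(requested != nullptr && sectionName != nullptr);
    const bool runAll = (std::strcmp(requested, "all") == 0);
    return runAll || (std::strstr(requested, sectionName) != nullptr);
}
```

Because the match is a substring search, a request such as "apx,ccmp" would select both groups, and any longer name that happens to contain "ccmp" would select the CCMP group as well.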
293 changes: 275 additions & 18 deletions src/coreclr/jit/codegenxarch.cpp

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions src/coreclr/jit/compiler.cpp
@@ -2298,7 +2298,11 @@ void Compiler::compSetProcessor()
}
if (canUseApxEncoding())
{
// TODO-Xarch-apx:
// At this stage, since no machine will pass the CPUID check for APX, we need a special stress mode that
// enables REX2 on incompatible platform, `DoJitStressRex2Encoding` is expected to be removed eventually.
codeGen->GetEmitter()->SetUseRex2Encoding(true);
codeGen->GetEmitter()->SetUsePromotedEVEXEncoding(true);
}
}
#endif // TARGET_XARCH
17 changes: 17 additions & 0 deletions src/coreclr/jit/compiler.h
@@ -9999,6 +9999,23 @@ class Compiler
#ifdef DEBUG
return JitConfig.JitStressEvexEncoding() || JitConfig.JitStressRex2Encoding();
#endif // DEBUG
return false;
}

//------------------------------------------------------------------------
// DoJitStressPromotedEvexEncoding- Answer the question: Do we force promoted EVEX encoding.
//
// Returns:
// `true` if user requests promoted EVEX encoding.
//
bool DoJitStressPromotedEvexEncoding() const
{
#ifdef DEBUG
if (JitConfig.JitStressPromotedEvexEncoding())
{
return true;
}
#endif // DEBUG

return false;
}
86 changes: 59 additions & 27 deletions src/coreclr/jit/emit.h
@@ -471,6 +471,7 @@ class emitter
SetUseVEXEncoding(false);
SetUseEvexEncoding(false);
SetUseRex2Encoding(false);
SetUsePromotedEVEXEncoding(false);
#endif // TARGET_XARCH

emitDataSecCur = nullptr;
@@ -793,7 +794,16 @@ class emitter
// For normal and embedded broadcast intrinsics, EVEX.L'L has the same semantic, vector length.
// For embedded rounding, EVEX.L'L semantic changes to indicate the rounding mode.
// Multiple bits in _idEvexbContext are used to inform emitter to specially handle the EVEX.L'L bits.
unsigned _idEvexbContext : 2;
unsigned _idCustom5 : 1;
unsigned _idCustom6 : 1;

#define _idEvexbContext \
(_idCustom6 << 1) | _idCustom5 /* Evex.b: embedded broadcast, embedded rounding, embedded SAE \
*/
#define _idEvexNdContext _idCustom5 /* bits used for the APX-EVEX.nd context for promoted legacy instructions */
#define _idEvexNfContext _idCustom6 /* bits used for the APX-EVEX.nf context for promoted legacy/vex instructions */
#define _idEvexDFV (_idCustom4 << 3) | (_idCustom3 << 2) | (_idCustom2 << 1) | _idCustom1

#endif // TARGET_XARCH

#ifdef TARGET_ARM64
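Note on the `_idEvexbContext` and `_idEvexDFV` macros above: their expansions are not wrapped in an outer pair of parentheses. The usages visible in this diff (a bare `return _idEvexDFV;`) look unaffected, but an expression such as `_idEvexDFV == x` would bind `==` to the last operand only, because `==` has higher precedence than `|`. The sketch below illustrates the hazard with a hypothetical `Desc` struct and hypothetical macro names; it is not code from the PR.

```cpp
#include <cassert>

// Hypothetical stand-in for the four 1-bit custom fields.
struct Desc
{
    unsigned c1 : 1;
    unsigned c2 : 1;
    unsigned c3 : 1;
    unsigned c4 : 1;
};

// Unparenthesized, in the style of the macros in the diff.
#define DFV_RAW(d) ((d).c4 << 3) | ((d).c3 << 2) | ((d).c2 << 1) | (d).c1
// Same expression with an outer pair of parentheses.
#define DFV_SAFE(d) (((d).c4 << 3) | ((d).c3 << 2) | ((d).c2 << 1) | (d).c1)

// Expands to `... | ((d).c1 == expected)`, so any set high bit makes it true.
bool RawEquals(const Desc& d, unsigned expected)
{
    return DFV_RAW(d) == expected;
}

// Compares the whole packed value, as a reader would expect.
bool SafeEquals(const Desc& d, unsigned expected)
{
    return DFV_SAFE(d) == expected;
}
```

With `c1 = 1` and `c4 = 1` the packed value is 9, yet `RawEquals(d, 8)` still evaluates to true, while `SafeEquals(d, 8)` is false. Wrapping the macro bodies in parentheses (or using inline accessor functions only) would remove the trap.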
@@ -1009,6 +1019,7 @@ class emitter
regNumber _idReg3 : REGNUM_BITS;
regNumber _idReg4 : REGNUM_BITS;
};

#elif defined(TARGET_LOONGARCH64)
struct
{
@@ -1657,38 +1668,17 @@ class emitter
#ifdef TARGET_XARCH
bool idIsEvexbContextSet() const
{
return _idEvexbContext != 0;
return idGetEvexbContext() != 0;
}

void idSetEvexbContext(insOpts instOptions)
{
assert(!idIsEvexbContextSet());
assert(idGetEvexbContext() == 0);
unsigned value = static_cast<unsigned>(instOptions & INS_OPTS_EVEX_b_MASK);

switch (instOptions & INS_OPTS_EVEX_b_MASK)
{
case INS_OPTS_EVEX_eb_er_rd:
{
_idEvexbContext = 1;
break;
}

case INS_OPTS_EVEX_er_ru:
{
_idEvexbContext = 2;
break;
}

case INS_OPTS_EVEX_er_rz:
{
_idEvexbContext = 3;
break;
}

default:
{
unreached();
}
}
_idCustom5 = ((value >> 0) & 1);
_idCustom6 = ((value >> 1) & 1);
}

unsigned idGetEvexbContext() const
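Note on the reworked `idSetEvexbContext` above: the 2-bit Evex.b context is now split across `_idCustom5` and `_idCustom6`, and the same two fields are also aliased as `_idEvexNdContext` and `_idEvexNfContext`. A minimal sketch of the split and of what the aliasing implies follows. `EvexBits` and its member names are hypothetical, and the reading that a given instruction uses only one interpretation at a time is an assumption, not something the diff states.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of instrDesc.
struct EvexBits
{
    unsigned custom5 : 1; // low bit of Evex.b context; also the ND bit
    unsigned custom6 : 1; // high bit of Evex.b context; also the NF bit

    // Reassembles the 2-bit Evex.b context from the two fields.
    unsigned GetEvexb() const
    {
        return (custom6 << 1) | custom5;
    }

    // Stores a 2-bit context value, one bit per field.
    void SetEvexb(unsigned value)
    {
        assert(value <= 3);
        assert(GetEvexb() == 0); // may only be set once, as in the diff
        custom5 = (value >> 0) & 1;
        custom6 = (value >> 1) & 1;
    }

    // The ND/NF views read the very same storage.
    bool IsNdView() const { return custom5 != 0; }
    bool IsNfView() const { return custom6 != 0; }
};
```

Setting an Evex.b context of 1 makes the ND view read as set, which is why the shared bits only work if each instruction kind consults exactly one of the three views.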
@@ -1728,6 +1718,43 @@ class emitter
assert(!idIsEvexZContextSet());
_idEvexZContext = 1;
}

bool idIsEvexNdContextSet() const
{
return _idEvexNdContext != 0;
}

void idSetEvexNdContext()
{
assert(!idIsEvexNdContextSet());
_idEvexNdContext = 1;
}

bool idIsEvexNfContextSet() const
{
return _idEvexNfContext != 0;
}

void idSetEvexNfContext()
{
assert(!idIsEvexNfContextSet());
_idEvexNfContext = 1;
}

unsigned idGetEvexDFV() const
{
return _idEvexDFV;
}

void idSetEvexDFV(insOpts instOptions)
{
unsigned value = static_cast<unsigned>((instOptions & INS_OPTS_EVEX_dfv_MASK) >> 8);

_idCustom1 = ((value >> 0) & 1);
_idCustom2 = ((value >> 1) & 1);
_idCustom3 = ((value >> 2) & 1);
_idCustom4 = ((value >> 3) & 1);
}
#endif

#ifdef TARGET_ARMARCH
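Note on `idSetEvexDFV` / `idGetEvexDFV` above: the setter masks the instruction options, shifts right by 8, and scatters the resulting 4-bit value over four 1-bit fields; the getter reassembles it. A round-trip sketch follows. The mask value `0xF00` is an assumption chosen to be consistent with the `>> 8` in the diff (the real `INS_OPTS_EVEX_dfv_MASK` is not shown), and `DfvBits`, `SetDfv`, and `GetDfv` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for illustration only.
constexpr uint32_t kDfvShift = 8;                 // matches the `>> 8` in the diff
constexpr uint32_t kDfvMask  = 0xFu << kDfvShift; // hypothetical mask value

// Hypothetical stand-in for _idCustom1 .. _idCustom4.
struct DfvBits
{
    unsigned c1 : 1;
    unsigned c2 : 1;
    unsigned c3 : 1;
    unsigned c4 : 1;
};

// Extracts the 4-bit default-flags value and stores one bit per field.
void SetDfv(DfvBits& d, uint32_t instOptions)
{
    const uint32_t value = (instOptions & kDfvMask) >> kDfvShift;
    d.c1 = (value >> 0) & 1;
    d.c2 = (value >> 1) & 1;
    d.c3 = (value >> 2) & 1;
    d.c4 = (value >> 3) & 1;
}

// Reassembles the 4-bit value from the four fields.
uint32_t GetDfv(const DfvBits& d)
{
    return (d.c4 << 3) | (d.c3 << 2) | (d.c2 << 1) | d.c1;
}
```

Option bits outside the mask are ignored, so other flags carried in the same options word do not leak into the stored value. Unlike `idSetEvexbContext`, the setter in the diff does not assert that the value was previously unset.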
@@ -2531,7 +2558,12 @@ class emitter
CORINFO_FIELD_HANDLE emitSimdMaskConst(simdmask_t constValue);
#endif // FEATURE_MASKED_HW_INTRINSICS
#endif // FEATURE_SIMD

#if defined(TARGET_XARCH)
regNumber emitInsBinary(instruction ins, emitAttr attr, GenTree* dst, GenTree* src, regNumber targetReg = REG_NA);
#else
regNumber emitInsBinary(instruction ins, emitAttr attr, GenTree* dst, GenTree* src);
#endif
regNumber emitInsTernary(instruction ins, emitAttr attr, GenTree* dst, GenTree* src1, GenTree* src2);
void emitInsLoadInd(instruction ins, emitAttr attr, regNumber dstReg, GenTreeIndir* mem);
void emitInsStoreInd(instruction ins, emitAttr attr, GenTreeStoreInd* mem);
1 change: 1 addition & 0 deletions src/coreclr/jit/emitfmtsxarch.h
@@ -140,6 +140,7 @@ IF_DEF(RRW_RRW, IS_R1_RW|IS_R2_RW, NONE) // r/w
IF_DEF(RRD_RRD_CNS, IS_R1_RD|IS_R2_RD, SCNS) // read reg1, read reg2, const
IF_DEF(RWR_RRD_CNS, IS_R1_WR|IS_R2_RD, SCNS) // write reg1, read reg2, const
IF_DEF(RRW_RRD_CNS, IS_R1_RW|IS_R2_RD, SCNS) // r/w reg1, read reg2, const
IF_DEF(RWR_RRD_SHF, IS_R1_WR|IS_R2_RD, SCNS) // write reg1, read reg2, shift

IF_DEF(RRD_RRD_RRD, IS_R1_RD|IS_R2_RD|IS_R3_RD, NONE) // read reg1, read reg2, read reg3
IF_DEF(RWR_RRD_RRD, IS_R1_WR|IS_R2_RD|IS_R3_RD, NONE) // write reg1, read reg2, read reg3
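Note on the `IF_DEF` entries above: lists like this follow the X-macro pattern, where one list of entries is expanded several times with different macro definitions so that an enum and its side tables cannot drift apart. The sketch below shows the general technique with a hypothetical three-entry list and a list macro; the real file is a header with its own `IF_DEF` arguments, so this is an illustration of the pattern, not of the JIT's actual expansion.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical format list; each entry carries a name and a description.
#define FORMAT_LIST(X)                               \
    X(RRD_RRD_CNS, "read reg1, read reg2, const")    \
    X(RWR_RRD_CNS, "write reg1, read reg2, const")   \
    X(RWR_RRD_SHF, "write reg1, read reg2, shift")

// First expansion: one enumerator per entry.
enum Format
{
#define AS_ENUM(name, desc) IF_##name,
    FORMAT_LIST(AS_ENUM)
#undef AS_ENUM
    IF_COUNT
};

// Second expansion: a parallel table of printable names.
static const char* const kFormatNames[] = {
#define AS_NAME(name, desc) #name,
    FORMAT_LIST(AS_NAME)
#undef AS_NAME
};

static_assert(sizeof(kFormatNames) / sizeof(kFormatNames[0]) == IF_COUNT,
              "name table must have one entry per format");

// Looks up the printable name of a format.
const char* FormatName(Format f)
{
    assert(f < IF_COUNT);
    return kFormatNames[f];
}
```

Adding a new entry such as `RWR_RRD_SHF` to the list extends the enum and the name table in one place, which is the property that makes a one-line addition like the one in this hunk sufficient.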