Commit 16c48d6

Align inner loops (#44370)
* Detect inner loop and add 10 bytes of padding at the beginning
* generate nop in previous blocks
* TODO: figure out if anything needs to be done in optCanonicalizeLoop
* Add COMPlus_JitAlignLoopMinBlockWeight and COMPlus_JitAlignLoopMaxCodeSize
  - Add 2 variables to control which loops get aligned
  - Moved padding after the conditional/unconditional jump of previous block
* Reuse AlignLoops flag for dynamic loop alignment
* Detect back edge and count no. of instructions before doing loop alignment
* fix bugs
* propagate the basic block flag
* Switch from instrCount to codeSize
* JitAlignLoopWith32BPadding
* Add emitLoopAlign32Bytes()
* wip
* Add logic to avoid emitting nop if not needed
* fix a condition
* Several things:
  - Replaced JitAlignLoopWith32BPadding with JitAlignLoopBoundary
  - Added JitAlignLoopForJcc
  - Added logging of boundary and point where instruction splitting happens
  - Add logic to take into consideration JCC.
* Added JitAlignLoopAdaptive algorithm
* wip
* revert emitarm64.cpp changes
* fix errors during merge
* fix build errors
* refactoring and cleanup
* refactoring and build errors fix
* jit format
* one more build error
* Add emitLoopAlignAdjustments()
* Update emitLoopAlignAdjustments to just include loopSize calc
* Remove #ifdef ADAPTIVE_LOOP_ALIGNMENT
* Code cleanup
* minor fixes
* Fix issues:
  - Make sure all `align` instructions for non-adaptive fall under same IG
  - Convert some variables to `unsigned short`
  - Fixed the maxPadding amount for adaptive alignment calculation
* Other fixes
* Remove align_loops flag from coreclr
* Review feedback
  - Do not align loop if it has call
  - Created `emitSetLoopBackEdge()` to isolate `emitCurIG` inside emitter class
  - Created `emitOutputAlign()` to move the align instruction output logic
  - Renamed emitVariableeLoopAlign() to emitLongLoopAlign()
  - Created `optIdentifyLoopsForAlignment()` to identify loops that need alignment
  - Added comments at various places
* jit format
* Add FEATURE_LOOP_ALIGN
* remove special case for align
* Do not propagate BBF_LOOP_ALIGN in certain cases
* Introduce instrDescAlign and emitLastAlignedIgNum
* Several changes:
  - Perform accurate padding size before outputting align instruction
  - During outputting, just double check if the padding needed matches to what was calculated.
  - If at any time, instruction sizes are over-estimated before the last align instruction, then compensate them by adding NOP.
  - As part of above step, do not perform encoding "VEX prefix shortening" if there is align instruction in future.
  - Fix edge cases where because of loop cloning or resolution phase of register allocator, the loops are marked such that they cover the loops that are already marked for alignment. Fix by resetting their IGF_LOOP_ALIGN flag.
  - During loop size calculation, if the last IG also has `align` flag, then do not take into account the align instruction's size because they are reserved for the next loop.
* jit format
* fix issue related to needLabel
* align memory correctly in superpmi
* Few more fixes:
  - emitOffsAdj takes into account for any mis-prediction of jump. If we compensate that mis-prediction, that off that adjustment.
  - Record the lastAlignIG only for valid non-zero align instructions
* minor JITDUMP messages
* Review comments
* missing check
* Mark the last align IG the one that has non-zero padding
* More review comments
* Propagate BBF_LOOP_ALIGN for compacting blocks
* Handle ALIGN_LOOP flag for loops that are unrolled
* jit format
* Loop size up to last back-edge instead of first back-edge
* Take loop weight in consideration
* remove align flag if loop is no longer valid
* Adjust loop block weight to 4 instead of 8
* missing space after rebase
* fix the enum values after rebase
* review feedback
* Add missing #ifdef DEBUG
1 parent f9cf601 commit 16c48d6

24 files changed: +1157 -135 lines changed

src/coreclr/ToolBox/superpmi/superpmi/icorjitinfo.cpp

Lines changed: 15 additions & 1 deletion
@@ -1609,7 +1609,21 @@ void MyICJI::allocMem(ULONG hotCodeSize, /* IN */
     jitInstance->mc->cr->AddCall("allocMem");

     // TODO-Cleanup: Could hot block size be ever 0?
-    *hotCodeBlock = jitInstance->mc->cr->allocateMemory(hotCodeSize);
+    size_t codeAlignment = sizeof(void*);
+    size_t hotCodeAlignedSize = static_cast<size_t>(hotCodeSize);
+
+    if ((flag & CORJIT_ALLOCMEM_FLG_32BYTE_ALIGN) != 0)
+    {
+        codeAlignment = 32;
+    }
+    else if ((flag & CORJIT_ALLOCMEM_FLG_16BYTE_ALIGN) != 0)
+    {
+        codeAlignment = 16;
+    }
+    hotCodeAlignedSize = ALIGN_UP_SPMI(hotCodeAlignedSize, codeAlignment);
+    hotCodeAlignedSize = hotCodeAlignedSize + (codeAlignment - sizeof(void*));
+    *hotCodeBlock = jitInstance->mc->cr->allocateMemory(hotCodeAlignedSize);
+    *hotCodeBlock = ALIGN_UP_SPMI(*hotCodeBlock, codeAlignment);

     if (coldCodeSize > 0)
         *coldCodeBlock = jitInstance->mc->cr->allocateMemory(coldCodeSize);
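
The hunk above over-allocates the hot code buffer and then rounds the returned pointer up to the requested alignment. A minimal sketch of that pattern follows, under stated assumptions: `AlignUp` and `AllocateAligned` are hypothetical names (the diff uses `ALIGN_UP_SPMI` and `allocateMemory`, whose definitions are not shown here), and `malloc` stands in for the allocator on the assumption that it returns memory aligned to at least `sizeof(void*)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for ALIGN_UP_SPMI: round `value` up to a multiple of
// `alignment`, which must be a power of two.
static std::uintptr_t AlignUp(std::uintptr_t value, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return (value + mask) & ~mask;
}

// Hypothetical helper mirroring the hunk: over-allocate, then return the first
// aligned address inside the buffer. `*rawOut` receives the pointer to free().
// Assumes malloc returns memory aligned to at least sizeof(void*).
static void* AllocateAligned(std::size_t size, std::size_t alignment, void** rawOut)
{
    assert(alignment >= sizeof(void*));
    // Round the size up, then add the worst-case distance from a
    // pointer-aligned address to the next `alignment` boundary.
    const std::size_t padded = AlignUp(size, alignment) + (alignment - sizeof(void*));
    *rawOut = std::malloc(padded);
    if (*rawOut == nullptr)
    {
        return nullptr;
    }
    return reinterpret_cast<void*>(AlignUp(reinterpret_cast<std::uintptr_t>(*rawOut), alignment));
}
```

The extra `alignment - sizeof(void*)` bytes are what keep the aligned start plus the rounded size inside the allocation; without them, rounding the pointer up could push the end of the code past the end of the buffer.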

src/coreclr/inc/clrconfigvalues.h

Lines changed: 0 additions & 1 deletion
@@ -302,7 +302,6 @@ RETAIL_CONFIG_DWORD_INFO_EX(EXTERNAL_UseIBCFile, W("UseIBCFile"), 0, "", CLRConf
 ///
 /// JIT
 ///
-RETAIL_CONFIG_DWORD_INFO_DIRECT_ACCESS(UNSUPPORTED_JitAlignLoops, W("JitAlignLoops"), "Aligns loop targets to 8 byte boundaries")
 CONFIG_DWORD_INFO_EX(INTERNAL_JitBreakEmit, W("JitBreakEmit"), (DWORD)-1, "", CLRConfig::EEConfig_default)
 CONFIG_DWORD_INFO_DIRECT_ACCESS(INTERNAL_JitDebuggable, W("JitDebuggable"), "")
 #if !defined(DEBUG) && !defined(_DEBUG)

src/coreclr/inc/corjitflags.h

Lines changed: 25 additions & 25 deletions
@@ -79,45 +79,45 @@ class CORJIT_FLAGS
         CORJIT_FLAG_BBINSTR = 29, // Collect basic block profile information
         CORJIT_FLAG_BBOPT = 30, // Optimize method based on profile information
         CORJIT_FLAG_FRAMED = 31, // All methods have an EBP frame
-        CORJIT_FLAG_ALIGN_LOOPS = 32, // add NOPs before loops to align them at 16 byte boundaries
+        CORJIT_FLAG_UNUSED12 = 32,
         CORJIT_FLAG_PUBLISH_SECRET_PARAM = 33, // JIT must place stub secret param into local 0. (used by IL stubs)
-        CORJIT_FLAG_UNUSED12 = 34,
+        CORJIT_FLAG_UNUSED13 = 34,
         CORJIT_FLAG_SAMPLING_JIT_BACKGROUND = 35, // JIT is being invoked as a result of stack sampling for hot methods in the background
         CORJIT_FLAG_USE_PINVOKE_HELPERS = 36, // The JIT should use the PINVOKE_{BEGIN,END} helpers instead of emitting inline transitions
         CORJIT_FLAG_REVERSE_PINVOKE = 37, // The JIT should insert REVERSE_PINVOKE_{ENTER,EXIT} helpers into method prolog/epilog
-        CORJIT_FLAG_UNUSED13 = 38,
+        CORJIT_FLAG_UNUSED14 = 38,
         CORJIT_FLAG_TIER0 = 39, // This is the initial tier for tiered compilation which should generate code as quickly as possible
         CORJIT_FLAG_TIER1 = 40, // This is the final tier (for now) for tiered compilation which should generate high quality code

 #if defined(TARGET_ARM)
         CORJIT_FLAG_RELATIVE_CODE_RELOCS = 41, // JIT should generate PC-relative address computations instead of EE relocation records
 #else // !defined(TARGET_ARM)
-        CORJIT_FLAG_UNUSED14 = 41,
+        CORJIT_FLAG_UNUSED15 = 41,
 #endif // !defined(TARGET_ARM)

         CORJIT_FLAG_NO_INLINING = 42, // JIT should not inline any called method into this method

-        CORJIT_FLAG_UNUSED15 = 43,
-        CORJIT_FLAG_UNUSED16 = 44,
-        CORJIT_FLAG_UNUSED17 = 45,
-        CORJIT_FLAG_UNUSED18 = 46,
-        CORJIT_FLAG_UNUSED19 = 47,
-        CORJIT_FLAG_UNUSED20 = 48,
-        CORJIT_FLAG_UNUSED21 = 49,
-        CORJIT_FLAG_UNUSED22 = 50,
-        CORJIT_FLAG_UNUSED23 = 51,
-        CORJIT_FLAG_UNUSED24 = 52,
-        CORJIT_FLAG_UNUSED25 = 53,
-        CORJIT_FLAG_UNUSED26 = 54,
-        CORJIT_FLAG_UNUSED27 = 55,
-        CORJIT_FLAG_UNUSED28 = 56,
-        CORJIT_FLAG_UNUSED29 = 57,
-        CORJIT_FLAG_UNUSED30 = 58,
-        CORJIT_FLAG_UNUSED31 = 59,
-        CORJIT_FLAG_UNUSED32 = 60,
-        CORJIT_FLAG_UNUSED33 = 61,
-        CORJIT_FLAG_UNUSED34 = 62,
-        CORJIT_FLAG_UNUSED35 = 63
+        CORJIT_FLAG_UNUSED16 = 43,
+        CORJIT_FLAG_UNUSED17 = 44,
+        CORJIT_FLAG_UNUSED18 = 45,
+        CORJIT_FLAG_UNUSED19 = 46,
+        CORJIT_FLAG_UNUSED20 = 47,
+        CORJIT_FLAG_UNUSED21 = 48,
+        CORJIT_FLAG_UNUSED22 = 49,
+        CORJIT_FLAG_UNUSED23 = 50,
+        CORJIT_FLAG_UNUSED24 = 51,
+        CORJIT_FLAG_UNUSED25 = 52,
+        CORJIT_FLAG_UNUSED26 = 53,
+        CORJIT_FLAG_UNUSED27 = 54,
+        CORJIT_FLAG_UNUSED28 = 55,
+        CORJIT_FLAG_UNUSED29 = 56,
+        CORJIT_FLAG_UNUSED30 = 57,
+        CORJIT_FLAG_UNUSED31 = 58,
+        CORJIT_FLAG_UNUSED32 = 59,
+        CORJIT_FLAG_UNUSED33 = 60,
+        CORJIT_FLAG_UNUSED34 = 61,
+        CORJIT_FLAG_UNUSED35 = 62,
+        CORJIT_FLAG_UNUSED36 = 63
     };

     CORJIT_FLAGS()

src/coreclr/inc/jiteeversionguid.h

Lines changed: 5 additions & 5 deletions
@@ -31,11 +31,11 @@
 //
 //////////////////////////////////////////////////////////////////////////////////////////////////////////

-constexpr GUID JITEEVersionIdentifier = { /* 8e32c24d-62fe-4d78-ae73-eedddb928ee2 */
-    0x8e32c24d,
-    0x62fe,
-    0x4d78,
-    {0xae, 0x73, 0xee, 0xdd, 0xdb, 0x92, 0x8e, 0xe2}
+constexpr GUID JITEEVersionIdentifier = { /* de81f48e-7701-45f2-a91b-1914f88dfd11 */
+    0xde81f48e,
+    0x7701,
+    0x45f2,
+    {0xa9, 0x1b, 0x19, 0x14, 0xf8, 0x8d, 0xfd, 0x11}
 };

 //////////////////////////////////////////////////////////////////////////////////////////////////////////

src/coreclr/jit/block.cpp

Lines changed: 4 additions & 0 deletions
@@ -505,6 +505,10 @@ void BasicBlock::dspFlags()
     {
         printf("cfe ");
     }
+    if (bbFlags & BBF_LOOP_ALIGN)
+    {
+        printf("align ");
+    }
 }

 /*****************************************************************************

src/coreclr/jit/block.h

Lines changed: 5 additions & 0 deletions
@@ -448,6 +448,7 @@ struct BasicBlock : private LIR::Range

 #define BBF_PATCHPOINT MAKE_BBFLAG(36) // Block is a patchpoint
 #define BBF_HAS_CLASS_PROFILE MAKE_BBFLAG(37) // BB contains a call needing a class profile
+#define BBF_LOOP_ALIGN MAKE_BBFLAG(39) // Block is lexically the first block in a loop we intend to align.

 // clang-format on

@@ -463,6 +464,10 @@ struct BasicBlock : private LIR::Range
     {
         return ((bbFlags & BBF_LOOP_HEAD) != 0);
     }
+    bool isLoopAlign() const
+    {
+        return ((bbFlags & BBF_LOOP_ALIGN) != 0);
+    }

     // Flags to update when two blocks are compacted

src/coreclr/jit/codegencommon.cpp

Lines changed: 11 additions & 3 deletions
@@ -2258,6 +2258,12 @@ void CodeGen::genGenerateMachineCode()

     GetEmitter()->emitJumpDistBind();

+#if FEATURE_LOOP_ALIGN
+    /* Perform alignment adjustments */
+
+    GetEmitter()->emitLoopAlignAdjustments();
+#endif
+
     /* The code is now complete and final; it should not change after this. */
 }

@@ -2345,10 +2351,12 @@ void CodeGen::genEmitMachineCode()
 #ifdef DEBUG
     if (compiler->opts.disAsm || verbose)
     {
-        printf("\n; Total bytes of code %d, prolog size %d, PerfScore %.2f, instruction count %d (MethodHash=%08x) for "
+        printf("\n; Total bytes of code %d, prolog size %d, PerfScore %.2f, instruction count %d, allocated bytes for "
+               "code %d (MethodHash=%08x) for "
                "method %s\n",
-               codeSize, prologSize, compiler->info.compPerfScore, instrCount, compiler->info.compMethodHash(),
-               compiler->info.compFullName);
+               codeSize, prologSize, compiler->info.compPerfScore, instrCount,
+               GetEmitter()->emitTotalHotCodeSize + GetEmitter()->emitTotalColdCodeSize,
+               compiler->info.compMethodHash(), compiler->info.compFullName);
         printf("; ============================================================\n\n");
         printf(""); // in our logic this causes a flush
     }

src/coreclr/jit/codegenlinear.cpp

Lines changed: 49 additions & 12 deletions
@@ -311,13 +311,6 @@ void CodeGen::genCodeForBBlist()

         genUpdateCurrentFunclet(block);

-#ifdef TARGET_XARCH
-        if (ShouldAlignLoops() && block->bbFlags & BBF_LOOP_HEAD)
-        {
-            GetEmitter()->emitLoopAlign();
-        }
-#endif
-
         genLogLabel(block);

         // Tell everyone which basic block we're working on
@@ -356,6 +349,14 @@ void CodeGen::genCodeForBBlist()
             needLabel = true;
         }

+#if FEATURE_LOOP_ALIGN
+        if (GetEmitter()->emitEndsWithAlignInstr())
+        {
+            // we had better be planning on starting a new IG
+            assert(needLabel);
+        }
+#endif
+
         if (needLabel)
         {
             // Mark a label and update the current set of live GC refs
@@ -667,10 +668,6 @@ void CodeGen::genCodeForBBlist()

         switch (block->bbJumpKind)
         {
-            case BBJ_ALWAYS:
-                inst_JMP(EJ_jmp, block->bbJumpDest);
-                break;
-
             case BBJ_RETURN:
                 genExitCode(block);
                 break;
@@ -741,15 +738,55 @@ void CodeGen::genCodeForBBlist()
 #endif // !FEATURE_EH_FUNCLETS

             case BBJ_NONE:
-            case BBJ_COND:
             case BBJ_SWITCH:
                 break;

+            case BBJ_ALWAYS:
+                inst_JMP(EJ_jmp, block->bbJumpDest);
+                FALLTHROUGH;
+
+            case BBJ_COND:
+
+#if FEATURE_LOOP_ALIGN
+                // This is the last place where we operate on blocks and after this, we operate
+                // on IG. Hence, if we know that the destination of "block" is the first block
+                // of a loop and needs alignment (it has BBF_LOOP_ALIGN), then "block" represents
+                // end of the loop. Propagate that information on the IG through "igLoopBackEdge".
+                //
+                // During emitter, this information will be used to calculate the loop size.
+                // Depending on the loop size, decision of whether to align a loop or not will be taken.
+
+                if (block->bbJumpDest->isLoopAlign())
+                {
+                    GetEmitter()->emitSetLoopBackEdge(block->bbJumpDest);
+                }
+#endif
+                break;
+
             default:
                 noway_assert(!"Unexpected bbJumpKind");
                 break;
         }

+#if FEATURE_LOOP_ALIGN
+
+        // If next block is the first block of a loop (identified by BBF_LOOP_ALIGN),
+        // then need to add align instruction in current "block". Also mark the
+        // corresponding IG with IGF_LOOP_ALIGN to know that there will be align
+        // instructions at the end of that IG.
+        //
+        // For non-adaptive alignment, add alignment instruction of size depending on the
+        // compJitAlignLoopBoundary.
+        // For adaptive alignment, alignment instruction will always be of 15 bytes.
+
+        if ((block->bbNext != nullptr) && (block->bbNext->isLoopAlign()))
+        {
+            assert(ShouldAlignLoops());
+
+            GetEmitter()->emitLoopAlignment();
+        }
+#endif
+
 #if defined(DEBUG) && defined(USING_VARIABLE_LIVE_RANGE)
         if (compiler->verbose)
         {
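
The comments in the last hunk say an `align` instruction is emitted at the end of the block that precedes a loop, sized according to the alignment boundary. As a rough illustration of the arithmetic involved, the sketch below computes how many padding bytes move a code offset to the next power-of-two boundary. `PaddingToBoundary` is a hypothetical helper, not a function from this commit; the emitter's actual calculation is not part of this excerpt.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the commit): number of padding bytes needed so
// that code at `offset` starts on the next `boundary`-byte boundary.
// `boundary` must be a power of two. Returns 0 if `offset` is already aligned.
static std::size_t PaddingToBoundary(std::size_t offset, std::size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (boundary - (offset & (boundary - 1))) & (boundary - 1);
}
```

With a 32-byte boundary the result is always in the range 0 to 31, so a loop head that falls one byte past a boundary is the worst case for padding.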

src/coreclr/jit/compiler.cpp

Lines changed: 30 additions & 9 deletions
@@ -2308,7 +2308,7 @@ void Compiler::compSetProcessor()
     opts.compUseCMOV = jitFlags.IsSet(JitFlags::JIT_FLAG_USE_CMOV);
 #ifdef DEBUG
     if (opts.compUseCMOV)
-        opts.compUseCMOV = !compStressCompile(STRESS_USE_CMOV, 50);
+        opts.compUseCMOV = !compStressCompile(STRESS_USE_CMOV, 50);
 #endif // DEBUG

 #endif // TARGET_X86
@@ -2615,6 +2615,29 @@ void Compiler::compInitOptions(JitFlags* jitFlags)
     opts.compDbgInfo = jitFlags->IsSet(JitFlags::JIT_FLAG_DEBUG_INFO);
     opts.compDbgEnC = jitFlags->IsSet(JitFlags::JIT_FLAG_DEBUG_EnC);

+#ifdef DEBUG
+    opts.compJitAlignLoopAdaptive = JitConfig.JitAlignLoopAdaptive() == 1;
+    opts.compJitAlignLoopBoundary = (unsigned short)JitConfig.JitAlignLoopBoundary();
+    opts.compJitAlignLoopMinBlockWeight = (unsigned short)JitConfig.JitAlignLoopMinBlockWeight();
+
+    opts.compJitAlignLoopForJcc = JitConfig.JitAlignLoopForJcc() == 1;
+    opts.compJitAlignLoopMaxCodeSize = (unsigned short)JitConfig.JitAlignLoopMaxCodeSize();
+#else
+    opts.compJitAlignLoopAdaptive = true;
+    opts.compJitAlignLoopBoundary = DEFAULT_ALIGN_LOOP_BOUNDARY;
+    opts.compJitAlignLoopMinBlockWeight = DEFAULT_ALIGN_LOOP_MIN_BLOCK_WEIGHT;
+#endif
+    if (opts.compJitAlignLoopAdaptive)
+    {
+        opts.compJitAlignPaddingLimit = (opts.compJitAlignLoopBoundary >> 1) - 1;
+    }
+    else
+    {
+        opts.compJitAlignPaddingLimit = opts.compJitAlignLoopBoundary - 1;
+    }
+
+    assert(isPow2(opts.compJitAlignLoopBoundary));
+
 #if REGEN_SHORTCUTS || REGEN_CALLPAT
     // We never want to have debugging enabled when regenerating GC encoding patterns
     opts.compDbgCode = false;
@@ -3913,19 +3936,17 @@ void Compiler::compSetOptimizationLevel()
     codeGen->setFrameRequired(true);
 #endif

-    if (opts.jitFlags->IsSet(JitFlags::JIT_FLAG_RELOC))
+    if (opts.jitFlags->IsSet(JitFlags::JIT_FLAG_PREJIT))
     {
-        codeGen->SetAlignLoops(false); // loop alignment not supported for prejitted code
-
-        // The zapper doesn't set JitFlags::JIT_FLAG_ALIGN_LOOPS, and there is
-        // no reason for it to set it as the JIT doesn't currently support loop alignment
-        // for prejitted images. (The JIT doesn't know the final address of the code, hence
+        // The JIT doesn't currently support loop alignment for prejitted images.
+        // (The JIT doesn't know the final address of the code, hence
         // it can't align code based on unknown addresses.)
-        assert(!opts.jitFlags->IsSet(JitFlags::JIT_FLAG_ALIGN_LOOPS));
+
+        codeGen->SetAlignLoops(false); // loop alignment not supported for prejitted code
     }
     else
     {
-        codeGen->SetAlignLoops(opts.jitFlags->IsSet(JitFlags::JIT_FLAG_ALIGN_LOOPS));
+        codeGen->SetAlignLoops(JitConfig.JitAlignLoops() == 1);
     }
 }
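
The `compInitOptions` hunk in this file derives the padding limit from the alignment boundary: half the boundary minus one for adaptive alignment, the boundary minus one otherwise. The sketch below restates that rule in isolation. `AlignPaddingLimit` and `IsPow2` are hypothetical names used only for illustration; the commit stores the result in `opts.compJitAlignPaddingLimit` and uses the JIT's own `isPow2`.

```cpp
#include <cassert>

// Hypothetical stand-in for the JIT's isPow2 check.
static bool IsPow2(unsigned value)
{
    return (value != 0) && ((value & (value - 1)) == 0);
}

// Hypothetical helper restating the rule from the hunk:
//   adaptive     -> (boundary / 2) - 1
//   non-adaptive -> boundary - 1
static unsigned short AlignPaddingLimit(unsigned short boundary, bool adaptive)
{
    assert(IsPow2(boundary));
    const int limit = adaptive ? (boundary >> 1) - 1 : boundary - 1;
    return static_cast<unsigned short>(limit);
}
```

With the default 32-byte boundary (`0x20`) this gives 15 bytes for adaptive alignment, which agrees with the "always be of 15 bytes" comment in codegenlinear.cpp, and 31 bytes for non-adaptive alignment.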

src/coreclr/jit/compiler.h

Lines changed: 39 additions & 0 deletions
@@ -6367,6 +6367,8 @@ class Compiler

     void optFindNaturalLoops();

+    void optIdentifyLoopsForAlignment();
+
     // Ensures that all the loops in the loop nest rooted at "loopInd" (an index into the loop table) are 'canonical' --
     // each loop has a unique "top." Returns "true" iff the flowgraph has been modified.
     bool optCanonicalizeLoopNest(unsigned char loopInd);
@@ -9036,6 +9038,43 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
         bool dspGCtbls; // Display the GC tables
 #endif

+// Default numbers used to perform loop alignment. All the numbers are chosen
+// based on experimenting with various benchmarks.
+
+// Default minimum loop block weight required to enable loop alignment.
+#define DEFAULT_ALIGN_LOOP_MIN_BLOCK_WEIGHT 4
+
+// By default a loop will be aligned at 32B address boundary to get better
+// performance as per architecture manuals.
+#define DEFAULT_ALIGN_LOOP_BOUNDARY 0x20
+
+// For non-adaptive loop alignment, by default, only align a loop whose size is
+// at most 3 times the alignment block size. If the loop is bigger than that, it is most
+// likely complicated enough that loop alignment will not impact performance.
+#define DEFAULT_MAX_LOOPSIZE_FOR_ALIGN DEFAULT_ALIGN_LOOP_BOUNDARY * 3
+
+#ifdef DEBUG
+        // Loop alignment variables
+
+        // If set, for non-adaptive alignment, ensure loop jmps are not on or cross alignment boundary.
+        bool compJitAlignLoopForJcc;
+#endif
+        // For non-adaptive alignment, maximum loop size (in bytes) for which alignment will be done.
+        unsigned short compJitAlignLoopMaxCodeSize;
+
+        // Minimum weight needed for the first block of a loop to make it a candidate for alignment.
+        unsigned short compJitAlignLoopMinBlockWeight;
+
+        // For non-adaptive alignment, address boundary (power of 2) at which loop alignment should
+        // be done. By default, 32B.
+        unsigned short compJitAlignLoopBoundary;
+
+        // Padding limit to align a loop.
+        unsigned short compJitAlignPaddingLimit;
+
+        // If set, perform adaptive loop alignment that limits number of padding based on loop size.
+        bool compJitAlignLoopAdaptive;
+
 #ifdef LATE_DISASM
         bool doLateDisasm; // Run the late disassembler
 #endif // LATE_DISASM
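
The defaults above, together with the "Do not align loop if it has call" item in the commit message, suggest a simple candidate filter for non-adaptive alignment. The sketch below is an assumption-heavy illustration only: `LoopInfo` and `IsNonAdaptiveAlignCandidate` are hypothetical names, and in the actual commit these checks are spread across the optimizer and the emitter in code not shown in this excerpt.

```cpp
#include <cassert>

// Values taken from the defaults in the hunk above.
constexpr unsigned kMinBlockWeight = 4;                  // DEFAULT_ALIGN_LOOP_MIN_BLOCK_WEIGHT
constexpr unsigned kAlignBoundary  = 0x20;               // DEFAULT_ALIGN_LOOP_BOUNDARY
constexpr unsigned kMaxLoopSize    = kAlignBoundary * 3; // DEFAULT_MAX_LOOPSIZE_FOR_ALIGN

// Hypothetical summary of a loop; not a type from the JIT.
struct LoopInfo
{
    unsigned firstBlockWeight; // weight of the loop's first block
    unsigned codeSizeBytes;    // estimated loop size in bytes
    bool     hasCall;          // whether the loop body contains a call
};

// Hypothetical filter combining the three conditions described in the source.
static bool IsNonAdaptiveAlignCandidate(const LoopInfo& loop)
{
    return !loop.hasCall && (loop.firstBlockWeight >= kMinBlockWeight) && (loop.codeSizeBytes <= kMaxLoopSize);
}
```

Under these defaults a loop of up to 96 bytes whose first block has weight 4 or more, and which makes no calls, would qualify.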
