Commit bfc8814

[cDAC] Implement core stackwalking (#111759)
* Adds IStackWalk contract
* Implements core of cDAC stackwalking mechanics
* Support for targeting amd64 and arm64 architectures
* Support Windows amd64 host
1 parent 3e38bd8 commit bfc8814

55 files changed

+2375
-114
lines changed

docs/design/datacontracts/ExecutionManager.md

Lines changed: 16 additions & 1 deletion
@@ -23,6 +23,10 @@ struct CodeBlockHandle
2323
TargetPointer GetMethodDesc(CodeBlockHandle codeInfoHandle);
2424
// Get the instruction pointer address of the start of the code block
2525
TargetCodePointer GetStartAddress(CodeBlockHandle codeInfoHandle);
26+
// Gets the unwind info of the code block at the specified code pointer
27+
TargetPointer GetUnwindInfo(CodeBlockHandle codeInfoHandle, TargetCodePointer ip);
28+
// Gets the base address the UnwindInfo of codeInfoHandle is relative to.
29+
TargetPointer GetUnwindInfoBaseAddress(CodeBlockHandle codeInfoHandle);
2630
```
2731

2832
## Version 1
@@ -53,6 +57,8 @@ Data descriptors used:
5357
| `CodeHeapListNode` | `MapBase` | Start of the map - start address rounded down based on OS page size |
5458
| `CodeHeapListNode` | `HeaderMap` | Bit array used to find the start of methods - relative to `MapBase` |
5559
| `RealCodeHeader` | `MethodDesc` | Pointer to the corresponding `MethodDesc` |
60+
| `RealCodeHeader` | `NumUnwindInfos` | Number of Unwind Infos |
61+
| `RealCodeHeader` | `UnwindInfos` | Start address of Unwind Infos |
5662
| `Module` | `ReadyToRunInfo` | Pointer to the `ReadyToRunInfo` for the module |
5763
| `ReadyToRunInfo` | `CompositeInfo` | Pointer to composite R2R info - or itself for non-composite |
5864
| `ReadyToRunInfo` | `NumRuntimeFunctions` | Number of `RuntimeFunctions` |
@@ -214,7 +220,7 @@ class CodeBlock
214220
}
215221
```
216222

217-
The remaining contract APIs extract fields of the `CodeBlock`:
223+
The `GetMethodDesc` and `GetStartAddress` APIs extract fields of the `CodeBlock`:
218224

219225
```csharp
220226
TargetPointer IExecutionManager.GetMethodDesc(CodeBlockHandle codeInfoHandle)
@@ -230,6 +236,15 @@ The remaining contract APIs extract fields of the `CodeBlock`:
230236
}
231237
```
232238

239+
`GetUnwindInfo` gets the Windows style unwind data in the form of `RUNTIME_FUNCTION` which has a platform dependent implementation. The ExecutionManager delegates to the JitManager implementations as the unwind infos (`RUNTIME_FUNCTION`) are stored differently on jitted and R2R code.
240+
241+
* For jitted code (`EEJitManager`), a sorted list of `RUNTIME_FUNCTION` entries is stored on the `RealCodeHeader`, which is accessed in the same way as `GetMethodInfo` described above. The correct `RUNTIME_FUNCTION` is found by binary searching the list based on IP.
242+
243+
* For R2R code (`ReadyToRunJitManager`), a sorted list of `RUNTIME_FUNCTION` entries is stored on the module's `ReadyToRunInfo`. This is accessed as described above for `GetMethodInfo`. Again, the relevant `RUNTIME_FUNCTION` is found by binary searching the list based on IP.
244+
245+
Unwind info (`RUNTIME_FUNCTION`) uses relative addressing. For managed code, these values are relative to the start of the code's containing range in the RangeSectionMap (described below). This could be the beginning of a `CodeHeap` for jitted code or the base address of the loaded image for ReadyToRun code.
246+
`GetUnwindInfoBaseAddress` finds this base address for a given `CodeBlockHandle`.
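
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the cDAC implementation: the `RuntimeFunction` struct is a simplified stand-in that assumes the amd64 `RUNTIME_FUNCTION` shape (ARM64 entries have no explicit end address), and `FindRuntimeFunction` is a hypothetical helper name. It converts the IP to an offset from the base address and then binary searches the sorted entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for an amd64-style RUNTIME_FUNCTION entry.
// All fields are offsets relative to a base address, not absolute pointers.
struct RuntimeFunction
{
    uint32_t BeginAddress; // inclusive start offset of the covered code
    uint32_t EndAddress;   // exclusive end offset of the covered code
    uint32_t UnwindData;   // offset of the unwind data
};

// Hypothetical helper: returns the entry whose [BeginAddress, EndAddress)
// range contains `ip`, or nullptr if no entry covers it.
// `sorted` must be ordered by BeginAddress with non-overlapping ranges.
const RuntimeFunction* FindRuntimeFunction(
    const std::vector<RuntimeFunction>& sorted,
    uint64_t baseAddress,
    uint64_t ip)
{
    if (ip < baseAddress)
        return nullptr;

    // Entries use relative addressing, so search by offset from the base.
    uint64_t offset = ip - baseAddress;

    size_t lo = 0;
    size_t hi = sorted.size();
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        if (offset < sorted[mid].BeginAddress)
            hi = mid;          // target lies before this entry
        else if (offset >= sorted[mid].EndAddress)
            lo = mid + 1;      // target lies after this entry
        else
            return &sorted[mid];
    }
    return nullptr;
}
```

In this sketch the base address plays the role of the value returned by `GetUnwindInfoBaseAddress`: the same offset arithmetic applies whether the base is the start of a `CodeHeap` or of a loaded ReadyToRun image.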
247+
233248
### RangeSectionMap
234249

235250
The range section map logically partitions the entire 32-bit or 64-bit addressable space into chunks.
Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@
1+
# Contract StackWalk
2+
3+
This contract encapsulates support for walking the stack of managed threads.
4+
5+
## APIs of contract
6+
7+
```csharp
8+
public interface IStackDataFrameHandle { };
9+
```
10+
11+
```csharp
12+
// Creates a stack walk and returns a handle
13+
IEnumerable<IStackDataFrameHandle> CreateStackWalk(ThreadData threadData);
14+
15+
// Gets the thread context at the given stack dataframe.
16+
byte[] GetRawContext(IStackDataFrameHandle stackDataFrameHandle);
17+
// Gets the Frame address at the given stack dataframe. Returns TargetPointer.Null if the current dataframe does not have a valid Frame.
18+
TargetPointer GetFrameAddress(IStackDataFrameHandle stackDataFrameHandle);
19+
```
20+
21+
## Version 1
22+
To create a full walk of the managed stack, two types of 'stacks' must be read.
23+
24+
1. True call frames on the thread's stack
25+
2. Capital "F" Frames (referred to as Frames as opposed to frames) which are used by the runtime for bookkeeping purposes.
26+
27+
Capital "F" Frames are pushed to and popped from a singly-linked list on the runtime's Thread object and are accessible using the [IThread](./Thread.md) contract. These capital "F" Frames are allocated within a function's call frame, meaning they also live on the stack. A subset of Frame types store extra data allowing us to recover a portion of the context from when they were created. For our purposes, these are relevant because they mark every transition where managed code calls native code. For more information about Frames see: [BOTR Stack Walking](https://github.com/dotnet/runtime/blob/44b7251f94772c69c2efb9daa7b69979d7ddd001/docs/design/coreclr/botr/stackwalking.md).
28+
29+
Unwinding call frames on the stack usually requires an OS specific implementation. However, in our particular circumstance of unwinding only **managed function** call frames, the runtime uses Windows style unwind logic/codes for all platforms (this isn't true for NativeAOT). Therefore we can delegate to the existing native unwinding code located in `src/coreclr/unwinder/`. For more information on the Windows unwinding algorithm and unwind codes see the following docs:
30+
31+
* [Windows x64](https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64)
32+
* [Windows ARM64](https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling)
33+
34+
This contract depends on the following descriptors:
35+
36+
| Data Descriptor Name | Field | Meaning |
37+
| --- | --- | --- |
38+
| `Frame` | `Next` | Pointer to next Frame on linked list |
39+
| `InlinedCallFrame` | `CallSiteSP` | SP saved in Frame |
40+
| `InlinedCallFrame` | `CallerReturnAddress` | Return address saved in Frame |
41+
| `InlinedCallFrame` | `CalleeSavedFP` | FP saved in Frame |
42+
| `SoftwareExceptionFrame` | `TargetContext` | Context object saved in Frame |
43+
| `SoftwareExceptionFrame` | `ReturnAddress` | Return address saved in Frame |
44+
45+
Global variables used:
46+
| Global Name | Type | Purpose |
47+
| --- | --- | --- |
48+
| For each FrameType `<frameType>`, `<frameType>##Identifier` | FrameIdentifier enum value | Identifier used to determine concrete type of Frames |
49+
50+
Contracts used:
51+
| Contract Name |
52+
| --- |
53+
| `ExecutionManager` |
54+
| `Thread` |
55+
56+
57+
### Stackwalk Algorithm
58+
The intuition for walking a managed stack is relatively simple: unwind managed portions of the stack until we hit native code, then use capital "F" Frames as checkpoints to get into new sections of managed code. Because Frames are added at each point before managed code (higher SP value) calls native code (lower SP values), we are guaranteed that a Frame exists at the top (lower SP value) of each managed call frame run.
59+
60+
In reality, the actual algorithm is a little more complex for two reasons. It requires pausing to return the current context and Frame at certain points, and it checks for "skipped Frames" which can occur if a capital "F" Frame is allocated in a managed stack frame (e.g. an inlined P/Invoke call).
61+
62+
1. Setup
63+
1. Set the current context `currContext` to be the thread's context. This is fetched through the [ICorDebugDataTarget](https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/debugging/icordebugdatatarget-getthreadcontext-method) COM interface.
64+
2. Create a stack of the thread's capital "F" Frames `frameStack`.
65+
2. **Return the current context**.
66+
3. While the `currContext` is in managed code or `frameStack` is not empty:
67+
1. If `currContext` is in native code, pop the top Frame from `frameStack` and update the context using the popped Frame. **Return the updated context** and **go to step 3**.
68+
2. If `frameStack` is not empty, check for skipped Frames. Peek `frameStack` to find a Frame `frame`. Compare the address of `frame` (allocated on the stack) with the stack pointer of the current context's caller (found by unwinding the current context one iteration).
69+
If the address of the `frame` is less than the caller's stack pointer, **return the current context**, pop the top Frame from `frameStack`, and **go to step 3**.
70+
3. Unwind `currContext` using the Windows style unwinder. **Return the current context**.
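
The steps above can be sketched as a small loop. This is an illustrative model only, under several assumptions: `Context`, `Frame`, and `WalkStack` are hypothetical names, a context is reduced to a stack pointer plus a managed/native flag, each Frame is assumed to carry the context it restores, and the Windows style unwinder is passed in as a callback. The real contract yields results lazily and updates the context differently depending on the Frame type.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical, heavily simplified thread context.
struct Context
{
    uint64_t sp;   // stack pointer
    bool managed;  // true if the instruction pointer is in managed code
};

// Hypothetical capital "F" Frame: its address on the stack and the
// context it can restore.
struct Frame
{
    uint64_t address;
    Context saved;
};

// Returns every context the walk would report, in order.
// `frameStack.front()` is the top (most recently pushed) Frame.
std::vector<Context> WalkStack(
    Context curr,
    std::deque<Frame> frameStack,
    const std::function<Context(const Context&)>& unwind)
{
    std::vector<Context> reported;
    reported.push_back(curr); // step 2: return the initial context

    // Step 3: continue while in managed code or while Frames remain.
    while (curr.managed || !frameStack.empty())
    {
        if (!curr.managed)
        {
            // Step 3.1: native code, so use the top Frame as a checkpoint.
            curr = frameStack.front().saved;
            frameStack.pop_front();
            reported.push_back(curr);
            continue;
        }

        if (!frameStack.empty())
        {
            // Step 3.2: skipped Frame check against the caller's SP.
            Context parent = unwind(curr);
            if (frameStack.front().address < parent.sp)
            {
                reported.push_back(curr);
                frameStack.pop_front();
                continue;
            }
        }

        // Step 3.3: ordinary unwind of one managed call frame.
        curr = unwind(curr);
        reported.push_back(curr);
    }
    return reported;
}
```

Taking the Frame stack by value keeps the sketch free of side effects on the caller's list; an actual implementation would more likely iterate the linked list on the Thread object directly.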
71+
72+
73+
#### Simple Example
74+
75+
In this example we walk through the algorithm without instances of skipped Frames.
76+
77+
Given the following call stack and capital "F" Frames linked list, we can apply the above algorithm.
78+
<table>
79+
<tr>
80+
<th> Call Stack (growing down)</th>
81+
<th> Capital "F" Frames Linked List </th>
82+
</tr>
83+
<tr>
84+
<td>
85+
86+
```
87+
Managed Call: -----------
88+
89+
| Native | <- <A>'s SP
90+
- | |
91+
|-----------| <- <B>'s SP
92+
| |
93+
| Managed |
94+
| |
95+
|-----------| <- <C>'s SP
96+
| |
97+
| Native |
98+
+ | |
99+
| StackBase |
100+
```
101+
</td>
102+
<td>
103+
104+
```
105+
SoftwareExceptionFrame
106+
(Context = <B>)
107+
108+
||
109+
\/
110+
111+
NULL TERMINATOR
112+
```
113+
114+
</td>
115+
</tr>
116+
</table>
117+
118+
1. (1) Set `currContext` to the thread context `<A>`. Create a stack of Frames `frameStack`.
119+
2. (2) Return the `currContext` which has the thread's context.
120+
3. (3) `currContext` is in unmanaged (native) code; however, because `frameStack` is not empty, we begin processing the context.
121+
4. (3.1) Since `currContext` is unmanaged, we pop the SoftwareExceptionFrame from `frameStack` and use it to update `currContext`. The SoftwareExceptionFrame holds context `<B>`, which we set `currContext` to. Return the current context and go back to step 3.
122+
5. (3) Now `currContext` is in managed code as shown by `<B>`'s SP. Therefore, we begin to process the context.
123+
6. (3.1) Since `currContext` is managed, skip step 3.1.
124+
7. (3.2) Since `frameStack` is empty, we do not check for skipped Frames.
125+
8. (3.3) Unwind `currContext` a single iteration to `<C>` and return the current context.
126+
9. (3) `currContext` is now at unmanaged (native) code and `frameStack` is empty. Therefore we are done.
127+
128+
The following C# code could yield a stack similar to the example above:
129+
```csharp
130+
void foo()
131+
{
132+
// Call native code or function that calls down to native.
133+
Console.ReadLine();
134+
// Capture stack trace while inside native code.
135+
}
136+
```
137+
138+
#### Skipped Frame Example
139+
The skipped Frame check is important when managed code calls managed code through an unmanaged boundary.
140+
This occurs when calling a function marked with `[UnmanagedCallersOnly]` as an unmanaged delegate from a managed caller.
141+
In this case, if we ignored the skipped Frame check we would miss the unmanaged boundary.
142+
143+
Given the following call stack and capital "F" Frames linked list, we can apply the above algorithm.
144+
<table>
145+
<tr>
146+
<th> Call Stack (growing down)</th>
147+
<th> Capital "F" Frames Linked List </th>
148+
</tr>
149+
<tr>
150+
<td>
151+
152+
```
153+
Unmanaged Call: -X-X-X-X-X-
154+
Managed Call: -----------
155+
InlinedCallFrame location: [ICF]
156+
157+
| Managed | <- <A>'s SP
158+
- | |
159+
| |
160+
|-X-X-X-X-X-| <- <B>'s SP
161+
| [ICF] |
162+
| Managed |
163+
| |
164+
|-----------| <- <C>'s SP
165+
| |
166+
| Native |
167+
+ | |
168+
| StackBase |
169+
```
170+
</td>
171+
<td>
172+
173+
```
174+
InlinedCallFrame
175+
(Context = <B>)
176+
177+
||
178+
\/
179+
180+
NULL TERMINATOR
181+
```
182+
183+
</td>
184+
</tr>
185+
</table>
186+
187+
1. (1) Set `currContext` to the thread context `<A>`. Create a stack of Frames `frameStack`.
188+
2. (2) Return the `currContext` which has the thread's context.
189+
3. (3) Since `currContext` is in managed code, we begin to process the context.
190+
4. (3.1) Since `currContext` is managed, skip step 3.1.
191+
5. (3.2) Check for skipped Frames. Copy `currContext` into `parentContext` and unwind `parentContext` once using the Windows style unwinder. As seen from the call stack, unwinding `currContext=<A>` will yield `<B>`. We peek the top of `frameStack` and find an InlinedCallFrame (shown in call stack above as `[ICF]`). Since the address of `[ICF]` is not less than `parentContext`'s SP, there are no skipped Frames.
192+
6. (3.3) Unwind `currContext` a single iteration to `<B>` and return the current context.
193+
7. (3) Since `currContext` is still in managed code, we continue processing the context.
194+
8. (3.1) Since `currContext` is managed, skip step 3.1.
195+
9. (3.2) Check for skipped Frames. Copy `currContext` into `parentContext` and unwind `parentContext` once using the Windows style unwinder. As seen from the call stack, unwinding `currContext=<B>` will yield `<C>`. We peek the top of `frameStack` and find an InlinedCallFrame (shown in call stack above as `[ICF]`). This time the address of `[ICF]` is less than `parentContext`'s SP. Therefore we return the current context, then pop the InlinedCallFrame from `frameStack` (which is now empty) and return to step 3.
196+
10. (3) Since `currContext` is still in managed code, we continue processing the context.
197+
11. (3.1) Since `currContext` is managed, skip step 3.1.
198+
12. (3.2) Since `frameStack` is empty, we do not check for skipped Frames.
199+
13. (3.3) Unwind `currContext` a single iteration to `<C>` and return the current context.
200+
14. (3) `currContext` is now at unmanaged (native) code and `frameStack` is empty. Therefore we are done.
201+
202+
The following C# code could yield a stack similar to the example above:
203+
```csharp
204+
void foo()
205+
{
206+
var fptr = (delegate* unmanaged<void>)&bar;
207+
fptr();
208+
}
209+
210+
[UnmanagedCallersOnly]
211+
private static void bar()
212+
{
213+
// Do something
214+
// Capture stack trace while in here
215+
}
216+
```
217+
218+
### APIs
219+
220+
The majority of the contract's complexity is the stack walking algorithm (detailed above) implemented as part of `CreateStackWalk`.
221+
The `IEnumerable<IStackDataFrameHandle>` return value is computed lazily.
222+
223+
```csharp
224+
IEnumerable<IStackDataFrameHandle> CreateStackWalk(ThreadData threadData);
225+
```
226+
227+
The rest of the APIs convey state about the stack walk at a given point, which falls out of the stack walking algorithm relatively simply.
228+
229+
`GetRawContext` retrieves the raw Windows style thread context of the current frame as a byte array. The size and shape of the context are platform dependent.
230+
231+
* On Windows the context is defined directly in Windows header `winnt.h`. See [CONTEXT structure](https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-context) for more info.
232+
* On non-Windows platforms the contexts are defined in `src/coreclr/pal/inc/pal.h` and should mimic the Windows structure.
233+
234+
This context is not guaranteed to be complete. Not all capital "F" Frames store the entire context, some only store the IP/SP/FP. Therefore, at points where the context is based on these Frames it will be incomplete.
235+
```csharp
236+
byte[] GetRawContext(IStackDataFrameHandle stackDataFrameHandle);
237+
```
238+
239+
240+
`GetFrameAddress` gets the address of the current capital "F" Frame. This is only valid if the `IStackDataFrameHandle` is at a point where the context is based on a capital "F" Frame. For example, it is not valid when the current context was created by using the stack frame unwinder.
241+
If the Frame is not valid, returns `TargetPointer.Null`.
242+
243+
```csharp
244+
TargetPointer GetFrameAddress(IStackDataFrameHandle stackDataFrameHandle);
245+
```
246+

eng/native/functions.cmake

Lines changed: 10 additions & 5 deletions
@@ -554,11 +554,11 @@ function(install_static_library targetName destination component)
554554
endif()
555555
endfunction()
556556

557-
# install_clr(TARGETS targetName [targetName2 ...] [DESTINATIONS destination [destination2 ...]] [COMPONENT componentName])
557+
# install_clr(TARGETS targetName [targetName2 ...] [DESTINATIONS destination [destination2 ...]] [COMPONENT componentName] [INSTALL_ALL_ARTIFACTS])
558558
function(install_clr)
559559
set(multiValueArgs TARGETS DESTINATIONS)
560560
set(singleValueArgs COMPONENT)
561-
set(options "")
561+
set(options INSTALL_ALL_ARTIFACTS)
562562
cmake_parse_arguments(INSTALL_CLR "${options}" "${singleValueArgs}" "${multiValueArgs}" ${ARGV})
563563

564564
if ("${INSTALL_CLR_TARGETS}" STREQUAL "")
@@ -594,9 +594,14 @@ function(install_clr)
594594
endif()
595595

596596
foreach(destination ${destinations})
597-
# We don't need to install the export libraries for our DLLs
598-
# since they won't be directly linked against.
599-
install(PROGRAMS $<TARGET_FILE:${targetName}> DESTINATION ${destination} COMPONENT ${INSTALL_CLR_COMPONENT})
597+
# Install the export libraries for static libraries.
598+
if (${INSTALL_CLR_INSTALL_ALL_ARTIFACTS})
599+
install(TARGETS ${targetName} DESTINATION ${destination} COMPONENT ${INSTALL_CLR_COMPONENT})
600+
else()
601+
# We don't need to install the export libraries for our DLLs
602+
# since they won't be directly linked against.
603+
install(PROGRAMS $<TARGET_FILE:${targetName}> DESTINATION ${destination} COMPONENT ${INSTALL_CLR_COMPONENT})
604+
endif()
600605
if (NOT "${symbolFile}" STREQUAL "")
601606
install_symbol_file(${symbolFile} ${destination} COMPONENT ${INSTALL_CLR_COMPONENT})
602607
endif()

eng/pipelines/runtime-official.yml

Lines changed: 1 addition & 1 deletion
@@ -367,7 +367,7 @@ extends:
367367
- windows_x64
368368
jobParameters:
369369
templatePath: 'templates-official'
370-
buildArgs: -s tools+libs -pack -c $(_BuildConfig) /p:TestAssemblies=false /p:TestPackages=true
370+
buildArgs: -s tools.illink+libs -pack -c $(_BuildConfig) /p:TestAssemblies=false /p:TestPackages=true
371371
nameSuffix: Libraries_WithPackages
372372
isOfficialBuild: ${{ variables.isOfficialBuild }}
373373
postBuildSteps:

eng/pipelines/runtime.yml

Lines changed: 1 addition & 1 deletion
@@ -1244,7 +1244,7 @@ extends:
12441244
platforms:
12451245
- windows_x64
12461246
jobParameters:
1247-
buildArgs: -test -s tools+libs+libs.tests -pack -c $(_BuildConfig) /p:TestAssemblies=false /p:TestPackages=true
1247+
buildArgs: -test -s tools.illink+libs+libs.tests -pack -c $(_BuildConfig) /p:TestAssemblies=false /p:TestPackages=true
12481248
nameSuffix: Libraries_WithPackages
12491249
timeoutInMinutes: 150
12501250
condition: >-

src/coreclr/debug/daccess/cdac.cpp

Lines changed: 29 additions & 1 deletion
@@ -25,7 +25,15 @@ namespace
2525
iter++;
2626
path.Truncate(iter);
2727
path.Append(CDAC_LIB_NAME);
28+
29+
#ifdef HOST_WINDOWS
30+
// LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR tells the native windows loader to load dependencies
31+
// from the same directory as cdacreader.dll. Once the native portions of the cDAC
32+
// are statically linked, this won't be required.
33+
*phCDAC = CLRLoadLibraryEx(path.GetUnicode(), NULL, LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR);
34+
#else // !HOST_WINDOWS
2835
*phCDAC = CLRLoadLibrary(path.GetUnicode());
36+
#endif // HOST_WINDOWS
2937
if (*phCDAC == NULL)
3038
return false;
3139

@@ -41,6 +49,26 @@ namespace
4149

4250
return S_OK;
4351
}
52+
53+
int ReadThreadContext(uint32_t threadId, uint32_t contextFlags, uint32_t contextBufferSize, uint8_t* contextBuffer, void* context)
54+
{
55+
ICorDebugDataTarget* target = reinterpret_cast<ICorDebugDataTarget*>(context);
56+
HRESULT hr = target->GetThreadContext(threadId, contextFlags, contextBufferSize, contextBuffer);
57+
if (FAILED(hr))
58+
return hr;
59+
60+
return S_OK;
61+
}
62+
63+
int GetPlatform(uint32_t* platform, void* context)
64+
{
65+
ICorDebugDataTarget* target = reinterpret_cast<ICorDebugDataTarget*>(context);
66+
HRESULT hr = target->GetPlatform((CorDebugPlatform*)platform);
67+
if (FAILED(hr))
68+
return hr;
69+
70+
return S_OK;
71+
}
4472
}
4573

4674
CDAC CDAC::Create(uint64_t descriptorAddr, ICorDebugDataTarget* target, IUnknown* legacyImpl)
@@ -53,7 +81,7 @@ CDAC CDAC::Create(uint64_t descriptorAddr, ICorDebugDataTarget* target, IUnknown
5381
_ASSERTE(init != nullptr);
5482

5583
intptr_t handle;
56-
if (init(descriptorAddr, &ReadFromTargetCallback, target, &handle) != 0)
84+
if (init(descriptorAddr, &ReadFromTargetCallback, &ReadThreadContext, &GetPlatform, target, &handle) != 0)
5785
{
5886
::FreeLibrary(cdacLib);
5987
return {};
