[cDAC] Adds DAC like entrypoint with new COM interface #113899

Merged
merged 34 commits into from
Apr 10, 2025
34 commits
573d12e
add DAC like entrypoint
Mar 26, 2025
c60292e
implement RuntimeInfo contract
Mar 27, 2025
238828d
convert to use IRuntimeInfo to fetch platform
Mar 27, 2025
95433f2
remove modifiers
Mar 27, 2025
bae1e05
add RuntimeInfo doc
Mar 28, 2025
5e06e14
improve error messages
Mar 28, 2025
7fec5a4
improve fetching runtimeinfo globals
Mar 28, 2025
26bea19
fix ContractDescriptorBuilder
Mar 28, 2025
20089c1
add RID define
Apr 3, 2025
1e6ed61
change runtimeinfo datastructures to be strings
Apr 3, 2025
5febe9b
fix managed portions of runtime info
Apr 3, 2025
717300d
implement runtime/cdac-built-tool portion of passing strings
Apr 3, 2025
abedd90
implement cdacreader portion of passing strings
Apr 3, 2025
bdf1a73
improve wording
Apr 3, 2025
f66fa9f
update contract doc
Apr 3, 2025
c13dfdf
rename cmake RID parameter
Apr 3, 2025
b39234c
add fallback entrypoint
Apr 3, 2025
08f557a
make it more clear we won't overrun buffer of GetThreadContext
Apr 4, 2025
40b78e0
remove trailing comma
Apr 4, 2025
a421798
add test for global string with escaped characters
Apr 4, 2025
9568f11
use portable RID value
Apr 4, 2025
4acb06c
update enum values
Apr 4, 2025
d11b62d
update example in contract descriptor
Apr 4, 2025
25e3da1
fix StressLogAnalyzer
Apr 4, 2025
1fc1289
improve docs using BNF
Apr 4, 2025
b1d5fd7
change error message in datadescriptors to point user in right direct…
Apr 4, 2025
54f186a
text
Apr 4, 2025
fe80a63
improve error message
Apr 4, 2025
6f83343
update doc
Apr 4, 2025
ca053e8
more doc
Apr 4, 2025
4de13eb
add note about undefined behavior
Apr 7, 2025
4880ec0
clean up BNF
Apr 7, 2025
d0a9e0b
fix macro expansion stringification
Apr 9, 2025
63de885
Merge remote-tracking branch 'origin/main' into cdac-new-entrypoint
Apr 10, 2025
43 changes: 43 additions & 0 deletions docs/design/datacontracts/RuntimeInfo.md
@@ -0,0 +1,43 @@
# Contract RuntimeInfo

This contract encapsulates support for fetching information about the target runtime.

## APIs of contract

```csharp
public enum RuntimeInfoArchitecture : uint
Member

@jkotas jkotas Apr 4, 2025

This enum looks very similar to https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.architecture

We may want to make the names exactly same (it can be still a separate enum to make versioning easier).

Contributor Author

I'm happy to align with that enum.

I do have two follow-up questions:

  1. Do you care if the backing enum values are the same? I'd like to keep an Unknown value in the enum in case the contract can't find the arch/OS. I could modify Unknown to be -1 instead of 0 to use the exact same values as System.Runtime.InteropServices.Architecture. If you don't care, I'd prefer to have 0 because it's the default value.
  2. Do you have any thoughts on using the equivalent OS enum (https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.osplatform?view=net-9.0)? This is a little more information than we currently have through defines AFAIK.

Member

@jkotas jkotas Apr 4, 2025

> Do you care if the backing enum values are the same?

I do not care. I would just keep the exact same order and names.

> Do you have any thoughts on using the equivalent OS enum (https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.osplatform?view=net-9.0)?

This type is not an enum. It is a struct with a set of methods and it is soft-deprecated (we are not adding new members to this type for new OSes).

The actively maintained type is https://learn.microsoft.com/en-us/dotnet/api/system.operatingsystem.version?view=net-9.0 . Again, no enum there - just a bunch of Is... methods. If you are looking for something to align the managed OS names with, this one would be the managed type to align with.

Contributor Author

I see. Under System.OperatingSystem.Version, there is the PlatformId enum (https://learn.microsoft.com/en-us/dotnet/api/system.platformid?view=net-9.0) which uses Win32NT for modern Windows.

RuntimeInfoOperatingSystem

We could align with the PlatformID enum or keep the existing values.

// option 1 - current
public enum RuntimeInfoOperatingSystem : uint
{
    Unknown = 0,
    Windows,
    Unix,
}

// option 2 - aligned with PlatformID
public enum RuntimeInfoOperatingSystem : uint
{
    Unknown = 0,
    Win32NT,
    Unix,
    Other,
}

RuntimeInfoArchitecture

Based on previous comments, aligned with the System.Runtime.InteropServices.Architecture values + Unknown.

// Values are similar to System.Runtime.InteropServices.Architecture
public enum RuntimeInfoArchitecture : uint
{
    Unknown = 0,
    X86,
    X64,
    Arm,
    Arm64,
    Wasm,
    S390x,
    LoongArch64,
    Armv6,
    Ppc64le,
    RiscV64,
}

Member

PlatformId

This is a soft-deprecated enum as well. It does not make sense to align with it.

Member

@jkoritzinsky jkoritzinsky Apr 4, 2025

PlatformID also has a number of issues with the actual implementation/usage in the runtimes, so I'd also recommend against trying to align with it.

Contributor Author

Based on feedback, I have settled on option 1 (above) for RuntimeInfoOperatingSystem and the listed RuntimeInfoArchitecture enum.

{
    Unknown = 0,
    X86,
    Arm32,
    X64,
    Arm64,
    LoongArch64,
    RISCV,
}

public enum RuntimeInfoOperatingSystem : uint
{
    Unknown = 0,
    Win,
    Unix,
}
```

```csharp
// Gets the target's architecture. If this information is not available, returns Unknown.
RuntimeInfoArchitecture GetTargetArchitecture();

// Gets the target's operating system. If this information is not available, returns Unknown.
RuntimeInfoOperatingSystem GetTargetOperatingSystem();
```

## Version 1

Global variables used:
| Global Name | Type | Purpose |
| --- | --- | --- |
| Architecture | string | Target architecture |
| OperatingSystem | string | Target operating system |

The contract implementation simply returns the contract descriptor global values parsed as the respective enum case-insensitively. If these globals are not available, the contract returns Unknown.
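
The lookup described above can be sketched in C++ for illustration (the actual reader is managed code, so this is not the PR's implementation). `ParseArchitecture` and `EqualsIgnoreCase` are hypothetical helpers, and returning `Unknown` for an unrecognized string is an assumption; the contract text only specifies `Unknown` when the global is missing.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <optional>
#include <string>

// Mirrors the enum values listed in this contract document.
enum class RuntimeInfoArchitecture : uint32_t
{
    Unknown = 0, X86, Arm32, X64, Arm64, LoongArch64, RISCV
};

// Hypothetical helper: ASCII case-insensitive string comparison.
static bool EqualsIgnoreCase(const std::string& a, const std::string& b)
{
    return a.size() == b.size() &&
        std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
            return std::tolower(x) == std::tolower(y);
        });
}

// Hypothetical helper: `global` is empty when the descriptor has no "Architecture" global.
RuntimeInfoArchitecture ParseArchitecture(const std::optional<std::string>& global)
{
    static const struct { const char* name; RuntimeInfoArchitecture value; } table[] = {
        { "X86", RuntimeInfoArchitecture::X86 },
        { "Arm32", RuntimeInfoArchitecture::Arm32 },
        { "X64", RuntimeInfoArchitecture::X64 },
        { "Arm64", RuntimeInfoArchitecture::Arm64 },
        { "LoongArch64", RuntimeInfoArchitecture::LoongArch64 },
        { "RISCV", RuntimeInfoArchitecture::RISCV },
    };

    if (!global)
        return RuntimeInfoArchitecture::Unknown; // global not available

    for (const auto& entry : table)
    {
        if (EqualsIgnoreCase(*global, entry.name))
            return entry.value;
    }
    return RuntimeInfoArchitecture::Unknown; // assumption: unrecognized names also map to Unknown
}
```

With this sketch, `ParseArchitecture(std::string("x64"))` and `ParseArchitecture(std::string("X64"))` both yield `X64`, while `ParseArchitecture(std::nullopt)` yields `Unknown`.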
3 changes: 2 additions & 1 deletion docs/design/datacontracts/contract-descriptor.md
@@ -83,7 +83,8 @@ a JSON integer constant.
"globals":
{
"FEATURE_COMINTEROP": 0,
-"s_pThreadStore": [ 0 ] // indirect from pointer data offset 0
+"s_pThreadStore": [ 0 ], // indirect from pointer data offset 0
+"RuntimeID": "win-x64" // string value
},
"contracts": {"Thread": 1, "GCHandle": 1, "ThreadStore": 1}
}
57 changes: 47 additions & 10 deletions docs/design/datacontracts/data_descriptor.md
@@ -212,33 +212,69 @@ The global values will be in an array, with each value described by a dictionary

* `"name": "global value name"` the name of the global value
* `"type": "type name"` the type of the global value
-* optional `"value": VALUE | [ int ] | "unknown"` the value of the global value, or an offset in an auxiliary array containing the value or "unknown".
+* optional `"value": <global_value>` where `<global_value>` is defined below


Numeric constants must be within the range of the type of the global value. If a constant is out of range, behavior is undefined.

-The `VALUE` may be a JSON numeric constant integer or a string containing a signed or unsigned
-decimal or hex (with prefix `0x` or `0X`) integer constant. The constant must be within the range
-of the type of the global value.

**Compact format**:

The global values will be in a dictionary, with each key being the name of a global and the values being one of:

-* `[VALUE | [int], "type name"]` the type and value of a global
-* `VALUE | [int]` just the value of a global
+* `[<global_value>, "type name"]` the type and value of a global
+* `<global_value>` just the value of a global

-As in the regular format, `VALUE` is a numeric constant or a string containing an integer constant.
+Where `<global_value>` is defined as below.

Numeric constants must be within the range of the type of the global value. If a constant is out of range, behavior is undefined.

Note that a two element array is unambiguously "type and value", whereas a one-element array is
unambiguously "indirect value".


**Both formats**

#### Specification Appendix

```
<global_value> ::= <value> | <pointer_table_index>
<pointer_table_index> ::= [ <number_value> ]
<value> ::= <json_string> | <number_value>
<number_value> ::= <json_number> | <decimal_string> | <hex_string>

<json_string> is any JSON string element
<json_number> is any JSON number element
<hex_string> is a <json_string> which can be parsed as a hexadecimal number prefixed with "0x" or "0X"
<decimal_string> is a <json_string> which can be parsed as a decimal number.
```

#### Parsing Rules
`<json_number>` is parsed as a numeric value.
`<hex_string>` and `<decimal_string>` can be parsed as either a string or numeric value.
`<json_string>` (that does not form a valid hex or decimal number) is parsed as a string.

Example using compact format:
```json
{
"int" : 1234, // Can only be parsed as numeric constant 1234
"stringyInt" : "1234", // Can be parsed as 1234 or "1234"
"stringyHex" : "0x1234", // Can be parsed as 4660 (0x1234 in decimal) or "0x1234"
"stringValue" : "Hello World" // Can only be parsed as "Hello World"
}
```
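
To make the parsing rules concrete, here is a minimal C++ sketch of how a reader might obtain the numeric reading of a string-typed global. `TryParseNumeric` is a hypothetical helper, not part of the cDAC reader; it handles only unsigned values and ignores overflow, both simplifications made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: returns the numeric reading of a <json_string>, if it has one.
// A "0x"/"0X" prefix selects hexadecimal; otherwise the text must be all decimal digits.
std::optional<uint64_t> TryParseNumeric(const std::string& text)
{
    if (text.empty())
        return std::nullopt;

    unsigned base = 10;
    std::size_t start = 0;
    if (text.size() > 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
    {
        base = 16;
        start = 2;
    }

    uint64_t value = 0;
    for (std::size_t i = start; i < text.size(); ++i)
    {
        const char c = text[i];
        unsigned digit;
        if (c >= '0' && c <= '9')
            digit = static_cast<unsigned>(c - '0');
        else if (base == 16 && c >= 'a' && c <= 'f')
            digit = static_cast<unsigned>(c - 'a') + 10;
        else if (base == 16 && c >= 'A' && c <= 'F')
            digit = static_cast<unsigned>(c - 'A') + 10;
        else
            return std::nullopt; // not a number: the value can only be read as a string

        value = value * base + digit; // simplification: overflow is not detected
    }
    return value;
}
```

Under these assumptions, `"1234"` and `"0x1234"` have both a numeric and a string reading, while `"Hello World"` yields `std::nullopt` and can only be read as a string.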

#### Typing

For pointer and nuint globals, the value may be assumed to fit in a 64-bit unsigned integer. For
nint globals, the value may be assumed to fit in a 64-bit signed integer.

Note that the logical descriptor does not contain "unknown" values: it is expected that the
in-memory data descriptor will augment the baseline with a known offset for all fields in the
baseline.

#### Indirect Types

If the value is given as a single-element array `[ int ]` then the value is stored in an auxiliary
array that is part of the data contract descriptor. Only in-memory data descriptors may have
indirect values; baseline data descriptors may not have indirect values.
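
As a rough illustration of the indirect lookup described above, the following C++ sketch resolves a global that is either a direct value or an index into the auxiliary array. `GlobalValue`, `ResolveGlobal`, and the vector standing in for the auxiliary pointer data are hypothetical names, and the bounds check is an assumption about how a reader might treat a bad index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical in-memory form of a global's value.
struct GlobalValue
{
    bool isIndirect;  // true when the JSON value was a single-element array [ int ]
    uint64_t payload; // the direct value, or the index into the auxiliary array
};

// Returns the final value of the global, or nullopt if an indirect index is out of range.
std::optional<uint64_t> ResolveGlobal(const GlobalValue& global,
                                      const std::vector<uint64_t>& pointerData)
{
    if (!global.isIndirect)
        return global.payload;

    if (global.payload >= pointerData.size())
        return std::nullopt; // assumption: an out-of-range index is reported, not read

    return pointerData[static_cast<std::size_t>(global.payload)];
}
```

For example, with auxiliary data `{ 0x0100ffe0 }`, the global `"s_pThreadStore": [ 0 ]` resolves to `0x0100ffe0`, matching the globals table in the example section.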
@@ -251,7 +287,6 @@ The indirection array is not part of the data descriptor spec. It is part of th
descriptor](./contract_descriptor.md#Contract_descriptor).



## Example

This is an example of a baseline descriptor for a 64-bit architecture. Suppose it has the name `"example-64"`
@@ -288,7 +323,7 @@ The baseline is given in the "regular" format.
],
"globals": [
{ "name": "FEATURE_EH_FUNCLETS", "type": "uint8", "value": "0" }, // baseline defaults value to 0
-{ "name": "FEATURE_COMINTEROP", "type", "uint8", "value": "1"},
+{ "name": "FEATURE_COMINTEROP", "type": "uint8", "value": "1"},
{ "name": "s_pThreadStore", "type": "pointer" } // no baseline value
]
}
@@ -308,7 +343,8 @@ The following is an example of an in-memory descriptor that references the above
"globals":
{
"FEATURE_COMINTEROP": 0,
-"s_pThreadStore": [ 0 ] // indirect from aux data offset 0
+"s_pThreadStore": [ 0 ], // indirect from aux data offset 0
+"RuntimeID": "windows-x64"
}
}
```
@@ -332,6 +368,7 @@ And the globals will be:
| FEATURE_COMINTEROP | uint8 | 0 |
| FEATURE_EH_FUNCLETS | uint8 | 0 |
| s_pThreadStore | pointer | 0x0100ffe0 |
| RuntimeID | string |"windows-x64"|

The `FEATURE_EH_FUNCLETS` global's value comes from the baseline - not the in-memory data
descriptor. By contrast, `FEATURE_COMINTEROP` comes from the in-memory data descriptor - with the
12 changes: 1 addition & 11 deletions src/coreclr/debug/daccess/cdac.cpp
@@ -52,16 +52,6 @@ namespace

return S_OK;
}

-int GetPlatform(uint32_t* platform, void* context)
-{
-ICorDebugDataTarget* target = reinterpret_cast<ICorDebugDataTarget*>(context);
-HRESULT hr = target->GetPlatform((CorDebugPlatform*)platform);
-if (FAILED(hr))
-return hr;
-
-return S_OK;
-}
}

CDAC CDAC::Create(uint64_t descriptorAddr, ICorDebugDataTarget* target, IUnknown* legacyImpl)
@@ -74,7 +64,7 @@ CDAC CDAC::Create(uint64_t descriptorAddr, ICorDebugDataTarget* target, IUnknown
_ASSERTE(init != nullptr);

intptr_t handle;
-if (init(descriptorAddr, &ReadFromTargetCallback, &ReadThreadContext, &GetPlatform, target, &handle) != 0)
+if (init(descriptorAddr, &ReadFromTargetCallback, &ReadThreadContext, target, &handle) != 0)
{
::FreeLibrary(cdacLib);
return {};
5 changes: 5 additions & 0 deletions src/coreclr/debug/runtimeinfo/CMakeLists.txt
@@ -42,6 +42,11 @@ install_clr(TARGETS runtimeinfo DESTINATIONS lib COMPONENT runtime)

# cDAC contract descriptor

if("${CLR_DOTNET_RID}" STREQUAL "")
message(FATAL_ERROR "CLR_DOTNET_RID is not set. Please ensure it is being set to the portable RID of the target platform by runtime.proj.")
endif()
configure_file(configure.h.in ${CMAKE_CURRENT_BINARY_DIR}/configure.h)

if (NOT CDAC_BUILD_TOOL_BINARY_PATH)
# if CDAC_BUILD_TOOL_BINARY_PATH is unspecified (for example for a build without a .NET SDK or msbuild),
# link a stub contract descriptor into the runtime
6 changes: 6 additions & 0 deletions src/coreclr/debug/runtimeinfo/configure.h.in
@@ -0,0 +1,6 @@
#ifndef RUNTIME_INFO_CONFIGURE_H_INCLUDED
#define RUNTIME_INFO_CONFIGURE_H_INCLUDED

#define RID_STRING @CLR_DOTNET_RID@

#endif // RUNTIME_INFO_CONFIGURE_H_INCLUDED
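
Because `RID_STRING` expands to a bare token sequence rather than a quoted string, it has to be stringified after expansion. The sketch below shows the standard preprocessor behavior involved; the `win-x64` value is a stand-in for whatever `CLR_DOTNET_RID` provides, and `NAME_OF` and `VALUE_OF` are hypothetical macros used only for illustration.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the generated configure.h; the real value comes from CLR_DOTNET_RID.
#define RID_STRING win-x64

#define STRINGIFY(x) #x

// '#' applied directly to a parameter stringifies the argument exactly as written.
#define NAME_OF(val) #val

// Forwarding the parameter to another macro lets the preprocessor expand it first,
// so STRINGIFY receives the expansion of RID_STRING rather than its name.
#define VALUE_OF(val) STRINGIFY(val)

static const char kUnexpanded[] = NAME_OF(RID_STRING); // the macro's name as text
static const char kExpanded[] = VALUE_OF(RID_STRING);  // the macro's expansion as text
```

Here `kUnexpanded` holds `"RID_STRING"` and `kExpanded` holds `"win-x64"`. The second form is the one a string global needs.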
1 change: 1 addition & 0 deletions src/coreclr/debug/runtimeinfo/contracts.jsonc
@@ -19,6 +19,7 @@
"PlatformMetadata": 1,
"PrecodeStubs": 2,
"ReJIT": 1,
"RuntimeInfo": 1,
"RuntimeTypeSystem": 1,
"StackWalk": 1,
"StressLog": 2,
37 changes: 37 additions & 0 deletions src/coreclr/debug/runtimeinfo/datadescriptor.cpp
@@ -13,6 +13,8 @@
#include "methodtable.h"
#include "threads.h"

#include "configure.h"

#include "../debug/ee/debugger.h"

#ifdef HAVE_GCCOVER
@@ -51,6 +53,12 @@ struct GlobalPointerSpec
uint32_t PointerDataIndex;
};

struct GlobalStringSpec
{
uint32_t Name;
uint32_t StringValue;
};

#define CONCAT(token1,token2) token1 ## token2
#define CONCAT4(token1, token2, token3, token4) token1 ## token2 ## token3 ## token4

@@ -59,6 +67,10 @@ struct GlobalPointerSpec
#define MAKE_FIELDTYPELEN_NAME(tyname,membername) CONCAT4(cdac_string_pool_membertypename__, tyname, __, membername)
#define MAKE_GLOBALLEN_NAME(globalname) CONCAT(cdac_string_pool_globalname__, globalname)
#define MAKE_GLOBALTYPELEN_NAME(globalname) CONCAT(cdac_string_pool_globaltypename__, globalname)
#define MAKE_GLOBALVALUELEN_NAME(globalname) CONCAT(cdac_string_pool_globalvalue__, globalname)

// used to stringify the result of a macro's expansion
#define STRINGIFY(x) #x

// define a struct where the size of each field is the length of some string. we will use offsetof to get
// the offset of each struct element, which will be equal to the offset of the beginning of that string in the
@@ -71,6 +83,8 @@ struct CDacStringPoolSizes
#define CDAC_TYPE_BEGIN(name) DECL_LEN(MAKE_TYPELEN_NAME(name), sizeof(#name))
#define CDAC_TYPE_FIELD(tyname,membertyname,membername,offset) DECL_LEN(MAKE_FIELDLEN_NAME(tyname,membername), sizeof(#membername)) \
DECL_LEN(MAKE_FIELDTYPELEN_NAME(tyname,membername), sizeof(#membertyname))
#define CDAC_GLOBAL_STRING(name, stringval) DECL_LEN(MAKE_GLOBALLEN_NAME(name), sizeof(#name)) \
DECL_LEN(MAKE_GLOBALVALUELEN_NAME(name), sizeof(STRINGIFY(stringval)))
#define CDAC_GLOBAL_POINTER(name,value) DECL_LEN(MAKE_GLOBALLEN_NAME(name), sizeof(#name))
#define CDAC_GLOBAL(name,tyname,value) DECL_LEN(MAKE_GLOBALLEN_NAME(name), sizeof(#name)) \
DECL_LEN(MAKE_GLOBALTYPELEN_NAME(name), sizeof(#tyname))
@@ -84,6 +98,7 @@ struct CDacStringPoolSizes
#define GET_FIELDTYPE_NAME(tyname,membername) offsetof(struct CDacStringPoolSizes, MAKE_FIELDTYPELEN_NAME(tyname,membername))
#define GET_GLOBAL_NAME(globalname) offsetof(struct CDacStringPoolSizes, MAKE_GLOBALLEN_NAME(globalname))
#define GET_GLOBALTYPE_NAME(globalname) offsetof(struct CDacStringPoolSizes, MAKE_GLOBALTYPELEN_NAME(globalname))
#define GET_GLOBALSTRING_VALUE(globalname) offsetof(struct CDacStringPoolSizes, MAKE_GLOBALVALUELEN_NAME(globalname))

// count the types
enum
@@ -123,6 +138,15 @@ enum
#include "datadescriptor.h"
};

// count the global strings
enum
{
CDacBlobGlobalStringsCount =
#define CDAC_GLOBALS_BEGIN() 0
#define CDAC_GLOBAL_STRING(name,value) + 1
#include "datadescriptor.h"
};
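
The counting enum above relies on each list entry expanding to `+ 1`. A self-contained C++ sketch of the same idea follows; it uses a single list macro, `EXAMPLE_GLOBALS`, as a hypothetical stand-in for re-including `datadescriptor.h`, and the entries in it are made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for datadescriptor.h: one list, parameterized by what each entry means.
#define EXAMPLE_GLOBALS(BEGIN, STRING, POINTER) \
    BEGIN() \
    STRING(Architecture, x64) \
    STRING(OperatingSystem, windows) \
    POINTER(ThreadStore, 0)

#define COUNT_BEGIN() 0
#define COUNT_ONE(name, value) + 1
#define COUNT_NONE(name, value)

enum
{
    // Expands to: 0 + 1 + 1
    ExampleGlobalStringsCount = EXAMPLE_GLOBALS(COUNT_BEGIN, COUNT_ONE, COUNT_NONE),
    // Expands to: 0 + 1
    ExampleGlobalPointersCount = EXAMPLE_GLOBALS(COUNT_BEGIN, COUNT_NONE, COUNT_ONE)
};

static_assert(ExampleGlobalStringsCount == 2, "two string globals in the example list");
static_assert(ExampleGlobalPointersCount == 1, "one pointer global in the example list");
```

The counts are compile-time constants, so they can size arrays such as `GlobalStringValues[CDacBlobGlobalStringsCount]` in the blob struct.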


#define MAKE_TYPEFIELDS_TYNAME(tyname) CONCAT(CDacFieldsPoolTypeStart__, tyname)

@@ -197,27 +221,31 @@ struct BinaryBlobDataDescriptor
uint32_t GlobalLiteralValuesStart;

uint32_t GlobalPointersStart;
uint32_t GlobalStringValuesStart;
uint32_t NamesPoolStart;

uint32_t TypeCount;
uint32_t FieldsPoolCount;

uint32_t GlobalLiteralValuesCount;
uint32_t GlobalPointerValuesCount;
uint32_t GlobalStringValuesCount;

uint32_t NamesPoolCount;

uint8_t TypeSpecSize;
uint8_t FieldSpecSize;
uint8_t GlobalLiteralSpecSize;
uint8_t GlobalPointerSpecSize;
uint8_t GlobalStringSpecSize;
} Directory;
uint32_t PlatformFlags;
uint32_t BaselineName;
struct TypeSpec Types[CDacBlobTypesCount];
struct FieldSpec FieldsPool[CDacBlobFieldsPoolCount];
struct GlobalLiteralSpec GlobalLiteralValues[CDacBlobGlobalLiteralsCount];
struct GlobalPointerSpec GlobalPointerValues[CDacBlobGlobalPointersCount];
struct GlobalStringSpec GlobalStringValues[CDacBlobGlobalStringsCount];
uint8_t NamesPool[sizeof(struct CDacStringPoolSizes)];
uint8_t EndMagic[4];
};
@@ -242,16 +270,19 @@ struct MagicAndBlob BlobDataDescriptor = {
/* .FieldsPoolStart = */ offsetof(struct BinaryBlobDataDescriptor, FieldsPool),
/* .GlobalLiteralValuesStart = */ offsetof(struct BinaryBlobDataDescriptor, GlobalLiteralValues),
/* .GlobalPointersStart = */ offsetof(struct BinaryBlobDataDescriptor, GlobalPointerValues),
/* .GlobalStringValuesStart = */ offsetof(struct BinaryBlobDataDescriptor, GlobalStringValues),
/* .NamesPoolStart = */ offsetof(struct BinaryBlobDataDescriptor, NamesPool),
/* .TypeCount = */ CDacBlobTypesCount,
/* .FieldsPoolCount = */ CDacBlobFieldsPoolCount,
/* .GlobalLiteralValuesCount = */ CDacBlobGlobalLiteralsCount,
/* .GlobalPointerValuesCount = */ CDacBlobGlobalPointersCount,
/* .GlobalStringValuesCount = */ CDacBlobGlobalStringsCount,
/* .NamesPoolCount = */ sizeof(struct CDacStringPoolSizes),
/* .TypeSpecSize = */ sizeof(struct TypeSpec),
/* .FieldSpecSize = */ sizeof(struct FieldSpec),
/* .GlobalLiteralSpecSize = */ sizeof(struct GlobalLiteralSpec),
/* .GlobalPointerSpecSize = */ sizeof(struct GlobalPointerSpec),
/* .GlobalStringSpecSize = */ sizeof(struct GlobalStringSpec)
},
/* .PlatformFlags = */ (sizeof(void*) == 4 ? 0x02 : 0) | 0x01,
/* .BaselineName = */ offsetof(struct CDacStringPoolSizes, cdac_string_pool_baseline_),
@@ -287,10 +318,16 @@ struct MagicAndBlob BlobDataDescriptor = {
#include "datadescriptor.h"
},

/* .GlobalStringValues = */ {
#define CDAC_GLOBAL_STRING(name,value) { /* .Name = */ GET_GLOBAL_NAME(name), /* .StringValue = */ GET_GLOBALSTRING_VALUE(name) },
#include "datadescriptor.h"
},

/* .NamesPool = */ ("\0" // starts with a nul
#define CDAC_BASELINE(name) name "\0"
#define CDAC_TYPE_BEGIN(name) #name "\0"
#define CDAC_TYPE_FIELD(tyname,membertyname,membername,offset) #membername "\0" #membertyname "\0"
#define CDAC_GLOBAL_STRING(name,value) #name "\0" STRINGIFY(value) "\0"
#define CDAC_GLOBAL_POINTER(name,value) #name "\0"
#define CDAC_GLOBAL(name,tyname,value) #name "\0" #tyname "\0"
#include "datadescriptor.h"
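
The string-pool technique used throughout this file (a struct whose member sizes equal the string lengths, with `offsetof` giving each string's position in the pool) can be sketched on its own. Everything below is a hand-written miniature with hypothetical names and values; the real file generates the equivalent declarations from `datadescriptor.h` via macros.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Each member's size equals the length, including the terminating nul, of one pooled string.
struct ExamplePoolSizes
{
    char start_nul[1];
    char name_Architecture[sizeof("Architecture")];
    char value_Architecture[sizeof("x64")];
    char name_OperatingSystem[sizeof("OperatingSystem")];
    char value_OperatingSystem[sizeof("windows")];
};

// The pool concatenates the same strings in the same order, so member offsets are
// string offsets. The final literal relies on the implicit terminating nul.
static const char ExamplePool[sizeof(ExamplePoolSizes)] =
    "\0" "Architecture\0" "x64\0" "OperatingSystem\0" "windows";

// Same shape as GlobalStringSpec in the diff: two offsets into the pool.
struct ExampleStringSpec
{
    uint32_t Name;
    uint32_t StringValue;
};

static const ExampleStringSpec ExampleStrings[] = {
    { static_cast<uint32_t>(offsetof(ExamplePoolSizes, name_Architecture)),
      static_cast<uint32_t>(offsetof(ExamplePoolSizes, value_Architecture)) },
    { static_cast<uint32_t>(offsetof(ExamplePoolSizes, name_OperatingSystem)),
      static_cast<uint32_t>(offsetof(ExamplePoolSizes, value_OperatingSystem)) },
};
```

A reader can then recover a global's name and value as `ExamplePool + spec.Name` and `ExamplePool + spec.StringValue`; no pointers are stored in the blob, only offsets.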