std::bad_cast exception from std::regex when using vkconfig to enable vulkan layers #694
Comments
Hello Jeremy, thank you for your report. This is a bit of processing on the minidump:
|
That's roughly consistent with what I see locally with gdb. Initially I was getting this exception inside std::regex, but now it looks like it is in std::iostream.
|
Can you provide the |
@smcv is our specialist for this kind of thing, but I'm not sure it's particularly good or intentional that the internal usage of There are probably several problems at play here, one of them being our exposing of these symbols in First order of business is probably to chase broader reproduction and see if we see this happening on Ubuntu 22 or a non-Debian distro (Arch possibly since it's our primary dev environment). |
I think that's probably the root cause here. Is
This sounds like some component is (quite reasonably!) complaining that the runtime type information for std::regex implementation A doesn't say it's an instance of std::regex implementation B. |
One possible mitigation for this would be having Vulkan-Loader load its layers with |
The loader is using |
having the layer statically link against libstdc++ seems to avoid the problem:
|
Yeah, that's a useful workaround if you can arrange for it to happen on the host system. It would be better if we can avoid the Source2 engine triggering this problem from the engine end, but that's difficult to achieve (I have a couple of ideas which might potentially help there, I'll see what happens after testing them). The fundamental problem here is that there are two copies of |
That means the Vulkan layer can't cause the equivalent of this crash in Source2 engine code, but the Source2 engine can still cause this crash in Vulkan layer code. Using |
What does vkconfig actually do to get CS2 to load these layers? I'm trying to keep my test systems relatively "ordinary" to make my testing as representative as possible of what will happen on end-user systems, so if possible I'd prefer not to install software from outside the official Arch and Ubuntu repositories that end users would not normally have, which includes the external Vulkan SDK packages. |
OK, vkconfig is available in Arch (packaged in It looks as though it arranges for extra layers to be loaded by writing out |
Exactly right. Does the runtime do something special for layers set in the VK_INSTANCE_LAYERS environment variable? That works on my system when vkconfig doesn't. And at this point it is the only part of this issue that doesn't make sense to me. |
No, the source code doesn't even mention it. At the moment we "capture" every implicit and explicit layer from the host system, together with its recursive dependencies, and let Vulkan-Loader worry about whether the layer is actually runtime-enabled or not. We use the environment variables that are part of how layers are discovered, like (We're actually thinking of changing that, to avoid disabled layers creating action-at-a-distance side-effects - but that isn't implemented yet.) You've said that when the validation layer is activated via |
I tried to reproduce this on an up-to-date Ubuntu 24.04 Nvidia system with the distro's validation layer (version 1.3.275.0-1), by creating this:
If you use a similar hand-written JSON file instead of |
Good idea. I was able to reproduce with the following: ~/.local/share/vulkan/implicit_layer.d/bad_cast.json:
~/.local/share/vulkan/settings.d/vk_layer_settings.txt:
I think there are two key parts:
|
I created a simplified version of this situation to test various ideas against, and I found that in that project, I had to link the mockup Vulkan layer with |
Aha, that's useful information. I couldn't reproduce this today, but I'll try again tomorrow armed with that additional knowledge. |
On Ubuntu 24.04 + Nvidia + proprietary driver, I can reproduce a crash. Steps to reproduce:
Unfortunately
Crash dumps: Hopefully this is enough for an engine developer to at least confirm whether I reproduced the same crash? And maybe those steps help an engine developer to reproduce this. On Arch + AMD + Mesa, I couldn't reproduce this by the same method (this time the validation layer was from
If I'm reading the specification correctly, |
On reflection, I don't think this is actually a Steam Runtime problem, more like a game-engine problem. If we were running the game without using the SLR container, its statically-linked copy of (part of) our SDK's compiler's libstdc++ would be equally able to conflict with the host system's libstdc++, unless they happened to be the same version (which is obviously not possible in general, because the environment used to compile the game can't possibly have the same libstdc++ version as both Arch and Ubuntu at the same time). So I think this issue report should perhaps move to https://github.com/ValveSoftware/csgo-osx-linux, or be closed as "not planned" for the container runtime - it's only CS2 developers who will be in a position to be able to address this.
I made a simplified mockup of this situation, and while I couldn't figure out how to reproduce a similar crash, I was able to test out my ideas by inspecting the symbol tables. Based on that, I've made some suggestions internally (reference for maintainers: steamrt/tasks#547) which might provide ways to resolve this. Until then, you have some workarounds (linking your Vulkan layer's libstdc++ statically as well, or loading it via |
I'd rather keep it here since it's focused on the general problem of distributing binaries on Linux that steam-runtime aims to fix, and generally not a problem restricted to CS2. |
Yeah, reasonable. Perhaps when we've figured out how to avoid this problem in CS2, we can add a note to https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/docs/slr-for-game-developers.md?ref_type=heads#making-a-game-container-friendly (or some more general document about building generally-usable Linux binaries) with recommendations for good patterns to use and anti-patterns to avoid? |
Your system information
steamapps/common/SteamLinuxRuntime/VERSIONS.txt? not installed
steamapps/common/SteamLinuxRuntime_soldier/VERSIONS.txt? not installed
steamapps/common/SteamLinuxRuntime_sniper/VERSIONS.txt?
Please describe your issue in as much detail as possible:
I was trying to use a development version of a Vulkan debug tool called Crash Diagnostic Layer (https://github.com/LunarG/CrashDiagnosticLayer). When using local builds of this layer AND enabling it with vkconfig, I get std::bad_cast. This also happens when using the Vulkan-ValidationLayer.
Local builds of both of the above layers work correctly when enabled using the environment variables VK_INSTANCE_LAYERS and VK_LAYER_PATH.
Log from using vkconfig:
Steps for reproducing this issue:
build/src subdirectory of the local CrashDiagnosticLayer build.
crash_20240925103101_2.dmp.gz