Panic during component instantiation #10802

Closed
moldhouse opened this issue May 19, 2025 · 9 comments · Fixed by #10803
Labels
bug Incorrect behavior in the current implementation that needs fixing

Comments

@moldhouse

Test Case

Work in progress. So far we have only witnessed this in our production code base, and we are working on a minimal example. In the meantime, we thought it may be valuable to share the backtrace with you up front.

Steps to Reproduce

See above.

Expected Results

Instantiating the component without a panic.

Actual Results

We encounter the following panic:

thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/rustix-1.0.5/src/backend/linux_raw/param/auxv.rs:302:68:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0: __rustc::rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic
   3: core::option::unwrap_failed
   4: rustix::backend::param::auxv::init_auxv_impl
   5: rustix::backend::param::auxv::init_auxv
   6: <wasmtime::runtime::vm::instance::allocator::on_demand::OnDemandInstanceAllocator as wasmtime::runtime::vm::instance::allocator::InstanceAllocatorImpl>::allocate_fiber_stack
   7: wasmtime::runtime::component::instance::InstancePre<T>::instantiate_async::{{closure}}
   8: pharia_kernel::skills::v0_3::skill::SkillPre<_T>::instantiate_async::{{closure}}
   9: <pharia_kernel::skills::v0_3::skill::SkillPre<engine_room::LinkerImpl<alloc::boxed::Box<dyn pharia_kernel::csi::CsiForSkills+core::marker::Send>>> as pharia_kernel::skills::Skill>::run_as_function::{{closure}}
  10: pharia_kernel::skill_runtime::SkillRuntimeActor<C,S>::run::{{closure}}::{{closure}}
  11: <futures_util::stream::stream::select_next_some::SelectNextSome<St> as core::future::future::Future>::poll
  12: tokio::runtime::task::core::Core<T,S>::poll
  13: tokio::runtime::task::raw::poll
  14: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  15: tokio::runtime::task::raw::poll

Versions and Environment

We see the panic with wasmtime 32 but not with 31. The problem does not occur on all platforms.

We saw it on:

  • macOS (arm) running an Ubuntu 24 container (always)
  • GitHub CI running the same container (sometimes, flaky)

We did not see it on:

  • macOS (arm) without a container
  • the container running in our production environment
@moldhouse moldhouse added the bug Incorrect behavior in the current implementation that needs fixing label May 19, 2025
@moldhouse
Author

@pacman82 @markus-klein-aa

@bjorn3
Contributor

bjorn3 commented May 19, 2025

The error happens at https://github.com/bytecodealliance/rustix/blob/cb01fbe4660844b67fdd4eee2a5f769518f6a655/src/backend/linux_raw/param/auxv.rs#L302, which indicates that one of the auxv entries for the process may be incorrect. By the way, are you running an arm64 version of Wasmtime on your Mac, or an x86_64 version?
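
For anyone who wants to cross-check what the kernel hands the process, a minimal sketch along these lines can print the relevant auxv entries. It uses the `libc` crate's `getauxval` purely for illustration (an assumption, not code from this issue); rustix's linux_raw backend reads auxv through its own path, so this is only a sanity check of the values themselves.

// Hypothetical auxv cross-check, not part of Wasmtime or rustix.
// Build with the `libc` crate on a Linux target.
fn main() {
    // AT_PAGESZ is the auxv entry a page-size lookup ultimately depends on.
    let page_size = unsafe { libc::getauxval(libc::AT_PAGESZ) };
    // AT_HWCAP describes CPU capabilities; comparing it between the failing
    // and working environments may show whether auxv differs under emulation.
    let hwcap = unsafe { libc::getauxval(libc::AT_HWCAP) };
    println!("AT_PAGESZ = {page_size}");
    println!("AT_HWCAP  = {hwcap:#x}");
}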

@ghost

ghost commented May 19, 2025

On CI we build for x86_64. Locally, on our macOS developer machines, we build with:

podman build . --tag pharia-kernel --platform linux/arm64

I am walking back my claim a bit that we witness the same error on CI. We still need to validate that; it might be a different issue.

@ghost

ghost commented May 19, 2025

Verified that it also fails locally on our dev machines if we build for x86_64.

However, it does not fail if we build natively, without a container.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue May 19, 2025
Currently Wasmtime has a function `crate::runtime::vm::host_page_size`
but this is only used sometimes and the rest of the time
`rustix::param::page_size` is used in a few locations. It looks like
this usage of `rustix` is causing a panic in bytecodealliance#10802 and additionally
it's best to only have one source for this, so this commit updates this
all to route through our preexisting `host_page_size` function.
@alexcrichton
Member

I'm not sure why this is panicking, as I'm not familiar with auxv or with how rustix calculates the host page size, but I've submitted #10803 to remove calls to this function, which will somewhat indirectly "fix" this insofar as Wasmtime won't panic at that location any more.

@markus-klein-aa if you're able to reduce this, I believe the rustix project would likely be thankful to have an issue about this panic on their issue tracker.

@ghost

ghost commented May 19, 2025

Yeah, we absolutely want a minimal example. However, that takes some effort, and we thought there might be value in sharing the stack trace up front.

@alexcrichton
Member

FWIW, the reproduction will likely be just invoking this function, and that's pretty much it. The main thing to reproduce is the environment that triggers this panic.
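
A minimal reproduction in that spirit might look like the sketch below, assuming a crate that depends on `rustix` with its `param` feature enabled (the dependency setup is an assumption; the call itself matches the frame in the reported backtrace).

// Sketch of a minimal reproducer: simply call the page-size lookup that
// sits at the bottom of the reported backtrace, inside the failing
// container environment.
fn main() {
    // On rustix's linux_raw backend this consults auxv, which is where the
    // `Option::unwrap()` panic was reported.
    let page_size = rustix::param::page_size();
    println!("host page size: {page_size}");
}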

@ghost

ghost commented May 19, 2025

Hey, thanks for the hint. I'll give it a try!

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue May 20, 2025
Currently Wasmtime has a function `crate::runtime::vm::host_page_size`
but this isn't reachable from the `wasmtime-fiber` crate and instead that
crate uses `rustix::param::page_size` to determine the host page size.
It looks like this usage of `rustix` is causing a panic in bytecodealliance#10802.
Ideally `wasmtime-fiber` would be able to use the same function but the
crate separation does not currently make that feasible. For now
duplicate the logic of `wasmtime` into `wasmtime-fiber` as it's modest
enough to ensure that this does not panic.

Closes bytecodealliance#10802
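
The duplicated helper itself is not quoted in this thread; as a rough sketch of the pattern such a cached page-size lookup usually follows (illustrative only, using `libc::sysconf`, and an assumption about the shape of the fix rather than the actual wasmtime-fiber code):

use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative cached page-size helper; a sketch of the general pattern,
// not the code that landed in wasmtime-fiber.
pub fn host_page_size() -> usize {
    static PAGE_SIZE: AtomicUsize = AtomicUsize::new(0);
    match PAGE_SIZE.load(Ordering::Relaxed) {
        0 => {
            // First call in this process: ask libc and cache the result.
            let size = unsafe { libc::sysconf(libc::_SC_PAGESIZE) as usize };
            assert!(size != 0, "sysconf(_SC_PAGESIZE) returned 0");
            PAGE_SIZE.store(size, Ordering::Relaxed);
            size
        }
        cached => cached,
    }
}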
@ghost

ghost commented May 21, 2025

@alexcrichton You were correct that we only need to call the function to reproduce it. I have opened an issue in rustix. Thanks again.

github-merge-queue bot pushed a commit that referenced this issue May 22, 2025
* Duplicate page size determination in `wasmtime-fiber`

Currently Wasmtime has a function `crate::runtime::vm::host_page_size`
but this isn't reachable from the `wasmtime-fiber` crate and instead that
crate uses `rustix::param::page_size` to determine the host page size.
It looks like this usage of `rustix` is causing a panic in #10802.
Ideally `wasmtime-fiber` would be able to use the same function but the
crate separation does not currently make that feasible. For now
duplicate the logic of `wasmtime` into `wasmtime-fiber` as it's modest
enough to ensure that this does not panic.

Closes #10802

* Run full test suite in CI

prtest:full