c-api: component-model: Values and function calling #10697

MangoPeachGrape · 2025-04-29T16:36:38Z

Only contains primitive values and no docs as of now. Trying to gauge if this is the correct way to approach this.

Some things of note:

wasmtime_component_valunion_t::boolean instead of ::bool because keyword.
wasmtime_component_valunion_t::f32 and WASMTIME_COMPONENT_F32 instead of float32, should this be changed to be consistent?
C++20 needed for syntax used when creating wasmtime_component_val_t in the test, so I bumped to 20 only for tests. Is that fine? Could the whole project be bumped to 20?

Also, is there any better way to test all values, other than copying the test for each kind of value?

ac000 · 2025-04-29T23:30:24Z

Could the whole project be bumped to 20?

My feeling is that is a bit too recent; even GCC 15.1, just released, still defaults to -std=gnu++17 for C++. Though admittedly I don't really know which bits of wasmtime that would affect. Like how would that affect building wasmtime on older (but still supported, thinking Debian, RHEL etc) Linux distributions?

Also, is there any better way to test all values, other than copying the test for each kind of value?

An array of structs of the values perhaps?
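A minimal sketch of the array-of-structs idea: each test case is one element of a table, and a single loop runs the same round trip for all of them. Every `tbl_*` name here is a hypothetical stand-in, and `tbl_call_identity` only imitates a guest export that returns its argument; none of this is the wasmtime API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the wasmtime types. */
typedef enum { TBL_BOOL, TBL_S32, TBL_U64 } tbl_kind_t;

typedef struct {
  tbl_kind_t kind;
  union {
    bool boolean;
    int32_t s32;
    uint64_t u64;
  } of;
} tbl_val_t;

/* Stand-in for invoking a guest export that returns its argument. */
static tbl_val_t tbl_call_identity(tbl_val_t arg) { return arg; }

/* Compare two values by kind, then by the active payload. */
static int tbl_val_eq(const tbl_val_t *a, const tbl_val_t *b) {
  if (a->kind != b->kind)
    return 0;
  switch (a->kind) {
  case TBL_BOOL:
    return a->of.boolean == b->of.boolean;
  case TBL_S32:
    return a->of.s32 == b->of.s32;
  case TBL_U64:
    return a->of.u64 == b->of.u64;
  }
  return 0;
}

/* Run every case through the same round trip; returns how many passed. */
static size_t tbl_run_cases(void) {
  const tbl_val_t cases[] = {
      {.kind = TBL_BOOL, .of = {.boolean = true}},
      {.kind = TBL_S32, .of = {.s32 = -5}},
      {.kind = TBL_U64, .of = {.u64 = UINT64_MAX}},
  };
  size_t passed = 0;
  for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
    tbl_val_t out = tbl_call_identity(cases[i]);
    passed += (size_t)tbl_val_eq(&cases[i], &out);
  }
  return passed;
}
```

Adding a new kind then costs one table row plus one comparison arm, rather than a copied test body.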

MangoPeachGrape · 2025-04-30T11:48:00Z

My feeling is that is a bit too recent; even GCC 15.1, just released, still defaults to -std=gnu++17 for C++. Though admittedly I don't really know which bits of wasmtime that would affect. Like how would that affect building wasmtime on older (but still supported, thinking Debian, RHEL etc) Linux distributions?

The feature used was 'designated initializer', which was implemented in gcc 8 and clang 10, in 2018 and 2020 respectively. AFAIK it would only be required for building tests, and when a consumer depends on the wasmtime-cpp project, as the build step for the C-API doesn't build any C++ code.

Feel free to let me know if it's too new/not wanted!
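For context, a minimal sketch of the designated-initializer syntax in question, applied to a tagged union. The `demo_*` types are hypothetical stand-ins shaped like the value type discussed here, not the definitions from this PR. The nested `.of = {.u32 = x}` form is used because it is accepted by C99 and, as far as I know, by C++20 as well, whereas the chained `.of.u32 = x` form is C only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the value kind and payload union. */
typedef enum { DEMO_BOOL, DEMO_U32, DEMO_F32 } demo_kind_t;

typedef union {
  bool boolean; /* named `boolean` because `bool` is a keyword in C++ */
  uint32_t u32;
  float f32;
} demo_valunion_t;

typedef struct {
  demo_kind_t kind;
  demo_valunion_t of;
} demo_val_t;

/* Build a u32 value, naming the tag and the active union member. */
static demo_val_t demo_make_u32(uint32_t x) {
  demo_val_t v = {.kind = DEMO_U32, .of = {.u32 = x}};
  return v;
}
```

Without designators, a brace initializer can only set the first union member, which is why the tests lean on this syntax.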

ac000 · 2025-04-30T13:47:30Z

The feature used was 'designated initializer', which was implemented in gcc 8 and clang 10, in 2018 and 2020

Heh, only 20 years after C!

respectively. AFAIK it would only be required for building tests, and when a consumer depends on the wasmtime-cpp project, as the build step for the C-API doesn't build any C++ code.

Feel free to let me know if it's too new/not wanted!

Not that this is really any business of mine, but that's probably OK then, Debian 11 has GCC 10.x and Clang 11.x and RHEL/etc 8 has GCC 8.x

B.t.w thanks for doing this work!

alexcrichton · 2025-04-30T15:25:27Z

On the PR at-hand here, one thing I am worried about with values is ownership of lists and transitive pointers. For example the string variant here is currently implemented as an owning pointer which means that even for invocations such as wasmtime_component_func_call it'll require that ownership is passed into the component. It also means that on the "other half", defining a function in a linker, ownership will be given and the calling code will be responsible for deallocating it. More-or-less this is I think the main sticking point of component model values and figuring this out will pave the way for the rest of the API.

Historically I've thought that we want to not have ownership during wasmtime_component_func_call and maybe give ownership when wasm calls the host. That being said, this strategy has a large downside: wasmtime_component_val_t doesn't have static ownership semantics which makes it much more difficult to bind in languages and such. Static ownership semantics definitely makes the type more usable (as implemented here), but less efficient.

Personally I'm sort of leaning towards this-style API that you've prototyped here. It's less efficient than a hypothetical alternative but the "hypothetical alternative" is IMO so unusable it's not worth pursuing (e.g. not being able to write down static semantics for a type and how it's owned).

One way I can perhaps think about this API is that it's similar to wasmtime::component::Val which is intended for expressive power, not efficiency. The downside is that there's no way to implement a "typed" API in C so it means that this would be the only way to invoke a component, which isn't great. This is where I've historically gotten stuck...

For your specific questions though:

wasmtime_component_valunion_t::boolean instead of ::bool because keyword.

sounds reasonable to me yeah

wasmtime_component_valunion_t::f32 and WASMTIME_COMPONENT_F32 instead of float32, should this be changed to be consistent?

The "f32" naming is the right naming to use, we just haven't completed the rename from "float32" to "f32" in all places yet (but if you see them feel free to flag them)

C++20 needed for syntax used when creating wasmtime_component_val_t in the test, so I bumped to 20 only for tests. Is that fine? Could the whole project be bumped to 20?

It seems reasonable to bump this for tests, yeah, if it works on CI, given that it's test-only. I'm not really sure how this would impact consumers since this library is primarily built in Rust and otherwise just shipped as a bunch of headers, but we can handle anything in issues and such. As you say, -DBUILD_TESTS=OFF, which is the default, should I think handle most of the impact here.

Also, is there any better way to test all values, other than copying the test for each kind of value?

Given the dynamic ownership nature of values I think this is basically going to involve a lot of copy/paste. That's where I think C++ dtors can help a lot by reducing the amount of boilerplate, but that would also require binding in C++ APIs. Maybe something where tests are mostly calling a central helper function with a few customized callbacks? The callbacks would customize the module/value per-test in that case maybe?

And as a final thought: a natural dual for this feature will be the ability to define host functions that can be called from wasm (e.g. defining a function in a component linker). I've personally found values tricky enough in the past that I've found it worthwhile to implement both calling wasm and wasm calling the host at the same time. If you'd like I think it'd be reasonable to include the basics of such infrastructure in this PR. If you'd prefer not to, however, I think it's reasonable to defer this to a future PR as well.

MangoPeachGrape · 2025-04-30T21:17:49Z

wasmtime_component_val_t doesn't have static ownership semantics which makes it much more difficult to bind in languages and such. Static ownership semantics definitely makes the type more usable

It's less efficient than a hypothetical alternative

Could you expand on what does the "static ownership semantics" and the "hypothetical alternative" mean?

One idea that popped into my head was to have some helper functions for each type to lift/lower to the raw values, or maybe give access to the raw values? I have zero idea if that would be a valid approach. I might be missing something big, as I don't have much knowledge of the existing codebase at all.

alexcrichton · 2025-04-30T22:09:32Z

Oh sure, by static ownership semantics I mean that a type always prescribes a particular way to manage its memory. For example wasmtime_component_t always requires that the caller deallocates it, no questions asked. This is also reflected in your wasmtime_component_val_t where the string variant is "always owned" and whenever it's passed along that's a transfer of ownership.

This is in contrast to thinking I've had historically about component values. For example invoking a component function does not in theory require giving up ownership of anything. You could in theory pass in string pointers that aren't free'd by wasmtime_component_func_call. If we were to go down this road though this is the "hypothetical alternative" where all of a sudden the memory management of wasmtime_component_val_t is dependent on where the value came from. Sometimes it might be transferred by ownership, sometimes not. Personally I think such a system, while possibly more efficient, is too complex to reliably use correctly.

Basically what you've sketched out here is I think a good idea and we should keep it this way.

In terms of prototyping I think it might be useful to, here in this PR, sketch out not only the terminal types of wasmtime_component_val_t but something that also involves recursion, for example a record type. That might help sort out some of these questions perhaps?

MangoPeachGrape · 2025-05-05T20:39:50Z

After thinking about this for a few days, I started to dislike passing args as mut, and the forced allocation. For strings, requiring the C-API to allocate a wasm_name_t isn't a big issue, as Rust can use that allocation, but for lists and other values that isn't possible.

What I tried to achieve in this iteration is being able to create arguments for a function fully on the stack, while Rust is still able to return values.

In practice this would mean that you need to call wasmtime_component_val_delete() on each return value that Rust might have allocated.

Please let me know if you feel like I'm going in the wrong direction!

alexcrichton · 2025-05-06T14:39:33Z

This is possible yeah and it's the alternative I mentioned above too. The main downside is that this is quite difficult to bind in other languages because the ownership semantics of wasmtime_component_val_t is not clear. Sometimes you need to call a destructor and sometimes not, and that makes it quite difficult to use correctly in any context, including bindings in other languages.

This is basically where I've always stopped short historically in thinking about what this would look like. I get stuck here not knowing how best to bind component values. I've yet to find a scheme that feels like it balances pros/cons effectively unfortunately :(

My best thinking at this time is that we should have two methods of calling component functions, similar to the wasmtime_func_call{,_unchecked}. One takes wasmtime_component_val_t which is flexible, but slow. This would look mostly like this PR's wasmtime_component_val_t, but it would always have a destructor with it. How exactly that would work out I'm not entirely certain. The "unchecked" path though would take a "canonical ABI blob" which is sort of a packed representation of the canonical ABI of a type, but for the host, and in theory has much lower overhead.

Or... something like that, I really don't have concrete ideas about how best to progress here. I really am worried about this style of API though, in that I don't know how to bind it in languages like Python safely.

MangoPeachGrape · 2025-05-06T18:35:01Z

Would something like this be an acceptable API for bindings in other languages:

wasmtime_component_val_t argument = wasmtime_component_valrecord_new(2);
assert(argument.kind == WASMTIME_COMPONENT_RECORD);
assert(argument.of.record.len == 2);

wasm_name_new_from_string(&argument.of.record.ptr[0].name, "first");
argument.of.record.ptr[0].val.kind = WASMTIME_COMPONENT_U32;
argument.of.record.ptr[0].val.of.u32 = 1;

wasm_name_new_from_string(&argument.of.record.ptr[1].name, "second");
argument.of.record.ptr[1].val.kind = WASMTIME_COMPONENT_U32;
argument.of.record.ptr[1].val.of.u32 = 1;

wasmtime_component_val_t result = {0};

wasmtime_component_func_call(func, &argument, 1, &result, 1);

wasmtime_component_val_delete(&argument);
wasmtime_component_val_delete(&result);

This kind of API would allow the record entries to be both heap allocated (like this example) or stack allocated (not calling wasmtime_component_valrecord_new() and wasmtime_component_val_delete()). I haven't looked at the Python bindings, does it allow having wasmtime_component_val_t on the stack, does that also have to be a pointer to heap?

The "unchecked" path though would take a "canonical ABI blob" which is sort of a packed representation of the canonical ABI of a type, but for the host, and in theory has much lower overhead.

I agree, that would be great in the long run.

Why I leaned towards having the option for stack allocated values is that I thought it would make some cases simpler and faster, like:

wasmtime_component_valrecord_entry_t entry = (wasmtime_component_valrecord_entry_t) {
    .name.ptr = "namee",
    .name.len = 5,
    .val.kind = WASMTIME_COMPONENT_U8,
    .val.of.u8 = 123,
};
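wasmtime_component_val_t result = {0};
wasmtime_component_func_call(
    func,
    &(wasmtime_component_val_t) {
        .kind = WASMTIME_COMPONENT_RECORD,
        .of.record.ptr = &entry,
        .of.record.len = 1
    },
    1,
    &result, // `return` is a keyword, so the out-param is named `result`
    1
);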

(not sure if that's 100% correct C syntax, but the idea should remain)

alexcrichton · 2025-05-06T20:16:18Z

I think the trickier parts of memory management are going to come up during host-defined functions being inserted into a linker. For example when arguments are passed to the host, should the host or wasmtime deallocate them? When arguments are returned, does wasmtime need to deallocate them? The precise answer to this question ends up informing how difficult this is to bind in other languages.

For other languages though I don't really have a concrete worry per se. Historically wasmtime_val_t had an allocation inside of it for wasmtime_externref_t which I remember being a huge pain binding. Everything got way easier when it became a bland u32 from a guest bindings perspective. Unfortunately though I don't recall the exact pain and issues that I came up with. I think the only real way to bottom out such a concern would be to write such bindings, but I also don't think that's reasonable to expect of you.

Perhaps though another radical alternative. This is something I've had rattling around in my head for awhile that I keep forgetting about and have also not fully fleshed out. What I'm imagining is something like this:

typedef struct wasmtime_component_vals wasmtime_component_vals_t;
typedef struct wasmtime_component_call wasmtime_component_call_t;

wasmtime_component_call_t *call = wasmtime_component_func_call_start(store, &func);
wasmtime_component_vals_t *params = wasmtime_component_call_params(call);
// set the first parameter, in this case a `u32`
wasmtime_component_vals_set_u32(params, 3);
// set the second parameter, in this case a `string`
wasmtime_component_vals_set_string(params, &my_variable_typed_as_wasm_name_t);
// set the third parameter, in this case a `record point { x: u32, y: u32 }`
wasmtime_component_vals_set_record(params, /*nfields=*/ 2);
  // set "x: 0"
wasmtime_component_vals_set_record_field(params, &x_as_wasm_name_t);
wasmtime_component_vals_set_u32(params, 0);
  // set "y: 1"
wasmtime_component_vals_set_record_field(params, &y_as_wasm_name_t);
wasmtime_component_vals_set_u32(params, 1);

// dispatch the call
wasmtime_component_vals_t *results = wasmtime_component_call_finish(call);

// if the result is a u32
wasmtime_component_vals_get_u32(results, &ret);
// if the result is a string
wasmtime_component_vals_get_string(results, &something_with_wasm_name_t);
// etc ..

The rough idea is that this is a much "chattier" C ABI boundary but is, in theory, much more flexible about where things are stored and how exactly the host represents things. The wasmtime_component_vals_t type is an "iterator" of sorts that walks over the structure of the type tree in tandem with where values are actually being stored in a WebAssembly module's linear memory and such. This would be, in essence, interleaving lowering code with actually storing into wasm linear memory.

Actually implementing this would require new support on the Wasmtime side of things, which would arguably be a good thing as well. Whether or not this is a good idea I don't know, this definitely isn't a fully fleshed out idea. It would, however, remove the need for strict ownership around wasmtime_component_val_t
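To make "interleaving lowering code with actually storing" concrete, here is a small sketch of a cursor that writes each value straight into a destination buffer as it is set, with no intermediate value tree. Everything in it is an assumption for illustration: `cur_t`, `cur_set_u32`, and `cur_set_point` are hypothetical names, a plain byte array stands in for guest linear memory, and the layout is limited to `u32` fields stored little-endian at 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a buffer standing in for guest memory. */
typedef struct {
  uint8_t *buf;
  size_t cap;
  size_t off;
} cur_t;

/* Write one u32 at the next 4-byte-aligned offset, little-endian.
   Returns 0 on success, -1 if the buffer is too small. */
static int cur_set_u32(cur_t *c, uint32_t x) {
  size_t at = (c->off + 3u) & ~(size_t)3u;
  if (at > c->cap || c->cap - at < 4)
    return -1;
  for (int i = 0; i < 4; i++)
    c->buf[at + i] = (uint8_t)(x >> (8 * i));
  c->off = at + 4;
  return 0;
}

/* Lower `record point { x: u32, y: u32 }` one field at a time. */
static int cur_set_point(cur_t *c, uint32_t x, uint32_t y) {
  if (cur_set_u32(c, x) != 0)
    return -1;
  return cur_set_u32(c, y);
}
```

Because the caller never hands over a value tree, there is nothing left to own or free afterwards, which is the property the chattier design is after.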

MangoPeachGrape · 2025-05-06T20:53:10Z

I think the trickier parts of memory management are going to come up during host-defined functions being inserted into a linker. For example when arguments are passed to the host, should the host or wasmtime deallocate them? When arguments are returned, does wasmtime need to deallocate them? The precise answer to this question ends up informing how difficult this is to bind in other languages.

These are the semantics I've been thinking of:

When calling a guest function, you own both the arguments and return values, so you need to clean both up (i.e. call wasmtime_component_val_delete()).
When the guest calls your host function, the arguments and return values will be owned by Rust, so if you create a record return value (what would be wasmtime_component_valrecord_new(/* size */) from my last comment) and assign it to the return value array, then the Rust side cleans it up after your function has executed.

Edit: Just to be clear (because the current naming of wasmtime_component_val_delete() isn't great), wasmtime_component_val_delete() doesn't actually delete the pointer passed to it, it just calls the destructor on the pointer, which will delete the allocations inside the value, e.g. a record that has allocated entries.

I feel like this is what I had initially, just that you need to clean up the arguments you pass to a function after calling it, which I feel should be fine, as return values would have had to be cleaned up anyway.
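The delete semantics described in the edit note can be sketched as follows: the destructor releases what the value owns, recursively, but never frees the storage of the value itself, so a stack-allocated value stays valid storage afterwards. The `sk_*` names are hypothetical stand-ins and field names are borrowed string literals for brevity; this is an illustration of the rule, not the implementation in this PR.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { SK_U32, SK_RECORD } sk_kind_t;
typedef struct sk_entry sk_entry_t;

typedef struct sk_val {
  sk_kind_t kind;
  union {
    uint32_t u32;
    struct {
      sk_entry_t *ptr; /* owned heap array of entries */
      size_t len;
    } record;
  } of;
} sk_val_t;

struct sk_entry {
  const char *name; /* borrowed in this sketch */
  sk_val_t val;
};

/* Allocate `len` zeroed entries; the value itself is returned by value. */
static sk_val_t sk_record_new(size_t len) {
  sk_val_t v;
  v.kind = SK_RECORD;
  v.of.record.ptr = calloc(len, sizeof(sk_entry_t));
  v.of.record.len = v.of.record.ptr ? len : 0;
  return v;
}

/* Destroy what `v` owns, recursively, without freeing `v` itself. */
static void sk_val_delete(sk_val_t *v) {
  if (v->kind != SK_RECORD)
    return;
  for (size_t i = 0; i < v->of.record.len; i++)
    sk_val_delete(&v->of.record.ptr[i].val);
  free(v->of.record.ptr);
  v->of.record.ptr = NULL;
  v->of.record.len = 0;
}
```

Resetting the pointer and length after the free makes a second `sk_val_delete` on the same value harmless in this sketch.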

Perhaps though another radical alternative. This is something I've had rattling around in my head for awhile that I keep forgetting about and have also not fully fleshed out. What I'm imagining is something like this:

I see the idea, I'll need some time to properly think through it.

I'll try to implement host functions now, to understand the problem better.

How should testing of records be done? I've been using this guest component locally:

record ccc {
	value: u64,
	multiplier: u64,
}

export cccc: func(a: ccc) -> ccc;

fn cccc(a: Ccc) -> Ccc {
    Ccc {
        value: a.value * a.multiplier,
        multiplier: a.multiplier,
    }
}

... and then calling it and checking the return value. Is there any reasonable way to test these kinds of values without a guest component written in a higher level language?
I saw the /examples/component/ that used a guest component written in rust, should it be done similarly here?

alexcrichton · 2025-05-08T15:45:39Z

Those ownership semantics make sense to me, but they're a bit tricky to bind in a higher level language like Python. For example Python will have some sort of Val type which is the union wasmtime_component_val_t under the hood. When calling a function you'd provide a list of Val and get back a list of Val, and that particular one would have to be owned and managed in Python itself. When wasm calls back into the host, in this case Python, then you'd still get a list of Val and produce a list of Val but the ownership is different where Python can't persist the Val beyond the function call, for example, so various pointers would have to be invalidated just before the host function returns. This is all doable, but will be tricky basically.

How should testing of records be done?

Yeah it's possible to use the text format of components, albeit it's a bit verbose. You can get a bit of a feel for the text format for components from this directory

MangoPeachGrape · 2025-05-14T01:44:43Z

Wrote some tests for the complex types, tried to abstract away most of the creation. Followed the tests in tests/all/component_model/ so I duplicated the REALLOC_AND_FREE helper. Not sure if it's needed though, as the current tests only require one allocation. Could you look thoroughly through the test code, as I was not at all familiar with the text format.

What are your current thoughts on the design? Do you think it's good enough to go forward?

I think one case is not covered: in a host function, setting the return value to the argument value, as doing rets[0] = args[0] would result in a double free. A wasmtime_component_val_clone() should fix this though.
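A minimal sketch of that hazard and the clone-based fix, using hypothetical `cl_*` stand-ins with a single owning string variant. A plain struct copy would leave both values pointing at one buffer, and each side's destructor would then free it; a deep clone gives the return value its own buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum { CL_U32, CL_STRING } cl_kind_t;

typedef struct {
  cl_kind_t kind;
  union {
    uint32_t u32;
    struct {
      char *ptr; /* owned, NUL-terminated in this sketch */
      size_t len;
    } string;
  } of;
} cl_val_t;

/* Create an owning string value by copying `len` bytes from `s`. */
static cl_val_t cl_string_new(const char *s, size_t len) {
  cl_val_t v;
  v.kind = CL_STRING;
  v.of.string.ptr = malloc(len + 1);
  v.of.string.len = v.of.string.ptr ? len : 0;
  if (v.of.string.ptr) {
    memcpy(v.of.string.ptr, s, len);
    v.of.string.ptr[len] = '\0';
  }
  return v;
}

/* Release the buffer an owning value holds, if any. */
static void cl_val_delete(cl_val_t *v) {
  if (v->kind == CL_STRING) {
    free(v->of.string.ptr);
    v->of.string.ptr = NULL;
    v->of.string.len = 0;
  }
}

/* Deep copy: the result owns a separate buffer. */
static cl_val_t cl_val_clone(const cl_val_t *src) {
  if (src->kind == CL_STRING && src->of.string.ptr)
    return cl_string_new(src->of.string.ptr, src->of.string.len);
  return *src;
}

/* Host function that echoes its first argument. */
static void cl_host_echo(const cl_val_t *args, cl_val_t *rets) {
  /* `rets[0] = args[0];` would alias one buffer that gets freed twice. */
  rets[0] = cl_val_clone(&args[0]);
}
```

The cost is one extra allocation per echoed value, which is the usability-over-efficiency trade already accepted earlier in this thread.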

alexcrichton

Ok thanks for being patient I was a bit too busy last week! Overall I think let's commit to this approach. This looks to be workable enough and if it's difficult to integrate into other languages we can tackle that then.

Thank you again very much for working on this!

crates/c-api/src/component/val.rs

Later we can provide implementations that take it as a value to avoid copying

MangoPeachGrape · 2025-05-20T18:31:56Z

Oops... thought I could rebase on top of main cleanly because I didn't get any conflicts, sorry.

MangoPeachGrape · 2025-05-22T14:19:26Z

Refactored to use declare_vecs!, but I'm not entirely happy with it:

_new(), _empty(), and _uninit() using an out param, instead of a return value, but I also understand the backward compatibility requirements of the other APIs...

_copy() and _delete() are duplicated for each inner type, like wasmtime_component_vallist_copy() and wasmtime_component_valrecord_copy().
What I've been pondering is would it be simpler to have a "global" wasmtime_component_val_copy() instead? I can't seem to think of scenarios where an inner type copy() would be needed, but they might exist?

Feel free to bikeshed all the names, like:
wasmtime_component_vallist_t vs wasmtime_component_val_list_t (i.e. should there be _)
wasmtime_component_valrecord_entry_t vs wasmtime_component_valrecord_field_t or something else?

I also assume the rest of the value types should come in a later PR?

alexcrichton

I think it's good to have wasmtime_component_val_copy but I also think it's good to have intermediate helpers as well in case they're needed, so personally I'm ok having both the top-level copy method plus helpers for intermediate ones as needed.

crates/c-api/src/component/func.rs

crates/c-api/include/wasmtime/component/val.h

MangoPeachGrape · 2025-05-22T19:15:41Z

Hmm, the doxygen output for the component files doesn't show up like it does for other files.
See https://docs.wasmtime.dev/c-api/component_2component_8h_source.html

alexcrichton · 2025-05-22T20:07:23Z

Odd! Not that I'm enough of a doxygen expert to know why...

alexcrichton · 2025-05-23T00:27:37Z

If this bounces again you can put prtest:full in a commit message somewhere and it'll run full CI on this PR instead of just a subset (avoids the need to have me in the middle of the iteration loop)

MangoPeachGrape requested a review from a team as a code owner April 29, 2025 16:36

MangoPeachGrape requested review from pchickey and removed request for a team April 29, 2025 16:36

github-actions bot added the wasmtime:c-api Issues pertaining to the C API. label Apr 29, 2025

pchickey requested review from alexcrichton and removed request for pchickey April 30, 2025 00:00

alexcrichton approved these changes May 20, 2025

MangoPeachGrape added 10 commits May 20, 2025 21:19

c-api: component-model: Primitive values

358f78a

c-api: component-model: Function calling

1dad666

A test

151581d

Take args as mut to avoid copying

a20c540

String and char

ac9eb55

Rethink value ownership semantics, add list values

77c103e

Record values

13e9be6

Make take Rust values as refs in ::from() functions

5040404

Later we can provide implementations that take it as a value to avoid copying

Define host functions

c2351c6

wasmtime_component_valrecord_new()

c9f7a35

Test lists

e644e3e

MangoPeachGrape force-pushed the c-api/component-model/val branch from b7f9dd6 to e644e3e Compare May 20, 2025 18:26

MangoPeachGrape added 2 commits May 20, 2025 21:33

Fix formatting

85f9c79

Use existing declare_vecs construct

d9e81ef

MangoPeachGrape force-pushed the c-api/component-model/val branch from 7467bd3 to d9e81ef Compare May 22, 2025 13:59

alexcrichton approved these changes May 22, 2025

Add rest of helper functions

73ec3c6

alexcrichton added this pull request to the merge queue May 22, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 22, 2025

Add documentation

c131719

Fix multiline comments

84f2298

alexcrichton added this pull request to the merge queue May 22, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 22, 2025

Third time's the charm

50b0270

alexcrichton added this pull request to the merge queue May 23, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 23, 2025

MangoPeachGrape added 4 commits May 23, 2025 13:25

Fourth time's the charm prtest:full

01bc672

Doxygen file headers

e8f72ef

Fix other missing documentation

c9fbe6b

Small fix to docs

4f6c70c

alexcrichton enabled auto-merge May 23, 2025 14:02

alexcrichton added this pull request to the merge queue May 23, 2025

Merged via the queue into bytecodealliance:main with commit b2c64de May 23, 2025
160 checks passed

MangoPeachGrape deleted the c-api/component-model/val branch May 23, 2025 14:36
