-
Notifications
You must be signed in to change notification settings - Fork 1.4k
c-api: component-model: Values and function calling #10697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
c-api: component-model: Values and function calling #10697
Conversation
My feeling is that is a bit too recent, even GCC 15.1 just released still defaults to
An array of structs of the values perhaps? |
The feature used was 'designated initializer', which was implemented in gcc 8 and clang 10, in 2018 and 2020 respectively. AFAIK it would only be required for building tests, and when a consumer depends on the wasmtime-cpp project, as the build step for the C-API doesn't build any C++ code. Feel free to let me know if its too new/not wanted! |
Heh, only 20 years after C!
Not that this is really any business of mine, but that's probably OK then, Debian 11 has GCC 10.x and Clang 11.x and RHEL/etc 8 has GCC 8.x B.t.w thanks for doing this work! |
On the PR at-hand here, one thing I am worried about with values is ownership of lists and transitive pointers. For example the string variant here is currently implemented as an owning pointer which means that even for invocations such as Historically I've thought that we want to not have ownership during Personally I'm sort of leaning towards this-style API that you've prototyped here. It's less efficient than a hypothetical alternative but the "hypothetical alternative" is IMO so unusable it's not worth pursuing (e.g. not being able to write down static semantics for a type and how it's owned). One way I can perhaps think about this API is that it's similar to For your specific questions though:
sounds reasonable to me yeah
The "f32" naming is the right naming to use, we just haven't completed the rename from "float32" to "f32" in all places yet (but if you see them feel free to flag them)
It seems reasonable to bump this for tests yeah if it works on CI given that it's test-only. I'm not really sure how this would impact consumers since this library is primarily built in Rust and otherwise just shipped as a bunch of headers, but we can handle anything in issues and such. As you say if
Given the dynamic ownership nature of values I think this is basically going to involve a lot of copy/paste. That's where I think C++ dtors can help a lot by reducing the amount of boilerplate, but that would also require binding in C++ APIs. Maybe something where tests are mostly calling a central helper function with a few customized callbacks? The callbacks would customize the module/value per-test in that case maybe? And as a final thought: a natural dual for this feature will be the ability to define host functions that can be called from wasm (e.g. defining a function in a component linker). I've personally found values tricky enough in the past that I've found it worthwhile to implement both calling wasm and wasm calling the host at the same time. If you'd like I think it'd be reasonable to include the basics of such infrastructure in this PR. If you'd prefer not to, however, I think it's reasonable to defer this to a future PR as well. |
Could you expand on what does the "static ownership semantics" and the "hypothetical alternative" mean? One idea that popped into my head was to have some helper functions for each type to lift/lower to the raw values, or maybe give access to the raw values? I have zero idea if that would be a valid approach. I might be missing something big, as I don't have much at all knowledge of the existing codebase. |
Oh sure, by static ownership semantics I mean that a type always prescribes a particular way to manage its memory. For example This is in contrast to thinking I've had historically about component values. For example invoking a component function does not in theory require giving up ownership of anything. You could in theory pass in string pointers that aren't free'd by Basically what you've sketched out here is I think a good idea and we should keep it this way. In terms of prototyping I think it might be useful to, here in this PR, sketch out not only the terminal types of |
After thinking about this for few days, I started to dislike passing args as mut, and the forced allocation. For strings requiring the C-API to allocate a wasm_name_t isn't a big issue, as rust can use that allocation, but for lists and other values that isn't possible. What I tried to achiveve in this iteration is being able to create arguments for a function fully on the stack, and rust to be able to still return values. In practice this would mean that you need to call Please let me know if you feel like I'm going in the wrong direction! |
This is possible yeah and it's the alternative I mentioned above too. The main downside is that this is quite difficult to bind in other languages because the ownership semantics of This is basically where I've always stopped short historically in thinking about what this would look like. I get stuck here not knowing how best to bind component values. I've yet to find a scheme that feels like it balances pros/cons effectively unfortunately :( My best thinking at this time is that we should have two methods of calling component functions, similar to the Or... something like that, I really don't have concrete ideas about how best to progress here. I really am worried though about this style of API though in that I don't know how to bind it in languages like Python safely. |
Would something like this be an acceptable API for bindings in other languages: wasmtime_component_val_t argument = wasmtime_component_valrecord_new(2);
assert(argument.kind == WASMTIME_COMPONENT_RECORD);
assert(argument.of.record.len == 2);
wasm_name_new(&argument.of.record.ptr[0].name, "first")
argument.of.record.ptr[0].val.kind = WASMTIME_COMPONENT_U32;
argument.of.record.ptr[0].val.of.u32 = 1;
wasm_name_new(&argument.of.record.ptr[1].name, "second")
argument.of.record.ptr[1].val.kind = WASMTIME_COMPONENT_U32;
argument.of.record.ptr[1].val.of.u32 = 1;
wasmtime_component_val_t result = {};
wasmtime_component_func_call(func, &argument, 1, &result, 1);
wasmtime_component_val_delete(&argument);
wasmtime_component_val_delete(&result); This kind of API would allow the record entries to be both heap allocated (like this example) or stack allocated (not calling
I agree, that would be great in the long run. Why I leaned towards having the option for stack allocated values is that I thought it would make some cases simpler and faster, like: wasmtime_component_valrecord_entry_t entry = (wasmtime_component_valrecord_entry_t) {
.name.ptr = "namee",
.name.len = 5,
.val.kind = WASMTIME_COMPONENT_U8,
.val.of.u8 = 123,
};
wasmtime_component_func_call(
func,
&(wasmtime_component_val_t) {
.kind = WASMTIME_COMPONENT_RECORD,
.of.record.ptr = &entry,
.of.record.len = 1
},
1,
&return,
1
) (not sure if that's 100% correct C syntax, but the idea should remain) |
I think the trickier parts of memory management are going to come up during host-defined functions being inserted into a linker. For example when arguments are passed to the host, should the host or wasmtime deallocate them? When arguments are returned, does wasmtime need to deallocate them? The precise answer to this question ends up informing how difficult this is to bind in other languages. For other languages though I don't really have a concrete worry per se. Historically Perhaps though another radical alternative. This is something I've had rattling around in my head for awhile that I keep forgetting about and have also not fully fleshed out. What I'm imagining is something like this: typedef struct wasmtime_component_vals wasmtime_component_vals_t;
typedef struct wasmtime_component_call wasmtime_component_call_t;
wasmtime_component_call_t *call = wasmtime_component_func_call_start(store, &func);
wasmtime_component_vals_t *params = wasmtime_component_call_params(call);
// set the first parameter, in this case a `u32`
wasmtime_component_vals_set_u32(params, 3);
// set the second parameter, in this case a `string`
wasmtime_component_vals_set_string(params, &my_variable_typed_as_wasm_name_t);
// set the third parameter, in this case a `record point { x: u32, y: u32 }`
wasmtime_component_vals_set_record(params, /*nfields=*/ 2);
// set "x: 0"
wasmtime_component_vals_set_record_field(params, &x_as_wasm_name_t);
wasmtime_component_vals_set_u32(params, 0);
// set "y: 1"
wasmtime_component_vals_set_record_field(params, &y_as_wasm_name_t);
wasmtime_component_vals_set_u32(params, 1);
// dispatch the call
wasmtime_component_vals_t *results = wasmtime_component_call_finish(call);
// if the result is a u32
wasmtime_component_vals_get_u32(results, &ret);
// if the result is a string
wasmtime_component_vals_get_string(results, &something_with_wasm_name_t);
// etc .. The rough idea is that this is a much "chattier" C ABI boundary but is, in theory, much more flexible about where things are stored and how exactly the host represents things. The Actually implementing this would require new support on the Wasmtime side of things, which would arguably be a good thing as well. Whether or not this is a good idea I don't know, this definitely isn't a fully fleshed out idea. It would, however, remove the need for strict ownership around |
These are the semantics I've been thinking of:
Edit: Just to be clear (because the current naming of I feel like this is what I had initially, just that you need to clean up the arguments you pass to a function after calling it, which I feel should be fine, as return values would have had to been cleaned up anyway.
I see the idea, I'll need some time to properly think through it. I'll try to implement host functions now, to understand the problem better. How should testing of records be done? I've been doing using this guest component locally: record ccc {
value: u64,
multiplier: u64,
}
export cccc: func(a: ccc) -> ccc; fn cccc(a: Ccc) -> Ccc {
Ccc {
value: a.value * a.multiplier,
multiplier: a.multiplier,
}
} ... and then calling it and checking the return value. Is there any resonable way to test these kinds of values without a guest component written in a higher level language? |
Those ownership semantics make sense to me, but they're a bit tricky to bind in a higher level language like Python. For example Python will have some sort of
Yeah it's possible to use the text format of components, albeit it's a bit verbose. You can get a bit of a feel for the text format for components from this directory |
Wrote some tests for the complex types, tried to abstract away most of the creation. Followed the tests in What's your current thoughts on the design? Do you think its good enough to go forward? I think one case is not covered: in a host function, setting the return value to the argument value, as doing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks for being patient I was a bit too busy last week! Overall I think let's commit to this approach. This looks to be workable enough and if it's difficult to integrate into other languages we can tackle that then.
Thank you again very much for working on this!
Later we can provide implementations that take it as a value to avoid copying
b7f9dd6
to
e644e3e
Compare
Oops... thought I could rebase on top of main cleanly because I didn't get any conflicts, sorry. |
7467bd3
to
d9e81ef
Compare
Refactored to use
Feel free to bikeshed all the names, like: I also assume rest of the value types should come in a later PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to have wasmtime_component_val_copy
but I also think it's good to have intermediate helpers as well in case they're needed, so personally I'm ok having both the top-level copy method plus helpers for intermediate ones as needed.
Hmm, the doxygen output for the component files don't show up like it does for other files.. |
Odd! Not that I'm enough of a doxygen expert to know why... |
If this bounces again you can put |
Only contains primitive values and no docs as of now. Trying to gauge if this is the correct way to approach this.
Some things of note:
wasmtime_component_valunion_t::boolean
instead of::bool
because keyword.wasmtime_component_valunion_t::f32
andWASMTIME_COMPONENT_F32
instead offloat32
, should this be changed to be consistent?wasmtime_component_val_t
in the test, so I bumped to 20 only for tests. Is that fine? Could the whole project be bumped to 20?Also, is there any better way to test all values, other than copying the test for each kind of value?