Memory allocation patterns #4

Open
pepyakin opened this issue Apr 19, 2018 · 1 comment

Comments

@pepyakin

Organizing allocation in a wasm environment is actually a tough problem.

Let's assume you want to expose key-value storage in your host environment runtime. Keys and values can be arbitrary-length byte strings.

The question is how to return data from the get_storage function. The answer depends on how memory allocation is organized.

I can see two obvious ways to organize it:

Teach the runtime how to allocate and deallocate memory

For example: make the host runtime publish alloc/dealloc functions and implement The Single Allocator on the host side. Unfortunately, this approach isn't very flexible. I think the wasm module should choose the most appropriate allocator by itself (after all, its logic might allow something much simpler, like a bump allocator). This approach might also introduce non-determinism that depends on the exact allocator implementation.
The module also always has to keep in mind that it doesn't control the environment, and it should account for the host environment doing non-deterministic things.

The other solution would be to require wasm modules to publish alloc/dealloc functions. That might be better in terms of flexibility.
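As a rough illustration of that second variant, here is a minimal sketch of the alloc/dealloc pair a wasm module could export so the host can place data in guest memory. The names and the Vec-based backing are assumptions for the example, not a prescribed implementation.

```rust
use std::mem;

// Hypothetical exports a wasm module could publish; the host calls
// alloc to get guest memory, writes into it, and the guest later
// frees it with dealloc.
#[no_mangle]
pub extern "C" fn alloc(size: usize) -> *mut u8 {
    let mut buf: Vec<u8> = Vec::with_capacity(size);
    let ptr = buf.as_mut_ptr();
    mem::forget(buf); // leak the Vec; ownership passes to the caller
    ptr
}

#[no_mangle]
pub extern "C" fn dealloc(ptr: *mut u8, size: usize) {
    unsafe {
        // Reconstruct the Vec so it is dropped and the memory freed.
        let _ = Vec::from_raw_parts(ptr, 0, size);
    }
}

fn main() {
    let p = alloc(4);
    unsafe {
        for i in 0..4 {
            *p.add(i) = i as u8; // the "host" writes into guest memory
        }
        assert_eq!(*p.add(3), 3);
    }
    dealloc(p, 4);
    println!("ok");
}
```

Note that nothing here constrains the module to a particular allocator; a bump allocator behind the same two exports would satisfy the same contract.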

Allow allocation only on the wasm side

And make the host environment take buffers. This works well with simple things like:

extern "C" {
  fn get_args_len() -> usize;
  fn get_args_data(buf: *mut u8);
}

and actually works pretty well, but this model really tears apart when the size of the data buffer isn't known upfront and takes some effort to find out, as with our get_storage example. To learn the size of the value that will be returned, we have to go to the database and fetch the value, so this trick incurs extra overhead or unneeded complexity.
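To make the overhead concrete, here is a toy single-process simulation of that two-call pattern (the Host struct, method names, and lookup counter are all invented for the example). A naive host ends up hitting the store once for the length and again for the data:

```rust
use std::collections::HashMap;

// Toy stand-in for the host side of a len-then-data interface.
// `lookups` counts how many times the backing store is consulted.
struct Host {
    store: HashMap<Vec<u8>, Vec<u8>>,
    lookups: usize,
}

impl Host {
    fn get_storage_len(&mut self, key: &[u8]) -> usize {
        self.lookups += 1; // first database hit, just for the size
        self.store.get(key).map_or(0, |v| v.len())
    }

    fn get_storage_data(&mut self, key: &[u8], buf: &mut [u8]) {
        self.lookups += 1; // second hit for the same value
        if let Some(v) = self.store.get(key) {
            buf.copy_from_slice(v);
        }
    }
}

fn main() {
    let mut host = Host { store: HashMap::new(), lookups: 0 };
    host.store.insert(b"answer".to_vec(), b"42".to_vec());

    // The guest first asks for the length, then passes a buffer.
    let len = host.get_storage_len(b"answer");
    let mut buf = vec![0u8; len];
    host.get_storage_data(b"answer", &mut buf);

    assert_eq!(buf, b"42");
    assert_eq!(host.lookups, 2); // the double-fetch overhead described above
    println!("ok");
}
```

A real host could cache the value between the two calls, but that is exactly the "unneeded complexity" the text refers to.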

Third approach

In one of my experiments I discovered another approach that might combine the best of the previous two.

The idea is to provide a callback to allocate the data. For example:

extern "C" {
    fn get_storage(
        key_ptr: *const u8,
        key_len: usize,
        cb: extern "C" fn(*mut Void, usize) -> *mut u8,
        cb_data: *mut Void,
    );
}

When called, this function fetches the value from storage by key. After the fetch, it calls the provided callback cb with the size of the value (the *mut Void/cb_data pair is for implementing closures). The callback should allocate enough space to fit the value and return a pointer to that space. Then get_storage writes the value at the address provided by the callback cb and returns control to the wasm module.

This trick doesn't rely on any particular memory allocation mechanism. In certain cases, users can even allocate the space directly on the stack. Or use scratch space. Or whatever! And at the same time, the runtime doesn't do anything behind your back.
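The flow above can be sketched as a toy in-process simulation, with the raw-pointer callback replaced by a safe Rust closure for readability (the HashMap-backed store and the signatures are assumptions of the sketch, not the real ABI). The point is that the host fetches the value exactly once and the guest decides where the bytes land:

```rust
use std::collections::HashMap;

// Toy simulation of the callback-based get_storage: the host looks the
// value up once, asks the guest-supplied callback for a destination
// buffer of exactly that size, and copies the bytes there.
fn get_storage(
    store: &HashMap<Vec<u8>, Vec<u8>>,
    key: &[u8],
    mut alloc_cb: impl FnMut(usize) -> *mut u8,
) {
    if let Some(value) = store.get(key) {
        let dst = alloc_cb(value.len()); // guest decides where the bytes go
        unsafe {
            std::ptr::copy_nonoverlapping(value.as_ptr(), dst, value.len());
        }
    }
}

fn main() {
    let mut store = HashMap::new();
    store.insert(b"answer".to_vec(), b"42".to_vec());

    // The guest backs the allocation however it likes; here, a Vec.
    let mut out: Vec<u8> = Vec::new();
    get_storage(&store, b"answer", |len| {
        out.resize(len, 0);
        out.as_mut_ptr()
    });

    assert_eq!(out, b"42");
    println!("ok");
}
```

Swapping the Vec for a stack array or a reused scratch buffer changes only the closure body, which is the flexibility being claimed.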

I think that this trick might be useful in your efforts.

You can check full code here.

Cheers!

@losfair
Contributor

losfair commented May 3, 2018

I'd prefer a method similar to how POSIX read/write syscalls work, as explained in ns/resource.

For example, to fetch the value associated with a key from a key-value storage, we could do the following:

a) Initialize request:

fn get_storage(
    storage_handle: i32,
    key_ptr: *const u8,
    key_len: usize
) -> i32;

When get_storage is called, it doesn't actually do the query, but instead creates a Resource containing all the necessary information to make one and returns its handle.

b) Fetch result:

Just call resource_read on the returned handle, and the runtime would fetch the corresponding data from the kv store.

If the data fits in the buffer passed to resource_read, then it will be directly written there; otherwise, a buffer is created to hold the remaining data.

If the client application detects that there may be more data, it calls resource_read again, and the runtime feeds the data from the created buffer directly back, moving the buffer pointer forward and doing no more kvstore queries.
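This read-style flow can be sketched roughly as follows; the Resource struct and its internals are invented for illustration (only the resource_read name comes from the comment). The key property is that the kv store is queried once, on the first read, and later reads drain a buffered remainder:

```rust
// Toy simulation of a POSIX-read-style resource: the first read does
// the kv-store query; leftover bytes are buffered so subsequent reads
// never touch the store again.
struct Resource {
    pending: Option<Vec<u8>>, // query result not yet fully delivered
    queried: bool,
    value: Vec<u8>, // stands in for the kv store lookup result
}

impl Resource {
    fn resource_read(&mut self, buf: &mut [u8]) -> usize {
        if !self.queried {
            self.queried = true; // the single kv-store query happens here
            self.pending = Some(self.value.clone());
        }
        let pending = self.pending.as_mut().unwrap();
        let n = pending.len().min(buf.len());
        buf[..n].copy_from_slice(&pending[..n]);
        pending.drain(..n); // advance past the delivered bytes
        n
    }
}

fn main() {
    let mut res = Resource {
        pending: None,
        queried: false,
        value: b"hello world".to_vec(),
    };

    // Client loop: keep reading until a short (zero) read signals the end.
    let mut out = Vec::new();
    let mut chunk = [0u8; 4];
    loop {
        let n = res.resource_read(&mut chunk);
        if n == 0 {
            break;
        }
        out.extend_from_slice(&chunk[..n]);
    }

    assert_eq!(out, b"hello world");
    println!("ok");
}
```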
