Unable to run Go code compiled for WASI #521

Open
yasoob opened this issue Feb 22, 2024 · 10 comments

Comments


yasoob commented Feb 22, 2024

Hi @tessi !

I am fairly new to WebAssembly and am trying to get Go code compiled for WASI to run from Elixir using Wasmex. Here is a super simple Go program:

package main

//export greet
func greet() int32 {
	return 5
}

func main() {}

I can compile this with TinyGo, targeting WASI, like this:

tinygo build -target=wasi -o tiny.wasm main.go

Now I can run this using Wasmex and call greet like this:

bytes = File.read!("./native/tiny.wasm")
{:ok, pid} = Wasmex.start_link(%{bytes: bytes, wasi: true}) 
{:ok, m} = Wasmex.module(pid)

Wasmex.call_function(pid, "greet", [])

This prints 5 as expected.

However, my next plan is to pass complex arguments to a Go program and for that I will need to write to memory. My understanding so far is that I will need to create a Wasmex.Instance using something like this:

{:ok, store} = Wasmex.Store.new_wasi(%Wasmex.Wasi.WasiOptions{})
{:ok, instance} = Wasmex.Instance.new(store, m, %{})

However, this fails with the following error:

iex(99)> {:ok, instance} = Wasmex.Instance.new(store, m, %{})
** (MatchError) no match of right hand side value: {:error, "incompatible import type for `wasi_snapshot_preview1::fd_write`"}
    (stdlib 5.2) erl_eval.erl:498: :erl_eval.expr/6
    iex:99: (file)

These are the exports and imports of my golang WASI binary:

iex(103)> Wasmex.Module.exports(m)
%{
  "_start" => {:fn, [], []},
  "asyncify_get_state" => {:fn, [], [:i32]},
  "asyncify_start_rewind" => {:fn, [:i32], []},
  "asyncify_start_unwind" => {:fn, [:i32], []},
  "asyncify_stop_rewind" => {:fn, [], []},
  "asyncify_stop_unwind" => {:fn, [], []},
  "calloc" => {:fn, [:i32, :i32], [:i32]},
  "free" => {:fn, [:i32], []},
  "greet" => {:fn, [], [:i32]},
  "malloc" => {:fn, [:i32], [:i32]},
  "memory" => {:memory, %{minimum: 2, shared: false, memory64: false}},
  "realloc" => {:fn, [:i32, :i32], [:i32]}
}
iex(104)> Wasmex.Module.imports(m)
%{
  "wasi_snapshot_preview1" => %{
    "fd_write" => {:fn, [:i32, :i32, :i32, :i32], [:i32]}
  }
}

I am not sure why it thinks the types are incompatible for fd_write. Can you please help with this? I have been trying to resolve this on my own for a few days on and off and haven't gotten anywhere. Going to do a last ditch effort to make it work with your help before giving up.

Let me know if you want me to provide any more information.

Owner

tessi commented Feb 22, 2024

@yasoob I'll try to help :) for the second example (the one that fails with fd_write) do you use the same go binary? if not, can you maybe share your .go or .wasm file for me to replicate the issue?

Owner

tessi commented Feb 22, 2024

But assuming you didn't change the binary, try following this example and please tell me what happens :)

I think you don't need to instantiate the instance or the module yourself. You may get away with just using another variant of Wasmex.start_link.

This is assuming the following go file:

package main

import "fmt"

func main() {
	var input string
	fmt.Scanln(&input)
	result := fmt.Sprintf("hello %s", input)
	fmt.Println(result)
}

I expanded it a little to demonstrate one way you could get inputs into and outputs out of your program, using stdin/stdout as an example. (And please forgive me if I'm writing Go code that's insulting to any sane Go dev - I've never written any Go before.)

binary = File.read!("./native/tiny.wasm")
{:ok, stdout_pipe} = Wasmex.Pipe.new()
{:ok, stdin_pipe} = Wasmex.Pipe.new()
wasi = %Wasmex.Wasi.WasiOptions{args: [], stdout: stdout_pipe, stdin: stdin_pipe}
{:ok, pid} = Wasmex.start_link(%{bytes: binary, wasi: wasi}) 

Wasmex.Pipe.write(stdin_pipe, "yasoob\n")
Wasmex.Pipe.seek(stdin_pipe, 0)
{:ok, []} = Wasmex.call_function(pid, :_start, [])
Wasmex.Pipe.seek(stdout_pipe, 0)
Wasmex.Pipe.read(stdout_pipe) # "hello yasoob\n"

Author

yasoob commented Feb 23, 2024

Thank you so much for your response! Let me start by saying that everything worked out using the stdin and stdout pipes. Here is what I ended up using on the go side:

func test() {
	// Read input from stdin
	data, _ := io.ReadAll(os.Stdin)
	// Send data on stdout
	fmt.Println("👋 Data from Elixir:", string(data), "🌍")
}
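
A minimal sketch of the snippet above as a complete program, assuming the standard `fmt`, `io`, and `os` packages are available for the WASI target. The `reply` helper is a hypothetical name, split out only so the formatting can be checked without a pipe.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// reply builds the response for the bytes received on stdin.
// Hypothetical helper, not part of the original snippet.
func reply(data []byte) string {
	return "👋 Data from Elixir: " + string(data) + " 🌍"
}

func main() {
	// Read everything the host wrote to the stdin pipe.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
		os.Exit(1)
	}
	// Write the response to stdout, where the host can read it back.
	fmt.Println(reply(data))
}
```

Unlike the original snippet, this version checks the error from `io.ReadAll` instead of discarding it.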

My problem is solved and I might continue using the same method. If someone else stumbles on this: I also faced another challenge (unrelated to Wasmex) that took me a while to figure out. By default, Go runs a package's init function whenever the package is imported, and that function performs the package initialization. I had such a function in the Go package I was importing in main, but it wasn't running, and it took me a while to track this down. I ended up moving the init logic into the first function that I call from the package. Maybe this saves someone some debugging time...
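
One possible way to move init logic into the first call is `sync.Once`, sketched below. The names `setup`, `Greet`, and `prefix` are hypothetical, and the sketch assumes `sync.Once` is available in the toolchain used; it is not necessarily how the original package was changed.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	setupOnce sync.Once
	prefix    string // hypothetical state that init() would normally fill in
)

// setup holds the logic that would otherwise live in init().
func setup() {
	prefix = "hello "
}

// Greet is the package entry point. It makes sure setup has run exactly
// once before doing any work, no matter how often it is called.
func Greet(name string) string {
	setupOnce.Do(setup)
	return prefix + name
}

func main() {
	fmt.Println(Greet("yasoob"))
	fmt.Println(Greet("tessi"))
}
```

The design choice here is that every exported entry point calls `setupOnce.Do(setup)` first, so the result no longer depends on whether `init` ran.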

However, I am still curious if it is possible to write to memory rather than using stdin and stdout pipes. And to answer your question:

for the second example (the one that fails with fd_write) do you use the same go binary?

It is the exact same binary. 0 changes. I am able to call functions directly using Wasmex but it fails as soon as I create an Instance like I shared in the initial report. It seems like the current method you shared is just an alternative and it should theoretically be possible to write to go memory. Do you have anything else to share that I can try maybe? Also this is not super high priority so it is ok if you have something else going on.

I do have 2 additional followup questions if you have time:

  1. Are you aware of any benchmarks regarding the difference between stdin/stdout and direct copy to memory for WASI? I am currently exporting data out of the WASI binary using JSON and plan on parsing the JSON on the Elixir side to make use of it. Is this the best way or do you have a different suggestion?

  2. My WASI binary is stateless. I will need to call the binary every time I get a request from the client. The WASI code will do some computation on the input and return the output to Elixir, which will then return it to the client. What is the preferred way to orchestrate all of this? Should I keep a long-running GenServer using Wasmex.start_link and use the same pid to do all the calculations whenever I get a new request, or is it preferred to start/stop a new Wasmex GenServer for each new client request? Is there anything I need to be careful about regarding memory usage/leaks in either scenario?
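
For illustration, the JSON exchange described in question 1 could look roughly like this on the Go side. `Request`, `Response`, and `handle` are hypothetical names, and the sketch uses standard Go's `encoding/json`; whether a given TinyGo version supports it fully is an assumption to verify.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request and Response are hypothetical message shapes.
type Request struct {
	Name string `json:"name"`
}

type Response struct {
	Greeting string `json:"greeting"`
}

// handle decodes one JSON request and returns the encoded JSON response.
func handle(input []byte) ([]byte, error) {
	var req Request
	if err := json.Unmarshal(input, &req); err != nil {
		return nil, fmt.Errorf("decoding request: %w", err)
	}
	return json.Marshal(Response{Greeting: "hello " + req.Name})
}

func main() {
	// In the guest, the input would come from io.ReadAll(os.Stdin) instead
	// of a literal, and the output would be read by the host from stdout.
	out, err := handle([]byte(`{"name":"yasoob"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping the decode/encode step in one function makes it independent of the transport, so the same `handle` could sit behind pipes or behind a memory-based call.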

Thanks once again for such a wonderful library! I didn't initially think that I would be able to successfully integrate Go code with Elixir in this beautiful way. Looking forward to your enlightening response :)

Owner

tessi commented Feb 23, 2024

Hey @yasoob - love to hear it works!

Looking forward to your enlightening response :)

haha, let's find out how enlightening it will be 😅

I'm not at my keyboard right now, but still wanted to give you a response swiftly. So I recorded a little video with my thoughts on the topic. Hope it works, if not, I'm happy to write things down later

https://www.loom.com/share/f4a606a4dd72495999ff660716eb12c2

Author

yasoob commented Feb 23, 2024

It was so nice to put a face to your name, @tessi! I am also in a similar situation and am exploring all of this in my free time during my paternity leave, so I can totally relate to changing priorities 😄

Your responses made a lot of sense. For now I will keep things very simple and keep using pipes, and will think about implementing parallel computation in the future using multiple GenServers if/when the need arises. It was a nice exercise to think through the available options. I am currently writing an article on WASI and how to interface with Go code using Wasmex. Will share it once it is complete.

Just out of curiosity, are you aware of any projects making production use of wasmex?

Author

yasoob commented Feb 23, 2024

Update: A small tribute to wasmex in the form of an article https://yasoob.me/posts/running-go-web-assembly-from-elixir/

Owner

tessi commented Feb 23, 2024

uhhh, nice! I will have a proper look after work on your article. Thanks for mentioning me in there! ❤️

regarding companies using wasmex in production, there are a few I know of who contacted me privately and some more who are strongly considering it. The number of companies that I'm allowed to share is lower. I know @myobie (hey Nathan, hope you're doing great! 👋 ) uses wasmex (or used to at least) and cosmonic/wasmcloud uses it in their open source wasm framework (among other wasm backends).

Owner

tessi commented Feb 24, 2024

Hey @yasoob I have some good news. After some learning of golang, tinygo, and a lot of googling, I have a working solution which writes to WASM memory using a memory allocator. I'm dividing the solution into several parts - guess I should start blogging too! :D

Implementing a memory allocator

As I said above, we should really get a chunk of reserved memory from our Go WASM program instead of just writing to random memory locations. If we don't do that, we risk accidentally overwriting existing data (like constants, the Go stack, or the heap). Instead, we call malloc(size) to get a pointer to a block of reserved memory of a certain size and later free(ptr) it again.

Turns out that tinygo already gives us malloc and free as WASM exported functions. It's actually already in your first comment in the list of exports of your module. Now, that was surprisingly easy - memory allocator implementation solved! ✅

Passing strings to the golang WASM guest

Next comes writing a Go function that accepts a (pointer, length) pair of i32 values and turns it into a proper string.
https://github.com/tetratelabs/wazero/blob/main/examples/allocation/tinygo/testdata/greet.go is a real gem here from which I liberally borrowed some code 👇

//export greet
func greet(ptr, size uint32) (ptrSize uint64) {
	name := ptrToString(ptr, size)
	result := fmt.Sprintf("hello %s", name)
	// .. we'll come to the rest in a second
}

// ptrToString returns a string from WebAssembly compatible numeric types
// representing its pointer and length.
func ptrToString(ptr uint32, size uint32) string {
	return unsafe.String((*byte)(unsafe.Pointer(uintptr(ptr))), size)
}

As far as I understand Go here, we must be careful with ownership of the allocated memory. When we call the exported malloc function from Elixir, we receive a pointer to a chunk of memory. We - in Elixir - own that memory chunk now and are responsible for freeing it later. BUT by passing it to the greet function, we transfer ownership of this memory chunk to the Go runtime. The Go garbage collector will now deallocate the memory for us. Nice!

Passing strings to golang wasm-guests solved! ✅
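
To see what ptrToString does with the bytes, here is a small sketch that runs as native Go (1.20 or later) rather than inside Wasm. It takes a byte slice instead of a uint32 pointer, because native pointers are not 32 bits wide; `bytesToString` is a hypothetical name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString views buf as a string without copying, the same way
// ptrToString does for a (pointer, length) pair inside the Wasm guest.
// The caller must not modify or release buf while the string is in use.
func bytesToString(buf []byte) string {
	if len(buf) == 0 {
		return ""
	}
	return unsafe.String(&buf[0], len(buf))
}

func main() {
	buf := []byte("yasoob") // stands in for the memory the host wrote to
	name := bytesToString(buf)
	fmt.Printf("hello %s\n", name)

	// Reports whether the string and the buffer share one backing array,
	// i.e. whether the conversion avoided a copy.
	fmt.Println(unsafe.StringData(name) == &buf[0])
}
```

Because no copy is made, the buffer has to stay valid for as long as the resulting string is used, which is why ownership of the chunk matters.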

Returning a string from the golang WASM guest towards our Elixir host code

Now, we need to find a way to pass back a string towards our Elixir host. We will return a new memory pointer, length pair from go. Thinking about memory ownership again, we create and allocate the string from within go, but transfer ownership of this memory chunk to the Elixir host. Thus, we must free() the returned pointer in Elixir to avoid memory leaks.

The next problem is that our exported Go function can only return one value. But since WASM pointers and length values are both 32 bits wide, we can concatenate them into one 64-bit integer and return that instead.

//export greet
func greet(ptr, size uint32) (ptrSize uint64) {
	// ...
	ptr, size = stringToLeakedPtr(result)
	return (uint64(ptr) << uint64(32)) | uint64(size)
}

// stringToLeakedPtr returns a pointer and size pair for the given string in a way
// compatible with WebAssembly numeric types.
// The pointer is not automatically managed by TinyGo hence it must be freed by the host.
func stringToLeakedPtr(s string) (uint32, uint32) {
	size := C.ulong(len(s))
	ptr := unsafe.Pointer(C.malloc(size))
	copy(unsafe.Slice((*byte)(ptr), size), s)
	return uint32(uintptr(ptr)), uint32(size)
}

Returning a string from a golang WASM-guest function solved! ✅
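
The packing scheme can be checked in isolation, outside of Wasm. This is a sketch with hypothetical helper names `pack` and `unpack`; the pointer value in `main` is made up for the example.

```go
package main

import "fmt"

// pack puts ptr into the upper 32 bits and size into the lower 32 bits,
// matching the return expression in greet.
func pack(ptr, size uint32) uint64 {
	return uint64(ptr)<<32 | uint64(size)
}

// unpack reverses pack, mirroring what the host has to do with the result.
func unpack(v uint64) (ptr, size uint32) {
	return uint32(v >> 32), uint32(v)
}

func main() {
	v := pack(0x10020, 12) // made-up pointer, and the length of "hello yasoob"
	ptr, size := unpack(v)
	fmt.Printf("packed=%#x ptr=%#x size=%d\n", v, ptr, size)
}
```

Converting to `uint32` in `unpack` truncates to the low 32 bits, which has the same effect as masking with `(1 << 32) - 1`.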

Calling our golang WASM function from Elixir

Now comes the fun part where we can finally get some Elixir code typed out! We start by instantiating our golang WASM binary with WASI support. This time, we make sure to create our store, module, and memory explicitly for later use:

binary = File.read!("./native/tiny.wasm")
{:ok, store} = Wasmex.Store.new_wasi(%Wasmex.Wasi.WasiOptions{})
{:ok, module} = Wasmex.Module.compile(store, binary)
{:ok, pid} = Wasmex.start_link(%{store: store, module: module}) 
{:ok, memory} = Wasmex.memory(pid)

Nice! With that boilerplate code out of the way, let's allocate some WASM memory for the string we want to pass and write it to WASM memory at that pointer location!

name = "yasoob"
{:ok, [ptr]} = Wasmex.call_function(pid, :malloc, [byte_size(name)])
Wasmex.Memory.write_binary(store, memory, ptr, name)

Calling our greet function is easy now: we have our ptr freshly allocated and populated, and we know our string's length. Remember: greet() returns a 64-bit integer with the result string's pointer and length crammed into one value.

{:ok, [result_ptr_length]} = Wasmex.call_function(pid, :greet, [ptr, byte_size(name)])

That 64-bit value needs to be de-mangled into a 32-bit result_ptr and a 32-bit result_len value:

result_ptr = Bitwise.>>>(result_ptr_length, 32)
result_len = Bitwise.&&&(Bitwise.<<<(1, 32) - 1, result_ptr_length)

It was some actual fun doing this! Haven't pushed bits around in a while in my $day_job. So that was quite refreshing :)

Now, knowing the result string's pointer and length, we can read the value from WASM memory and finally free() it.

Wasmex.Memory.read_string(store, memory, result_ptr, result_len) # => "hello yasoob"
{:ok, []} = Wasmex.call_function(pid, :free, [result_ptr])

Reading a string from our golang WASM guest solved! ✅

Code Listing

For easier copy&paste, I'll list the complete Go and Elixir code again. Hope it works for you :) If so, I guess I have a blog post waiting to be written :)

The go program

package main

// #include <stdlib.h>
import "C"

import (
	"fmt"
	"unsafe"
)

// shameless copy of ptrToString and stringToLeakedPtr from
// https://github.com/tetratelabs/wazero/blob/main/examples/allocation/tinygo/testdata/greet.go

//export greet
func greet(ptr, size uint32) (ptrSize uint64) {
	name := ptrToString(ptr, size)
	result := fmt.Sprintf("hello %s", name)
	ptr, size = stringToLeakedPtr(result)
	return (uint64(ptr) << uint64(32)) | uint64(size)
}

// ptrToString returns a string from WebAssembly compatible numeric types
// representing its pointer and length.
func ptrToString(ptr uint32, size uint32) string {
	return unsafe.String((*byte)(unsafe.Pointer(uintptr(ptr))), size)
}

// stringToLeakedPtr returns a pointer and size pair for the given string in a way
// compatible with WebAssembly numeric types.
// The pointer is not automatically managed by TinyGo hence it must be freed by the host.
func stringToLeakedPtr(s string) (uint32, uint32) {
	size := C.ulong(len(s))
	ptr := unsafe.Pointer(C.malloc(size))
	copy(unsafe.Slice((*byte)(ptr), size), s)
	return uint32(uintptr(ptr)), uint32(size)
}

func main() {
	// This is a placeholder to make sure the main package is not empty.
	// The actual code is in the `greet` function.
}

The Elixir program

binary = File.read!("./native/tiny.wasm")
{:ok, store} = Wasmex.Store.new_wasi(%Wasmex.Wasi.WasiOptions{})
{:ok, module} = Wasmex.Module.compile(store, binary)
{:ok, pid} = Wasmex.start_link(%{store: store, module: module}) 
{:ok, memory} = Wasmex.memory(pid)

name = "yasoob"
{:ok, [ptr]} = Wasmex.call_function(pid, :malloc, [byte_size(name)])
Wasmex.Memory.write_binary(store, memory, ptr, name)
{:ok, [result_ptr_length]} = Wasmex.call_function(pid, :greet, [ptr, byte_size(name)])

result_ptr = Bitwise.>>>(result_ptr_length, 32)
result_len = Bitwise.&&&(Bitwise.<<<(1, 32) - 1, result_ptr_length)
Wasmex.Memory.read_string(store, memory, result_ptr, result_len) # -> "hello yasoob"
{:ok, []} = Wasmex.call_function(pid, :free, [result_ptr])


ndrean commented Dec 13, 2024

@tessi Thanks for all this work!

Below is a small, similar example that compiles Zig to Wasm.

What this snippet does: the WebAssembly machine capitalises a string (just ASCII for now).

For the data exchange, I believe there are two ways:

  1. let Zig allocate the memory: export a function that returns a pointer/index so that Wasmex can write or read at this index (knowing the length),
  2. assign some shared memory: Wasmex writes at some index and passes that index to the container through a Zig function call (as an argument).

It is not clear to me whether there is a preferred way to manage memory between the container and the host.

Furthermore, I see the Pipe concept: I quote "a memory buffer that can be used in exchange for a Wasm file".

First version (Zig manages and shares the memory)

Exports:

  • a function "memalloc" that takes the length of the input and returns the memory address
  • "to_uppercase1", which takes the input memory address and a length; Zig mutates the chars in place
  • "memfree", which releases the allocated buffer again

Zig assigning memory, exported to Wasmex
const std = @import("std");
const allocator = std.heap.wasm_allocator;

export fn memalloc(len: usize) ?[*]u8 {
    return if (allocator.alloc(u8, len)) |slice|
        slice.ptr
    else |_|
        null;
}

export fn to_uppercase1(ptr: [*]u8, len: usize) void {
    const input = ptr[0..len];
    for (input) |*char| {
        char.* = std.ascii.toUpper(char.*);
    }
}

export fn memfree(ptr: [*]u8, len: usize) void {
    const slice = ptr[0..len];
    allocator.free(slice);
}


Compile it to WASM.
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.resolveTargetQuery(.{
        .cpu_arch = .wasm32,
        .os_tag = .wasi,
    });
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "uppercase",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });
    exe.entry = .disabled;
    exe.rdynamic = true;
    b.installArtifact(exe);
}

  • Wasmex runs "memalloc" so that Zig allocates memory of a given length and returns the index to Wasmex
  • Wasmex writes the input at this index
  • Wasmex calls "to_uppercase1", passing where the data is and how long it is; Zig mutates the memory in place
  • Wasmex reads the result at this same location (or at the location returned by Zig, if the function returns a new one)
  • the host asks Zig to clean up with "memfree"
Elixir-Wasmex using Zig memory assignment
def v1(input_string) do
  {:ok, pid} =
    Wasmex.start_link(%{bytes: File.read!("../zig/zig-out/bin/uppercase.wasm"), wasi: true})

  {:ok, store} = Wasmex.store(pid)
  {:ok, memory} = Wasmex.memory(pid)

  input_length = byte_size(input_string)

  # Ask Zig to allocate the buffer, then write the input into it.
  {:ok, [input_ptr]} = Wasmex.call_function(pid, "memalloc", [input_length])
  :ok = Wasmex.Memory.write_binary(store, memory, input_ptr, input_string)

  # Zig uppercases the buffer in place; read it back before freeing it.
  Wasmex.call_function(pid, "to_uppercase1", [input_ptr, input_length])
  result = Wasmex.Memory.read_string(store, memory, input_ptr, input_length)
  Wasmex.call_function(pid, "memfree", [input_ptr, input_length])
  result
end
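
The allocate, write, call, read, free sequence of the first version can be simulated without any Wasm runtime. The sketch below is only a model: `guest` is a hypothetical stand-in for the instance, with a byte slice as linear memory and a trivial bump allocator that is much simpler than a real one.

```go
package main

import "fmt"

// guest is a hypothetical stand-in for the Wasm instance: a linear memory
// plus a trivial bump allocator.
type guest struct {
	mem  []byte
	next uint32 // first free offset
}

// memalloc reserves n bytes and returns their offset, like the Zig export.
func (g *guest) memalloc(n uint32) uint32 {
	ptr := g.next
	g.next += n
	return ptr
}

// toUppercase mutates mem[ptr:ptr+n] in place (ASCII only).
func (g *guest) toUppercase(ptr, n uint32) {
	for i := ptr; i < ptr+n; i++ {
		if c := g.mem[i]; c >= 'a' && c <= 'z' {
			g.mem[i] = c - 'a' + 'A'
		}
	}
}

// roundTrip plays the host side of the exchange.
func roundTrip(g *guest, input string) string {
	n := uint32(len(input))
	ptr := g.memalloc(n)     // 1. guest allocates
	copy(g.mem[ptr:], input) // 2. host writes the input
	g.toUppercase(ptr, n)    // 3. guest mutates in place
	result := string(g.mem[ptr : ptr+n]) // 4. host reads the result
	// 5. a real host would now call memfree(ptr, n); the bump allocator
	// in this model has nothing to release.
	return result
}

func main() {
	g := &guest{mem: make([]byte, 65536), next: 1024}
	fmt.Println(roundTrip(g, "hello wasmex"))
}
```

The host never chooses an offset itself in this flow; it only uses the one the guest handed out.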

Second version: Wasmex sets and shares the container memory

Zig mutates memory in place
export fn to_uppercase2(ptr: [*]u8, input_len: usize) void {
    const input = ptr[0..input_len];
    for (input) |*char| {
        char.* = @import("std").ascii.toUpper(char.*);
    }
}

For the Elixir/Wasmex part, I follow the "string" example.

I "set an index", although not quite sure how you manage this.

Elixir-Wasmex sharing memory
def v2(input_string) do
  {:ok, pid} =
    Wasmex.start_link(%{wasi: true, bytes: File.read!("../zig/zig-out/bin/exe2uc.wasm")})
    |> dbg()

  {:ok, store} = Wasmex.store(pid)
  {:ok, memory} = Wasmex.memory(pid)

  Wasmex.Memory.write_binary(store, memory, 0, input_string)
  Wasmex.call_function(pid, "to_uppercase2", [0, byte_size(input_string)])
  Wasmex.Memory.read_binary(store, memory, 0, byte_size(input_string))
end

Both work, but the second way looks a bit stupid by assigning some index.

However, I have questions. If you could point me towards some documentation, would be great.

In the second case, how do you free the memory used by the host/Wasmex in the container heap? The first way-to-do seems more sound to me, although pretty verbose.

Also, I have a case where I want to stream strings or repeatedly use the container. If I Wasmex.start_link, then async_stream some function to the instance, it fails:

{:error, "Could not unlock store_or_caller resource: try_lock failed because the operation would block"}

The workaround is to compile a new instance each time. Can I improve this and "cache" the compiled module, hoping to only have to instantiate it with ?

Following the documentation, I try:

{:ok, store} = Wasmex.Store.new()
{:ok, module} = Wasmex.Module.compile(store, bin) |> dbg()
Wasmex.Instance.new(store, module, %{})

but this fails as I don't know what to pass as the import map.

 "unknown import: `wasi_snapshot_preview1::fd_write` has not been defined"}

So any advice on how one can achieve this "streaming" would be welcome.

Thanks for reading

Owner

tessi commented Dec 27, 2024

Hey @ndrean thanks for your detailed and nicely written question!

From what you shared here, Option 1 sounds like the preferred solution if you want to go with shared memory in a real application (i.e. outside of toy examples). The reason is that in Option 2, Zig could decide to allocate memory at a location that Wasmex already uses (since Zig has no idea we are using this part of the memory already).
Caveat: I have no clue about Zig - maybe I'm wrong - but I'm extrapolating from other languages I know :)
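
The overlap risk of Option 2 can be shown with a toy model. The `bump` allocator below is hypothetical and far simpler than a real allocator, so this sketch only illustrates the failure mode and does not describe what Zig actually does.

```go
package main

import "fmt"

// bump is a hypothetical allocator that hands out offsets into mem in order.
type bump struct {
	mem  []byte
	next int
}

// alloc reserves n bytes and returns the offset of the reserved block.
func (b *bump) alloc(n int) int {
	ptr := b.next
	b.next += n
	return ptr
}

// clobbered reports whether a guest allocation overwrote data that the host
// placed at a hard-coded offset without going through the allocator.
func clobbered() bool {
	b := &bump{mem: make([]byte, 64)}
	copy(b.mem[0:], "host data") // host picks offset 0 on its own

	p := b.alloc(9) // the allocator is unaware and hands out offset 0 too
	copy(b.mem[p:], "guestdata")

	return string(b.mem[0:9]) != "host data"
}

func main() {
	fmt.Println("host data overwritten:", clobbered())
}
```

Going through the guest's allocator, as in Option 1, avoids this because every region in use is known to the allocator.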

I agree, though, that Option 1 is verbose. Fortunately, we have two other options:

Option 3

Use pipes - as you already hinted at. You could open a WASI instance, set stdin + stdout pipes, and communicate through those:

  1. write to stdin
  2. call Wasm function that does something on it and writes it to stdout
  3. read stdout

Have a look at this test in our WASI tests for an example.

Option 4

We just added experimental (and yet undocumented 😬 ) support for the Wasm component model. Wasm components are way more clever than regular old-school WASM and know about strings (and other more advanced types like records or enums). Using them, you wouldn't need to handle any memory at all - you just pass the string and get another string back!

Have a look at this test here for an example where we implemented a "greeter" (hello world) example.

We have an experimental macro which makes set-up for these component functions very easy, see this example (also taken from our tests).

Now, I don't know how and if zig already supports the Wasm component model - but if it does, this is the future IMO :)

Let me know if I could help you and I hope my answer is not coming too late! 💜
