-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Request: Rust tokio async interoperability through pollOneOff #1891
Comments
I definitely don't have enough Rust knowledge to triage this, but for the benefit of others and yourself, do you have code that you can share that shows the issue? You patched wazero, where's this patched wazero? I don't mean snippets, but a minimal, complete and verifiable example. Without that, it seems very unlikely that anyone will be able to help you. |
Thanks for your timely response!
I simply patched it in my fork here: https://github.com/gaukas/wazero
Gladly. https://github.com/gaukas/wasmasync contains the requested examples, including a driver written in Go (using my fork of wazero), 4 examples written in Rust (asyncrs, asynctokio, asynctokio2, syncrs) and 1 example in WAT. |
Any general feedback would be really helpful as well, including my patch and advice on creating WebAssembly binary achieving the same goal. One reason I had to use Rust is that I could not figure out a proper way to do this in Go. It seems TinyGo's WASI target did not really like what I'm trying to do (recovering a |
Hi, thanks for the detailed reproducer. The error "Not supported" is likely not to be a real OS error but a WASI error, specifically, wazero's WASI implementation, which, in some cases, may return ENOTSUP. First of all, you can get further information about what's going on at the WASI/FS/Socket layer by adding these lines to your // Configure host logging.
ctx = context.WithValue(ctx, experimental.FunctionListenerFactoryKey{},
logging.NewHostLoggingListenerFactory(os.Stderr, logging.LogScopeFilesystem|logging.LogScopePoll|logging.LogScopeSock)) This will enable host logging and dump to the console some further info about which syscalls are failing. In this case, however, you can find the offender in wazero/imports/wasi_snapshot_preview1/poll.go Lines 145 to 156 in 009ee70
Since you are already getting your hands dirty with the code base, you could arbitrarily return success to a writing subscription if the underlying fd is already in nonblocking mode. Notice this is somewhat correct for real disk files, but not quite correct for sockets, that's one of the reasons we are not supporting it at the moment: it requires significant plumbing on the codebase to support properly, and we had no evidence of real-world usage so far. case wasip1.EventTypeFdWrite:
fd := int32(le.Uint32(argBuf))
if fd < 0 {
return sys.EBADF
}
if file, ok := fsc.LookupFile(fd); !ok {
evt.errno = wasip1.ErrnoBadf
writeEvent(outBuf[outOffset:], evt)
nevents++
} else if file.File.IsNonblock() {
writeEvent(outBuf[outOffset:], evt)
nevents++
} else {
evt.errno = wasip1.ErrnoNotsup
writeEvent(outBuf[outOffset:], evt)
nevents++
} Eventually, however, on my macOS system, wazero/internal/sysfs/sock_unix.go Lines 106 to 115 in 009ee70
As you can see from the logs as well:
This also happens with I also cannot exclude something is going on at the use-site in Go, e.g. the socket has been closed on that end and the wasm side is now accessing an invalid descriptor: https://github.com/gaukas/wasmasync/blob/9c89b62a92059913607ee398836de361faaa52b4/main.go#L162-L168 maybe @ncruces or @achille-roussel have ideas :) |
Thank you for such a detailed response and walkthrough!
Thanks for the explanation. Is there an issue tracker or roadmap for what's needed to be done? I might be able to help in some ways.
This is a bit concerning since this error does not occur on my Linux test machine. Could you please try Also, do you have any insights on possible discrepancies between wazero's linux and macOS implementation which might be related to this issue? Or are you suggesting it is more likely unrelated to wazero but rather inside WebAssembly or in Go's host setup (before entering wazero)? Again, thank you so much for your time and effort. I appreciate it. |
let me first reply to the last question:
I meant that applying the patch above the programs progress but they fail later with that error. Otherwise they fail exactly as you reported.
In general, requirements are user-driven, so in our experience, so far, nobody has encountered issues in the implementation before you. What's your use case? :o) And I mean, besides "I want to use Tokio" :P You can disclose it privately if you are more comfortable; reach us out on the Gophers slack. We hang on the #wazero channel but you can DM me if you prefer :) In any case, to implement the missing part of poll_oneoff, it should:
In this particular case, as [https://www.remlab.net/op/nonblock.shtml][the Remlab article] put it,
so, my understanding is that a good implementation of a write subscription should return EAGAIN unless there is space left on the system buffer. How do you know? One way might be to simply invoke wazero/imports/wasi_snapshot_preview1/poll.go Lines 181 to 182 in 009ee70
So basically, this would require refactoring the entire |
Thanks for the clarification. That's actually what I meant, asyncrs and syncrs are expected to succeed, and I don't believe the patch will result in them failing. |
Okay, so after applying the suggested patch, the async read operation does get through. However, it is quite weird that the file descriptor inside WebAssembly will fail exactly 5 seconds after the function blocks for 5 seconds, no matter what's being done (e.g., number of bytes written), and if and only if tokio is involved. I did not reproduce the same error on asyncrs. Further, the TCP connections are still alive after the WebAssembly function call fails. Which indicates it is unlikely a problem outside wazero. Guess I will spend more time looking into this. |
hah, interesting both EDIT: ...aaand I get the same result (./wasm/syncrs.wasm) no patches applied
EDIT2: I suspect the Go runtime is involved somehow |
So are we getting split results? In conclusion on my end, Linux (Ubuntu 22.04), without the patch:
With the patch:
Edit: Bad file descriptor is local to WebAssembly, Go can still read/write to the TCP connections after WebAssembly fails. |
I just pushed an update to asynctokio2, which uses the loop correctly to infinitely read-then-write (instead of returning on the first success). |
so yeah apparently I am getting Bad file descriptor in all cases when the patch is applied. Could it be version-dependent? I am testing against go1.21.5 on macOS, go 1.21.1 on Linux |
My results are from Go 1.21.5 on Linux. |
I think I figured this out. It is due to a very trivial misuse of wazero/internal/sysfs/sock_unix.go Lines 98 to 104 in 009ee70
The code above called
Which indicates the fd obtained is rather a different fd for the "copy of the underlying os.File". Therefore it is totally possible and legitimate for the GC to free the file if no pointer to it lives anymore.
Setting finalizer for Alternatively, we might be able to use
|
Similar issue is expected to exist for TCP Listener. I found I have trouble understanding the comment... wazero/internal/sysfs/sock_unix.go Lines 17 to 38 in 009ee70
What is this
Therefore this duplication is likely to be redundant. And similarly, we will need to save the Given the fact that we are manually duplicating, saving the file seems to be more preferable. |
Btw I will be happy to open a PR addressing this, if we can settle on a solution: TCP Conn:
TCP Listener:
|
Thanks @gaukas for the detailed detective work. You are indeed correct that
So, I guess the comment on In the Windows implementation ( wazero/internal/sysfs/sock_windows.go Lines 106 to 108 in 009ee70
there is a good reason for that,
instead, the fd is never leaked, and all syscalls go through While this makes the code slightly more complicated, it might be a better solution for *NIX too: it is probably the safer version, and it comes with the added benefit that it should align a bit more the code to the Windows version. |
Does #1892 fix this? |
Hi @ncruces, no, #1892 fixes a different problem we found during troubleshooting. The actual problem is in poll_one_off, which we are currently using a bypass as @evacchi suggested without fixing the root cause. The real fix could be a part of goals under #1500. But feel free to close this issue if it duplicates #1500. |
I edited the title to better describe the actual issue in case someone had the same problem. Feel free to update if you have a better suggestion! |
Merry Christmas and Happy Holidays! 🎄
Is your feature request related to a problem? Please describe.
I have a WebAssembly binary generated by compiling from Rust, in which async I/O operations are performed using tokio.
However, the WebAssembly panics when performing async read operations:
Ref: tokio-1.35.1/src/runtime/io/driver.rs:157
I noticed this error was actually returned by
mio::Poll::poll()
(Docs), which calls toSelector::select()
(Source). So it is likely the invokedsyscall(kevent)
has triggered the OS error.Describe the solution you'd like
Since the given operation can be executed perfectly fine on at least one other popular WebAssembly runtime (wasmtime) with rest of them untested, the best option is seemingly adding the related syscall support in full in order to support complete async/nonblocking operations.
Describe alternatives you've considered
Or, it could be due to a mistake on my end for non-standard use of wazero (see Additional Context). I would be rather happy to see an existing example that works with
tokio::select
andtokio::io::AsyncReadExt::read()
.Additional context
So I am indeed doing something weird (#1857). I patched several interfaces in wazero to allow me insert
net.TCPConn
into WebAssembly with access through a file descriptor.(*Context).InsertTCPConn
Inside WebAssembly I am wrapping the fd back into a TCP connection (
std::net::TcpStream
):stdasync/lib.rs
Which works perfectly, both read and write could be performed correctly.
But it will not work if "upgraded" with tokio.
tokioasync/lib.rs
So I have made an educated guess that I have been doing what supposed to be done for my purposes. But if there are apparently better ways to insert TCP connections/sockets dynamically (i.e., after instantiation), I am open to advices.
The text was updated successfully, but these errors were encountered: