-
-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle incoming resolve messages. #480
Conversation
Still TODO: - We need to handle disembargos with target = (importedCap = ...). - Testing.
The := below is sufficient, since this isn't actually used before that.
Needs testing.
This caught a bug: we need to check if ClientState.Metadata is nil, which is possible if the client itself is null, and seems to happen to the promise after it is resolved. TODO: we should de-dup the logic between releaseExport and releaseExports. This is not entirely trivial though, because the latter is executed after we've wiped the exports table.
Alright, this is done. This basically completes the promise resolution detour from 3PH, Though there's a semantic thing that I don't like, which is that Fulfill(client) feels like it should take ownership of client, but doesn't. I may submit another PR changing this before moving on. |
I'm very much in favor of taking the time to make reference ownership semantics consistent. 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Test is timing out. Re-running. |
Note that there does appear to be a real deadlock that needs to be addressed; the test failures are not spurious. |
These are much cleaner.
...now that the resolver takes ownership.
Status update: there are at least two bugs, one which is a deadlock, and another that is a message-reordering problem. The deadlock is "solved" by this change to the test:
So the problem I think arises when a promise resolves to another promise (in this case a question; not sure if that matters). I'm attaching a log for the ordering problem. Need to dig in still. |
Solved the deadlock (kinda... see the commit message), still need to work out the ordering problem. |
Worryingly, this was manifesting in the test as a deadlock: we hit the error complaining about it not being an import, but then connection shutdown hung, waiting on tasks. I haven't pinned down exactly what was going on there, but this sidesteps the issue by fixing the thing that was causing a connection abort in the first place.
@zenhack Do we have an issue open for the shutdown bug? |
Not that I know of; I have no memory of having seen something like this before. |
I merged back in main, and also reworked the way NewLocalPromise works to avoid some dodgy code that I think has race conditions. I held out some hope that this would fix the ordering bug, but it seems to have no effect. |
Discussed some on matrix, but for posterity: it's not clear to me that the tribble 4-way race condition actually requires three distinct vats, so it could be a problem even in a level 1 implementation, if each link in the promise chain crosses the network. The failure we're seeing matches this description, so I am wondering if it is a symptom of the 4-way race condition. The flow looks something like:
I'm going to put this PR down until I've more deeply understood the implications. I still haven't totally grokked what exactly goes wrong in the 4-way race condition, only that there is an assumption being violated (i.e. I don't know what an example failure due to this issue should look like). I plan on figuring this out, and probably also stepping back to fix the way we do promise resolution in the library; regardless of whether this hunch is correct, we'll need to for 3PH. |
Per #494 (comment), my suspicion was correct: 3PH is not required for the race, and it seems like a good bed that's what's causing the failure. So I'll work on that refactor before coming back to this. |
...instead of a Client. This is a step towards addressing the tribble 4-way race condition. Currently there are some test failures.
The call to waitRef.Release() in export.go was resolving the receiver too soon, resulting in a leak.
I've merged in main and #520, still reconciling the latter. |
Has build errors that need fixing.
I plan on adding *Snapshot versions and eventually storing snapshots internally.
This seems like an unnecessary complication, and I don't want to deal with it when porting stuff over to snapshots.
...and use it to catch a bug; the test is failing, need to track it down.
Otherwise the target message steals the caps, resulting in errors later when we try to read them.
Per capnproto#523, this is currently failing frequently. Mark it as flaky until we're ready to actually deal with it.
In addition to letting us hand out borrowed references, Storing the snapshots will facilitate code that needs them not to resolve further after AddSnapshot() or SetSnapshot().
There are test failures that need debugging though.
Superseded by #530, closing. |
Still TODO: