-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pipe.send occasionally hangs when submitting a large job to child process #17
Comments
Interesting. I've written a reproducible example.
This works for the standard library, and hangs for this library. I think my understanding of pipes is limited. Are you saying that if there's lots of data in the pipe sent by the parent which the child hasn't read, then when the child sends data back to the parent, it gets blocked? Is I suppose we could add some kind of buffer in the child, to read payloads into when this happens. Could we get a race condition? e.g. suppose the child wants to send a response back to the parent. It notices the pipe is full, so reads from the pipe to prematurely get the argument for the next task. Then between that happening and the child sending the response back down the pipe for the previous task, the parent writes a third argument into the pipe, so now the child is still blocked. What if instead of using one pipe bidirectionally, we use two pipes, unidirectionally? So if the pipe from parent to child fills up, the pipe from child to parent is still free? (Although maybe then they can both hit a bottleneck where they are writing to a full pipe?) |
I've had a think about this overnight. I could be wrong, but I think that actually a bidirectional pipe has a separate buffer in each direction. They can fill up separately. In this issue the pipe fills up in both directions. If I change this mwe to send big data and receive a small response, it doesn't hang. If I send small data and receive a big response, it doesn't hang. So the issue is that the child process is halfway through sending the response, waiting for the parent to read from the buffer on the return direction, and the parent is halfway through sending the next argument, waiting for the child to read from the buffer on the initial direction. That's the deadlock, and I'm not sure how to resolve it. Perhaps we could use locks for each pipe (one lock for both of the directions) to make each send/recv atomic, and the child has a cache of unprocessed work from the parent and unsent results. Although locking seems like overkill, and if the work is slow, the parent might hog the lock and just keep sending data, and the child never has a chance to send any back, so the child process fills up with too much memory. Do you have any other ideas? Here's another MWE. Here we're not using this lambda_multiprocessing library. It's just two processes which each try to send a large amount of data to the other, then receive a large amount. They deadlock.
|
So I did some more research into pipes and they do have separate read/write buffers. However, if the write buffer fills up,
Both the parent and the child are now blocked on their I tried adding a
Since the parent and child are different processes, I'm not sure there's a way to ensure that the parent doesn't write to the pipe between the child calling I took a look at the multiprocessing Pipe code and it looks like it uses
It's a bit annoying to read, but the "500000" are microseconds, which translates to a 0.5s timeout. I set I used the following logic for
edit: hit enter too soon... There is still an issue where if a write times out, partial writes have already been made, meaning that the other side reads the partial data and hangs waiting for the rest of it:
Not sure how to fix this one, still thinking about it... edit 2: could probably be fixed by adding a timeout to recv too. duh |
Thanks. I spent some more time on this too. (It's an interesting problem. Although I really should be working on my university assignments instead of this.) I had a similar idea like yours. I think the problem is that if you do (as you suggested):
The problem is that step 4 is not atomic. If the child notices that what it sends is not being read, so it reads the next payload written by the parent, it's likely that as soon as that payload is read and the buffer is empty, the parent will immediately send another payload for the next task. So when the child writes the next result, we end up with the same deadlock. With your timeout solution, I'm also worried about performance. Suppose there are 1000 tasks which should take 10ms each. For a large number of tasks, with the current implementation the parent will try to immediately write all 1000 payloads into the pipes before reading anything from the child. So this deadlock situation is actually quite likely (for large data or large number of payloads), not an edge case. You'd end up taking 0.5 seconds on most task executions ( I think the key challenge is that the parent is trying to give the next task to a child before it has finished working on the previous task. I played around with modifying how the parent sends data. I wrote code to chunk up the pickled object into bytes equal to buffer size (which is difficult to measure/set), and had the parent poll the data coming from the child in between each chunk it writes to the child (and call .recv() if there's something there). This still resulted in some deadlocks, but I'm not sure why. Another idea I had was to write the actual payloads to a temporary file, and then pass a file handler (or file path). This reduces the size of data sent through the pipe. However for a sufficiently large number of (small) payloads, you could still fill up the buffer. I came up with a solution for For the async ones, since the |
I think I got this working with both
In
And flush:
There are three outcomes from
In the case of large jobs, there are two outcomes to a
I think that the I haven't run into any deadlocks with this code so far but I'm going to continue testing locally to see if I run into any. Thoughts? |
Thanks for your help on this. I still suspect there might still be a deadlock issue with this solution:
I don't have much time at the moment to work on this. I noticed that the behavior of map_async and starmap_async in this library do not match the standard library. They should return a single I've pushed a new branch (see #19) with these new tests, and my proposed solution for |
I actually don't think that should cause a deadlock. Step 4 has two possible outcomes:
For me it's urgent-ish? I'm also happy to make the changes myself and open a PR for them, if you don't have the time. I noticed that too (I was going to make a ticket for it, but I forgot...). I'm actually using the async functions but currently I just changed my code to work with a list of results. |
I've tried to implement your solution in #20. I get an error.
I think I'm not setting the timeout where you wanted to set it. I also tried to implement my proposed solution with an additional process in the middle that just acts as a buffer, reading all payloads from the parent before sending any to the child. That's more complicated than I thought, but still possible. (I've run out of time tonight though.) Since you're in a hurry, I could just merge #19 tomorrow, which fixes this deadlock bug for the non-async calls (but the bug is still present for async). If you need this bug resolved for async urgently, you could perhaps write a workaround to write each data payload to a temporary file (e.g. explicitly write to Note that with your timeout solution, I think that |
I had an idea for a solution that doesn't involve timeouts, middleman processes, temporary file etc. It also means that the The idea is that we send tasks to children in batches. Inside It's relatively simple. One concern I had about the timeout solution was that the code for So I've just implemented that simple batching solution. It passes the tests, and I've got no more time left to spend on this right now. I've released it now as version |
That error is interesting. I was setting my timeouts before I think that I'll definitely pull your new version and see how it goes for my use case. Fingers crossed that it will "just work" - I'll let you know if I run into any more issues. |
This is an intermittent issue that seems to happen in the following circumstances:
I believe this is because step (3) fills up the buffer in the pipe and therefore blocks until the data is read. Meanwhile, step (4) tries to write to the pipe with the full buffer and also blocks, waiting to be read.
The child process should check if there is data in the pipe (and read it if so) before sending the job results back to the parent.
The text was updated successfully, but these errors were encountered: