In-memory pipe implementation #12238
Indeed, this is needed for … In #5430 I mentioned I had some code for this. I posted it in #12245 to show where I was going with it. I also have another implementation that isn't based on …

One thing that might be nice is to allow the buffer to be elastic: shrink it back down when the size:capacity ratio drops below some threshold. I have no idea whether that's feasible or what the real-world performance looks like, but a user-space queue won't yield the CPU on I/O because we're not doing any kernel-level I/O, so a whole lot of writes can happen before a single read does, potentially ballooning memory until OOM. Another mitigation strategy for that is to replace the …
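A standard mitigation for the unbounded-growth concern above is backpressure: cap the buffer and suspend the writer when it's full. A minimal sketch (the `ChunkPipe` type and its methods are hypothetical, not a proposed API) using Crystal's bounded `Channel`:

```crystal
# Illustrative sketch only: bound memory by pushing chunks through a
# *bounded* Channel, so a fast writer suspends instead of growing the
# buffer until OOM.
class ChunkPipe
  def initialize(capacity = 64)
    # `send` blocks once `capacity` chunks are queued: backpressure.
    @chunks = Channel(Bytes).new(capacity)
  end

  def write(slice : Bytes) : Nil
    copy = Bytes.new(slice.size)
    slice.copy_to copy    # copy so the caller may reuse its buffer
    @chunks.send copy     # fiber suspends here when the channel is full
  end

  def read : Bytes?
    @chunks.receive?      # nil once the channel is closed and drained
  end

  def close : Nil
    @chunks.close
  end
end
```

With this shape, memory in flight is bounded by `capacity` chunks regardless of how far the writer outpaces the reader; an elastic buffer would instead grow and later shrink, which is a different tradeoff.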
We'll need to handle concurrent access correctly, so using a channel would be a good idea.
If I stick with the …
Yeah, Channel does that for you already.
Yes, but that's not the only difference between the two. The way you're talking about it implies you are favoring one implementation based on a single factor.
Managing concurrent access is hard, so I would start with channels, just because it's easier. And there's no point in reimplementing that just for the sake of it.
There is more to what I said than “for the sake of it”.
I pointed out tradeoffs in this comment.
Forgive me, I read that comment as a perfect argument for using channels.
I wasn't making a case one way or the other, really. Just offering discussion. I don't think there was enough information at the time to draw any conclusions about a preferred implementation. There may still not be even after this post.
Basing a decision on my assumption would be a mistake, certainly. I would never suggest making decisions about performance without benchmarks. But I'd argue that your dismissal of my note about performance is a mistake as well. And while I agree that correctness is more important than performance, channels provide little to no opportunity for optimization here. The API design explicitly prohibits that: …
These were all intentional design choices for channels. So your comment about optimizing a channel-based implementation later seems a lot more optimistic than reality allows. I wrote up a quick benchmark passing 8KB at a time through a channel vs a deque, plus the current …
Benchmark code:

```crystal
require "benchmark"

size = 8192
channel = Channel(UInt8).new(size)
queue = Deque(UInt8).new(size)
p_read, p_write = IO.pipe(false, false)
slice = Bytes.new(size, '.'.ord.to_u8)

Benchmark.ips do |x|
  x.report "channel" do
    size.times { |i| channel.send slice[i] }
    size.times do |i|
      select
      when byte = channel.receive?
        if byte
          slice[i] = byte
        else
          break
        end
      else
      end
    end
  end

  mutex = Mutex.new
  x.report "deque" do
    mutex.synchronize do
      queue.copy_from slice
      queue.copy_to slice
      queue.pop(size)
    end
  end

  x.report "FD pipe" do
    p_write.write slice
    p_read.read slice
  end
end

class Deque(T)
  def copy_from(slice : Slice(T))
    if @buffer.null?
      initialize slice.size
    end

    if @start == 0
      slice.copy_to @buffer, @capacity
    else
      slice.copy_to @buffer + @start, @capacity - @start
      slice.copy_to @buffer, @start - @capacity
    end
  end

  def copy_to(slice : Slice(T))
    if @start == 0
      slice.copy_from @buffer, @capacity
    else
      slice.copy_from @buffer + @start, @capacity - @start
      slice.copy_from @buffer, @start - @capacity
    end
  end
end
```

Since each operation here is 8KB, that's: …
This means passing bytes over a …

Getting the …

Maybe a similar optimization can be done for channels (since they're implemented on top of deques), and I'd be curious whether it's feasible without bypassing a lot of …

TL;DR: I'd caution against zeroing in on a preferred implementation with too little information considered.
Regarding the blocking issues in #8152: isn't that using a pipe created differently from …?

Another example of where pipes are currently used is multithreaded Crystal, to enqueue fibers between threads.
Regarding the benchmark: FWIW, …
The purpose of this discussion is to have an implementation that doesn't use file descriptors at all, for the same reason you might use …
Please re-read the part of that post where I mentioned why …
If you think it's not up to scratch, feel free to show me how it's done.
Please see this comment. You're more than welcome to keep using the implementation backed by file descriptors. It isn't going to be removed.
This issue stemmed from the fact that the use of non-blocking …
Two years later, …
This issue has been mentioned on Crystal Forum. There might be relevant details there: |
This discussion came up in #5430 (comment)
`IO.pipe` is currently implemented based on file descriptors, which has a couple of downsides: …

A user-mode implementation would avoid these problems and be more performant because there's no switch to the kernel.

It can't be used for inter-process communication, though, so we should retain a pipe implementation based on file descriptors for that use case.
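To make the idea concrete, here is a rough user-mode sketch (not the implementation from #12245; the `MemoryPipe` name and structure are illustrative only): an `IO` backed by a `Mutex`-guarded `Deque`, so no read or write ever crosses into the kernel. Blocking a reader on an empty buffer, and any capacity limit, are omitted for brevity.

```crystal
# Illustrative sketch of a user-mode pipe: all data stays in process
# memory, guarded by a Mutex for concurrent access. No fiber-wakeup or
# backpressure logic is included.
class MemoryPipe < IO
  def initialize
    @buffer = Deque(UInt8).new
    @mutex = Mutex.new
    @closed = false
  end

  def write(slice : Bytes) : Nil
    @mutex.synchronize do
      raise IO::Error.new("closed pipe") if @closed
      slice.each { |byte| @buffer.push byte }
    end
  end

  def read(slice : Bytes) : Int32
    @mutex.synchronize do
      count = Math.min(slice.size, @buffer.size)
      count.times { |i| slice[i] = @buffer.shift }
      count # 0 here means "nothing buffered"; a real pipe would block
    end
  end

  def close : Nil
    @mutex.synchronize { @closed = true }
  end
end
```

A channel-backed variant would get the concurrency handling for free, at the cost of the per-element overhead discussed in the benchmark above; this mutex-plus-deque shape is what makes bulk `copy_from`/`copy_to` style optimizations possible.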