Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add user-space IO.pipe #12245

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jgaskins
Copy link
Contributor

This is very much a WIP (it doesn't even compile yet for make std_spec), but the intent here is to implement #12238.

Since Process and Signal currently depend on the file-descriptor-based implementation, we need to keep that functionality, but for most usage of it today (in the stdlib: IO::Stapled and Crystal::FiberChannel) We already have some of the right abstractions to hook up OS-level pipes to since we have IO::FileDescriptor. IO.pipe currently returns a pair of those, so we can actually hang that on IO::FileDescriptor directly:

read, write = IO::FileDescriptor.pipe # OS-level pipe
read, write = IO.pipe                 # User-space pipe

One nice thing is that, since IO::FileDescriptor inherits from IO, you can call IO::FileDescriptor.pipe today and get that functionality.

Closes #12238

@asterite
Copy link
Member

Note that because of backwards compatibility reasons, we can't change IO.pipe. It must be the other way around: we must add a new name that returns an in-memory pipe. For example IO.memory_pipe

@jgaskins
Copy link
Contributor Author

Note that because of backwards compatibility reasons, we can't change IO.pipe.

I wondered about this, since IO.pipe exposes the concrete types it returns.

@asterite
Copy link
Member

I wondered about this, since IO.pipe exposes the concrete types it returns

It's not about the type. People are probably using IO.pipe for inter-process communication. If we change that to a memory pipe it's going to break their code.

@jgaskins
Copy link
Contributor Author

Ah, that I hadn’t considered. Do you have any examples? I’m wondering if there’s a way we can make that work by stitching the IO::Pipe together with an FD-based pipe.

@asterite
Copy link
Member

I don't have any examples, but I'm sure people are using IO.pipe like that in the wild.

@jgaskins
Copy link
Contributor Author

It’s not that I don’t believe you. 😄 What I mean is that Process#stdio_to_fd seems to allow for non-FD-based IO by stitching it to an FD-based IO that it can pass to the other process. Am I reading that correctly? Are there other use cases to consider?

@asterite
Copy link
Member

I mean Process.fork

@asterite
Copy link
Member

#4350 (comment)

@jgaskins
Copy link
Contributor Author

Oh, I see. Since fork isn’t a documented method anymore, is this still a concern? It seems like the only documented interface is via Process methods that don’t rely on closures implicitly passing FDs to child processes.

@asterite
Copy link
Member

More of a question for @straight-shoota or @beta-ziliani

In my opinion we shouldn't have kept fork public. We should only have used it internally for Process.run. But I think it's public API and we can't undo that.

So I guess it all depends on whether we want to keep true backwards compatibility or not.

For instance, there's no fork in Go. I don't know why we decided to keep it for Crystal.

@jgaskins
Copy link
Contributor Author

More of a question for @straight-shoota or @beta-ziliani

That’s fair. And I may be projecting my own assumptions about what public API does and does not mean. Specifically, fork does not appear anywhere in the documentation, so I would expect it not to be supported. There’s also some discussion on that in #9136.

Copy link
Contributor

@yxhuvud yxhuvud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of if this is a bad idea or not, Crystal::FiberChannel also need to use the OS implementation.

BTW, this implementation also misses a very fundamental thing is that it does not block when there is nothing to read. It also have an infinite buffer, which is something real system pipes do not have - they start to block if you start to push a lot of data that doesn't get read.

So I'd say this isn't even an implementation of pipe. Like, at all.

Comment on lines +21 to +29
while index < slice.size
# TODO: Implement read blocking
if byte = @buffer.shift?
slice[index] = byte
count = (index += 1)
else
break
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the really big selling points of using a pipe over a raw deque or something is that it is threadsafe. This mean that all modifications to internal state need to be protected somehow. Probably by a mutex unless someone feels like implementing a lockfree queue implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this comment.

Comment on lines +35 to +37
slice.each do |byte|
@buffer << byte
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this really, really needs to be threadsafe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this comment.

Comment on lines +44 to +47
bytes = Bytes.new(@buffer.size)
@buffer.each_with_index do |byte, index|
bytes[index] = byte
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This make no sense. Since when do peek return a copy of the whole buffer?

(and it also needs thread safety)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This make no sense. Since when do peek return a copy of the whole buffer?

This is a draft and I mentioned in the PR description that this is still very much a WIP.

(and it also needs thread safety)

Please see this comment.

@jgaskins
Copy link
Contributor Author

Regardless of if this is a bad idea or not, Crystal::FiberChannel also need to use the OS implementation.

What's stopping it from using the implementation based on IO::FileDescriptor?

BTW, this implementation also misses a very fundamental thing is that it does not block when there is nothing to read. It also have an infinite buffer, which is something real system pipes do not have - they start to block if you start to push a lot of data that doesn't get read.

It's not missing it. It's just not implemented yet. This is a draft.

So I'd say this isn't even an implementation of pipe. Like, at all.

How is this contributing to the conversation? Like, at all?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In-memory pipe implementation
4 participants