Linux's io_uring IO interface (2x performance vs libevent) #10768
Draft: lbguilherme wants to merge 88 commits into crystal-lang:master from lbguilherme:feat/io_uring
Changes from 2 commits
88 commits (all by lbguilherme)
6b26c31 feat: initial Crystal::System::IoUring
4bc3834 feat: implement connect() and linked timeouts
2c2c817 fix: require RW_CUR_POS feature
249c0de fix: arm syscall support
4161819 fix: drop uname syscall
f1350b7 fix: drop aarch64 syscall table. use arm instead.
2a7b1ef fix: io_uring error handling
09cae8e feat: add wait_readable/wait_writable
4ecfff1 fix: remove unused fields from IoUringSqeInnerFlags
dba30b1 feat: add syscall interface for io_uring
e19d264 refactor: store information within the fiber itself. This will allow …
1bbd9ac feat: provide Syscall as a public module
23b9802 fix: use pipe2 instead of pipe. aarch64 doesn't have it.
4628eda typo
c998d90 fix: mark Syscall as Experimental and don't define any by default
29d1e2f fix: amend previous commit
dacdc35 feat: Make Syscall.def_syscall a public macro
f1c2fc1 fix: docs
dc15356 Merge branch 'feature/syscall' into feat/io_uring
c717295 integrate with new syscall api
66694ff feat: integrate to eventloop
65179d7 add CI test with -Dforce_iouring
4241ed4 fix build
1236e60 fix: Crystal::Event for Windows
347be35 fix: add missing Crystal::IoUringEvent.delete
d2ab7cb remove iouring CI. Github Actions runs with Kernel 5.4
c169388 feat(io_uring): implement timeout cancelation
f595f59 feat: enable io_uring for Linux 5.4, and enable CI
c218540 Improve IoUring#get_free_index logic
686179c feat: implement a LINK_TIMEOUT replacement for Linux 5.4
9844c1d fix: redundant cast
502542a Merge remote-tracking branch 'upstream/master' into feat/io_uring
a42a542 fix: check on io_uring_register return value
49db779 fix: don't include "time", it cause load order issues with the hasher…
5a353ba Modify runtime detection to use a CRYSTAL_EVENTLOOP env
9060a2a Move IoVec to LibC
2fdd8ba Transform Crystal::EventLoop into a class
08e00e9 Merge remote-tracking branch 'upstream/master' into feat/io_uring
29e929f fix: missing Crystal::IocpEventLoop#create_timeout_event
b9d23e2 Fix win32 Crystal::EventLoop.create
3be6875 ci: use threads=1 on test_force_iouring
a45fdb0 fix: aarch64 CI (needs to build with Crystal 0.35)
1cbf50c fix: add missing aarch64 syscall numbers
4f7a6bd Merge branch 'feature/syscall' into feat/io_uring
5e52f59 fix: use LibC.mmap/munmap/close
9641194 fix: build on mac
21c717e Use ::Fiber.current.object_id
8046ead Merge branch 'master' into feat/io_uring
ed7428d fix: IoVec usage
33d391d fix: ensure syscalls are always defined inside the Syscall module
17d81fa feat: update linux syscall list
c935dfa fix: move documentation to the syscall module
3224e10 chore: refactor syscalls
cf87393 fix: code in doc comments
085599c Merge branch 'feature/syscall' into feat/io_uring
ecae4f1 chore: move io_uring syscalls to Crystal::System::Syscall
2e4fa76 Merge remote-tracking branch 'upstream/master' into feat/io_uring
4135b3d fix merge
1893808 feat: update io_uring syscall enums
c5c345b fix: Crystal::IocpEvent#add
14f22eb fix: undo changes to win32 port
8702781 fix: iocp building
8dcc770 chore: remove enums from syscall and use constants, more in line with…
5d12d1a chore: remove submit_and_forget
55cbcaf fix: keep a strong reference to all data sent into kernel to avoid GC
c734684 fix: simplify read/write fallback on older kernels
0364870 Merge remote-tracking branch 'upstream/master' into feat/io_uring
a18cadd Merge remote-tracking branch 'upstream/master' into feat/io_uring
fb72a84 fix: disable io_uring in the interpreter
8248ffe fix: correctly convert nanoseconds to timeval
2a688fa fix: use timespec64 (only makes a difference for linux32)
d9457df chore: use to_i instead of total_seconds to avoid going into float an…
ab99c93 Merge remote-tracking branch 'upstream/master' into feat/io_uring
e0b8cf3 fix: adjust wasm event loop to match updated interface
a893985 chore: improve some io_uring comments
aa3c8fe feat: add more io_uring ops from recent kernel
e57255d feat: add a few more constants for future use
e4f0793 crystal tool format
a298d0e fix: comment typo
77087cb Merge branch 'master' into feat/io_uring
bee7038 fix: fix warning by renaming time_span to timeout on some overloads o…
776b38e Merge remote-tracking branch 'upstream/master' into feat/io_uring
cb8fc96 Update src/crystal/system/unix/event_loop_io_uring.cr
3f31d89 Update src/crystal/system/unix/event_io_uring.cr
83a39d3 fix(win32): IO::Overlapped#schedule_overlapped
cf609fc fix: schedule_overlapped should accept nil argument
3c0ddb2 Merge remote-tracking branch 'upstream/master' into feat/io_uring
a226e1f feat(WIP): refactor io_uring to be safer and simpler
    @@ -1,43 +1,48 @@
     {% skip_file unless flag?(:linux) %}

     require "./io_uring"

     # :nodoc:
    -struct Crystal::IoUringEvent < Crystal::Event
    +struct Crystal::IoUring::Event < Crystal::EventLoop::Event
       enum Type
         Resume
         Timeout
         ReadableFd
         WritableFd
       end

    -  def initialize(@io_uring : Crystal::System::IoUring, @type : Type, @fd : Int32, &callback : Int32 ->)
    -    @callback = Box.box(callback)
    +  def initialize(@io_uring : Crystal::System::IoUring, @type : Type, @fd : Int32 = -1, &@callback : Int32 ->)
    +    @action_id = 0u64
       end

       def free : Nil
    -    delete
       end

       def delete : Nil
    +    return if @action_id == 0u64
    +    @io_uring.delete_completion_action(@action_id)
         if @type.timeout?
    -      @io_uring.timeout_remove(@callback)
    +      @io_uring.submit_timeout_remove(@action_id)
         end
    +    @action_id = 0u64
       end

       def add(timeout : Time::Span?) : Nil
    -    delete
    +    @action_id = @io_uring.register_completion_action(@callback)
    +
    +    timeout = nil if timeout == Time::Span::ZERO
    +
         case @type
         in .resume?, .timeout?
           if timeout
    -        @io_uring.timeout(timeout, @callback)
    +        @io_uring.submit_timeout(timeout, action_id: @action_id)
           else
    -        @io_uring.nop(@callback)
    +        @io_uring.submit_nop(action_id: @action_id)
           end
         in .readable_fd?
    -      @io_uring.wait_readable(@fd, @callback, timeout: timeout)
    +      @io_uring.submit_poll_add(@fd, Crystal::System::Syscall::POLLIN, action_id: @action_id, timeout: timeout)
         in .writable_fd?
    -      @io_uring.wait_writable(@fd, @callback, timeout: timeout)
    +      @io_uring.submit_poll_add(@fd, Crystal::System::Syscall::POLLOUT, action_id: @action_id, timeout: timeout)
         end
       end
     end
So hmm, if I understand this correctly, this makes it so that instead of going through events and wait_readable/wait_writable, it just submits a poll? I see one issue with that: when several fibers wait on the same file descriptor, it will wake every fiber that has submitted a poll instead of just waking up the first fiber. This differs from how the non-uring fiber loop works, and I'm not certain that behavior is good enough, since by the time a fiber actually gets woken up the fd may no longer be actionable because some other fiber has already acted on it.
Also known as the "thundering herd problem".
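To make the concern concrete, here is a toy simulation rather than code from the PR. `FakeFd` and `try_read` are made-up names, and the "fibers" are just labels. It assumes every waiter has its own poll registered on the same fd, so a single readiness event wakes all of them even though only one item is available.

```crystal
# Hypothetical model: an fd with a fixed number of readable items.
class FakeFd
  def initialize(@buffered : Int32)
  end

  # Returns true if this reader got data, false if another reader
  # already drained the fd.
  def try_read : Bool
    return false if @buffered == 0
    @buffered -= 1
    true
  end
end

fd = FakeFd.new(1) # exactly one item becomes readable
waiters = ["fiber A", "fiber B", "fiber C"]

# One poll per waiter: the same readiness event wakes all three,
# and each of them then tries to act on the fd.
results = waiters.map { |name| {name, fd.try_read} }

results.each do |result|
  outcome = result[1] ? "read data" : "woken for nothing"
  puts "#{result[0]}: #{outcome}"
end
```

In this model only the first waiter finds data; the other two wake-ups are wasted, which is the behavior being questioned.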
Right now the goal is to mimic the exact same behavior as libevent. The `action_id` there points to a callback (see `create_fd_write_event`). It will call `io.resume_write`, which will wake only one single fiber.

If the `action_id` were a Fiber, then it would wake up that fiber upon completion. That is more suitable for other operations like read/write/connect/send/etc. Currently we only do poll with a callback.
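A minimal sketch of that idea, under stated assumptions: `CompletionActions`, `register`, `complete`, and the waiter queue are hypothetical stand-ins and not the PR's implementation. It shows an id that maps to a callback, where the callback (playing the role of `io.resume_write`) resumes at most one waiter per completion.

```crystal
# Hypothetical registry mapping action ids to completion callbacks.
# The id 0 is never handed out, matching the diff's use of 0u64 as "unset".
class CompletionActions
  def initialize
    @next_id = 1_u64
    @actions = {} of UInt64 => Proc(Int32, Nil)
  end

  def register(callback : Proc(Int32, Nil)) : UInt64
    id = @next_id
    @next_id += 1
    @actions[id] = callback
    id
  end

  # Called when the kernel reports a completion for `id` (assumed flow).
  def complete(id : UInt64, result : Int32) : Nil
    @actions.delete(id).try &.call(result)
  end
end

waiters = Deque(String).new
waiters << "fiber A" << "fiber B"
woken = [] of String

# Stand-in for io.resume_write: wake only the first queued waiter.
resume_one = ->(result : Int32) {
  if name = waiters.shift?
    woken << name
  end
  nil
}

actions = CompletionActions.new
id = actions.register(resume_one)
actions.complete(id, 0) # a single poll completion arrives

p woken
```

With one poll per event and a callback that dequeues a single waiter, a completion wakes one fiber and leaves the rest queued, which is how this sketch avoids the herd.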