Linux's io_uring IO interface (2x performance vs libevent) #10768

Draft
wants to merge 88 commits into
base: master
Changes from 2 commits
Commits
6b26c31
feat: initial Crystal::System::IoUring
lbguilherme May 30, 2021
4bc3834
feat: implement connect() and linked timeouts
lbguilherme Jun 1, 2021
2c2c817
fix: require RW_CUR_POS feature
lbguilherme Jun 1, 2021
249c0de
fix: arm syscall support
lbguilherme Jun 2, 2021
4161819
fix: drop uname syscall
lbguilherme Jun 2, 2021
f1350b7
fix: drop aarch64 syscall table. use arm instead.
lbguilherme Jun 2, 2021
2a7b1ef
fix: io_uring error handling
lbguilherme Jun 2, 2021
09cae8e
feat: add wait_readable/wait_writable
lbguilherme Jun 2, 2021
4ecfff1
fix: remove unused fields from IoUringSqeInnerFlags
lbguilherme Jun 2, 2021
dba30b1
feat: add syscall interface for io_uring
lbguilherme Jun 2, 2021
e19d264
refactor: store information within the fiber itself. This will allow …
lbguilherme Jun 2, 2021
1bbd9ac
feat: provide Syscall as a public module
lbguilherme Jun 3, 2021
23b9802
fix: use pipe2 instead of pipe. aarch64 doesn't have it.
lbguilherme Jun 3, 2021
4628eda
typo
lbguilherme Jun 3, 2021
c998d90
fix: mark Syscall as Experimental and don't define any by default
lbguilherme Jun 4, 2021
29d1e2f
fix: amend previous commit
lbguilherme Jun 4, 2021
dacdc35
feat: Make Syscall.def_syscall a public macro
lbguilherme Jun 4, 2021
f1c2fc1
fix: docs
lbguilherme Jun 4, 2021
dc15356
Merge branch 'feature/syscall' into feat/io_uring
lbguilherme Jun 5, 2021
c717295
integrate with new syscall api
lbguilherme Jun 5, 2021
66694ff
feat: integrate to eventloop
lbguilherme Jun 5, 2021
65179d7
add CI test with -Dforce_iouring
lbguilherme Jun 5, 2021
4241ed4
fix build
lbguilherme Jun 5, 2021
1236e60
fix: Crystal::Event for Windows
lbguilherme Jun 5, 2021
347be35
fix: add missing Crystal::IoUringEvent.delete
lbguilherme Jun 5, 2021
d2ab7cb
remove iouring CI. Github Actions runs with Kernel 5.4
lbguilherme Jun 5, 2021
c169388
feat(io_uring): implement timeout cancelation
lbguilherme Jun 5, 2021
f595f59
feat: enable io_uring for Linux 5.4, and enable CI
lbguilherme Jun 5, 2021
c218540
Improve IoUring#get_free_index logic
lbguilherme Jun 5, 2021
686179c
feat: implement a LINK_TIMEOUT replacement for Linux 5.4
lbguilherme Jun 6, 2021
9844c1d
fix: redundant cast
lbguilherme Jun 6, 2021
502542a
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Jun 6, 2021
a42a542
fix: check on io_uring_register return value
lbguilherme Jun 6, 2021
49db779
fix: don't include "time", it cause load order issues with the hasher…
lbguilherme Jun 6, 2021
5a353ba
Modify runtime detection to use a CRYSTAL_EVENTLOOP env
lbguilherme Jun 6, 2021
9060a2a
Move IoVec to LibC
lbguilherme Jun 8, 2021
2fdd8ba
Transform Crystal::EventLoop into a class
lbguilherme Jun 8, 2021
08e00e9
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Jun 8, 2021
29e929f
fix: missing Crystal::IocpEventLoop#create_timeout_event
lbguilherme Jun 8, 2021
b9d23e2
Fix win32 Crystal::EventLoop.create
lbguilherme Jun 8, 2021
3be6875
ci: use threads=1 on test_force_iouring
lbguilherme Jun 8, 2021
a45fdb0
fix: aarch64 CI (needs to build with Crystal 0.35)
lbguilherme Jun 8, 2021
1cbf50c
fix: add missing aarch64 syscall numbers
lbguilherme Jun 8, 2021
4f7a6bd
Merge branch 'feature/syscall' into feat/io_uring
lbguilherme Jun 8, 2021
5e52f59
fix: use LibC.mmap/munmap/close
lbguilherme Jun 8, 2021
9641194
fix: build on mac
lbguilherme Jun 8, 2021
21c717e
Use ::Fiber.current.object_id
lbguilherme Jun 12, 2021
8046ead
Merge branch 'master' into feat/io_uring
lbguilherme Jun 29, 2021
ed7428d
fix: IoVec usage
lbguilherme Jun 30, 2021
33d391d
fix: ensure syscalls are always defined inside the Syscall module
lbguilherme Nov 6, 2021
17d81fa
feat: update linux syscall list
lbguilherme Nov 6, 2021
c935dfa
fix: move documentation to the syscall module
lbguilherme Nov 8, 2021
3224e10
chore: refactor syscalls
lbguilherme Nov 9, 2021
cf87393
fix: code in doc comments
lbguilherme Nov 10, 2021
085599c
Merge branch 'feature/syscall' into feat/io_uring
lbguilherme Nov 10, 2021
ecae4f1
chore: move io_uring syscalls to Crystal::System::Syscall
lbguilherme Nov 10, 2021
2e4fa76
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Nov 10, 2021
4135b3d
fix merge
lbguilherme Nov 10, 2021
1893808
feat: update io_uring syscall enums
lbguilherme Nov 10, 2021
c5c345b
fix: Crystal::IocpEvent#add
lbguilherme Nov 10, 2021
14f22eb
fix: undo changes to win32 port
lbguilherme Nov 10, 2021
8702781
fix: iocp building
lbguilherme Nov 10, 2021
8dcc770
chore: remove enums from syscall and use constants, more in line with…
lbguilherme Nov 17, 2021
5d12d1a
chore: remove submit_and_forget
lbguilherme Nov 17, 2021
55cbcaf
fix: keep a strong reference to all data sent into kernel to avoid GC
lbguilherme Nov 18, 2021
c734684
fix: simplify read/write fallback on older kernels
lbguilherme Nov 18, 2021
0364870
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Nov 18, 2021
a18cadd
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Jan 12, 2022
fb72a84
fix: disable io_uring in the interpreter
lbguilherme Jan 12, 2022
8248ffe
fix: correctly convert nanoseconds to timeval
lbguilherme Jan 12, 2022
2a688fa
fix: use timespec64 (only makes a difference for linux32)
lbguilherme Jan 12, 2022
d9457df
chore: use to_i instead of total_seconds to avoid going into float an…
lbguilherme Jan 13, 2022
ab99c93
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Mar 30, 2022
e0b8cf3
fix: adjust wasm event loop to match updated interface
lbguilherme Apr 3, 2022
a893985
chore: improve some io_uring comments
lbguilherme May 23, 2022
aa3c8fe
feat: add more io_uring ops from recent kernel
lbguilherme May 23, 2022
e57255d
feat: add a few more constants for future use
lbguilherme May 23, 2022
e4f0793
crystal tool format
lbguilherme May 23, 2022
a298d0e
fix: comment typo
lbguilherme May 24, 2022
77087cb
Merge branch 'master' into feat/io_uring
lbguilherme Sep 6, 2022
bee7038
fix: fix warning by renaming time_span to timeout on some overloads o…
lbguilherme Oct 20, 2022
776b38e
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Oct 20, 2022
cb8fc96
Update src/crystal/system/unix/event_loop_io_uring.cr
lbguilherme Oct 20, 2022
3f31d89
Update src/crystal/system/unix/event_io_uring.cr
lbguilherme Oct 20, 2022
83a39d3
fix(win32): IO::Overlapped#schedule_overlapped
lbguilherme Oct 20, 2022
cf609fc
fix: schedule_overlapped should accept nil argument
lbguilherme Oct 20, 2022
3c0ddb2
Merge remote-tracking branch 'upstream/master' into feat/io_uring
lbguilherme Oct 29, 2022
a226e1f
feat(WIP): refactor io_uring to be safer and simpler
lbguilherme Nov 9, 2022
2 changes: 1 addition & 1 deletion .github/workflows/linux.yml
@@ -113,7 +113,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Download Crystal source
-        uses: actions/checkout@v2
+        uses: actions/checkout@v3

       - name: Prepare System
         run: bin/ci prepare_system
2 changes: 1 addition & 1 deletion src/crystal/scheduler.cr
@@ -12,7 +12,7 @@ require "crystal/system/thread"
 # Only the class methods are public and safe to use. Instance methods are
 # protected and must never be called directly.
 class Crystal::Scheduler
-  protected getter event_loop = Crystal::EventLoop.create
+  @event_loop = Crystal::EventLoop.create

   def self.event_loop
     Thread.current.scheduler.@event_loop
21 changes: 10 additions & 11 deletions src/crystal/system/event_loop.cr
@@ -11,25 +11,24 @@ abstract class Crystal::EventLoop
   {% end %}

   # Create a new resume event for a fiber.
-  abstract def create_resume_event(fiber : Fiber) : Crystal::Event
+  abstract def create_resume_event(fiber : Fiber) : Event

   # Creates a timeout_event.
-  abstract def create_timeout_event(fiber : Fiber) : Crystal::Event
+  abstract def create_timeout_event(fiber : Fiber) : Event

   # Creates a write event for a file descriptor.
-  abstract def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
+  abstract def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Event

   # Creates a read event for a file descriptor.
-  abstract def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
-end
+  abstract def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Event

-# :nodoc:
-abstract struct Crystal::Event
-  # Frees the event.
-  abstract def free : Nil
+  abstract struct Event
+    # Frees the event.
+    abstract def free : Nil

-  # Adds a new timeout to this event.
-  abstract def add(timeout : Time::Span?) : Nil
+    # Adds a new timeout to this event.
+    abstract def add(timeout : Time::Span?) : Nil
+  end
 end

 {% if flag?(:wasi) %}
25 changes: 15 additions & 10 deletions src/crystal/system/unix/event_io_uring.cr
@@ -1,43 +1,48 @@
{% skip_file unless flag?(:linux) %}

require "./io_uring"

# :nodoc:
struct Crystal::IoUringEvent < Crystal::Event
struct Crystal::IoUring::Event < Crystal::EventLoop::Event
enum Type
Resume
Timeout
ReadableFd
WritableFd
end

def initialize(@io_uring : Crystal::System::IoUring, @type : Type, @fd : Int32, &callback : Int32 ->)
@callback = Box.box(callback)
def initialize(@io_uring : Crystal::System::IoUring, @type : Type, @fd : Int32 = -1, &@callback : Int32 ->)
@action_id = 0u64
end

def free : Nil
delete
end

def delete : Nil
return if @action_id == 0u64
@io_uring.delete_completion_action(@action_id)
if @type.timeout?
@io_uring.timeout_remove(@callback)
@io_uring.submit_timeout_remove(@action_id)
end
@action_id = 0u64
end

def add(timeout : Time::Span?) : Nil
delete
@action_id = @io_uring.register_completion_action(@callback)

timeout = nil if timeout == Time::Span::ZERO

case @type
in .resume?, .timeout?
if timeout
@io_uring.timeout(timeout, @callback)
@io_uring.submit_timeout(timeout, action_id: @action_id)
else
@io_uring.nop(@callback)
@io_uring.submit_nop(action_id: @action_id)
end
in .readable_fd?
@io_uring.wait_readable(@fd, @callback, timeout: timeout)
@io_uring.submit_poll_add(@fd, Crystal::System::Syscall::POLLIN, action_id: @action_id, timeout: timeout)
in .writable_fd?
@io_uring.wait_writable(@fd, @callback, timeout: timeout)
@io_uring.submit_poll_add(@fd, Crystal::System::Syscall::POLLOUT, action_id: @action_id, timeout: timeout)
Contributor

So hmm, if I understand this correctly, this makes it so that instead of going through events and wait_readable/wait_writable it just submits a poll? I see one issue with that: when several fibers wait on the same file descriptor, it will wake every fiber that has submitted a poll instead of just the first one. This differs from how the non-uring fiber loop works, and I'm not certain that behavior is good enough, as it may mean that by the time a fiber actually gets woken up the fd is no longer actionable because some other fiber has already acted on it.

Also known as the "thundering herd" problem.

Contributor Author

@lbguilherme Nov 9, 2022

Right now the goal is to mimic the exact same behavior as libevent. The action_id there points to a callback (see create_fd_write_event). That callback calls io.resume_write, which wakes only a single fiber.

If the action_id were a Fiber, then it would wake up that fiber upon completion. That is more suitable for other operations like read/write/connect/send/etc. Currently we only do poll with a callback.
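
A minimal sketch of that indirection, in plain Crystal with hypothetical stand-ins (`ActionRegistry` and `FakeIO` are illustrative names, not types from this PR). It assumes each submitted poll is tied to one registered callback, and that `resume_write` behaves like the single-waiter dequeue modeled here:

```crystal
# Hypothetical stand-in for the action_id -> callback bookkeeping.
class ActionRegistry
  @actions = {} of UInt64 => Proc(Int32, Nil)
  @next_id = 1_u64

  # Stores the callback and returns its id (0 is left free to mean "no action").
  def register(&callback : Int32 -> Nil) : UInt64
    id = @next_id
    @next_id += 1
    @actions[id] = callback
    id
  end

  # Runs and forgets the callback for a completion, if it is still registered.
  def complete(id : UInt64, res : Int32) : Nil
    @actions.delete(id).try &.call(res)
  end
end

# Hypothetical stand-in for an evented IO with a queue of waiting writers.
class FakeIO
  getter woken = [] of String
  @writers = Deque(String).new

  def wait_writable(name : String) : Nil
    @writers << name
  end

  # Wakes at most one waiter per event (the assumed resume_write behavior).
  def resume_write : Nil
    if name = @writers.shift?
      @woken << name
    end
  end
end

registry = ActionRegistry.new
io = FakeIO.new
io.wait_writable("fiber A")
io.wait_writable("fiber B")

# One poll submission maps to one callback, not to one callback per waiter.
action_id = registry.register { |_res| io.resume_write }
registry.complete(action_id, 0)
p io.woken
```

Two waiters are queued here but only one completion is delivered, so only the first waiter is recorded as woken. The second stays queued until another event fires.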

end
end
end
4 changes: 2 additions & 2 deletions src/crystal/system/unix/event_libevent.cr
@@ -5,7 +5,7 @@ require "./lib_event2"
 {% end %}

 # :nodoc:
-struct Crystal::LibEventEvent < Crystal::Event
+struct Crystal::LibEvent::Event < Crystal::EventLoop::Event
   VERSION = String.new(LibEvent2.event_get_version)

   def self.callback(&block : Int32, LibEvent2::EventFlags, Void* ->)
@@ -53,7 +53,7 @@ struct Crystal::LibEventEvent < Crystal::Event

     def new_event(s : Int32, flags : LibEvent2::EventFlags, data, &callback : LibEvent2::Callback)
       event = LibEvent2.event_new(@base, s, flags, callback, data.as(Void*))
-      LibEventEvent.new(event)
+      Crystal::LibEvent::Event.new(event)
     end

     def run_loop : Nil
8 changes: 4 additions & 4 deletions src/crystal/system/unix/event_loop.cr
@@ -7,18 +7,18 @@
   require "./event_loop_libevent"
 {% end %}

-class Crystal::EventLoop
+abstract class Crystal::EventLoop
   def self.create : Crystal::EventLoop
     {% if flag?(:linux) && !flag?(:interpreted) && flag?(:force_iouring) %}
-      return IoUringEventLoop.new
+      return Crystal::IoUring::EventLoop.new
     {% else %}
       {% if flag?(:linux) && !flag?(:interpreted) && flag?(:preview_iouring) %}
         if ENV["CRYSTAL_EVENTLOOP"]? == "io_uring" || Crystal::System::IoUring.available?
-          return IoUringEventLoop.new
+          return Crystal::IoUring::EventLoop.new
         end
       {% end %}

-      return LibEventEventLoop.new
+      return Crystal::LibEvent::EventLoop.new
     {% end %}
   end
 end
35 changes: 14 additions & 21 deletions src/crystal/system/unix/event_loop_io_uring.cr
@@ -1,9 +1,8 @@
{% skip_file unless flag?(:linux) %}

require "./event_io_uring"
require "weak_ref"

# :nodoc:
class Crystal::IoUringEventLoop < Crystal::EventLoop
class Crystal::IoUring::EventLoop < Crystal::EventLoop
private getter(io_uring) { Crystal::System::IoUring.new(128) }

{% unless flag?(:preview_mt) %}
@@ -19,15 +18,15 @@ class Crystal::IoUringEventLoop < Crystal::EventLoop
   end

   # Create a new resume event for a fiber.
-  def create_resume_event(fiber : Fiber) : Crystal::Event
-    Crystal::IoUringEvent.new(io_uring, :resume, 0) do |res|
+  def create_resume_event(fiber : Fiber) : Crystal::EventLoop::Event
+    Crystal::IoUring::Event.new(io_uring, :resume) do |res|
       Crystal::Scheduler.enqueue fiber
     end
   end

   # Creates a timeout_event.
-  def create_timeout_event(fiber) : Crystal::Event
-    Crystal::IoUringEvent.new(io_uring, :timeout, 0) do |res|
+  def create_timeout_event(fiber) : Crystal::EventLoop::Event
+    Crystal::IoUring::Event.new(io_uring, :timeout) do |res|
       next if res == -Errno::ECANCELED.value
       if select_action = fiber.timeout_select_action
         fiber.timeout_select_action = nil
@@ -39,24 +38,18 @@ class Crystal::IoUringEventLoop < Crystal::EventLoop
   end

   # Creates a write event for a file descriptor.
-  def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
-    Crystal::IoUringEvent.new(io_uring, :writable_fd, io.fd) do |res|
-      if res == -Errno::ECANCELED.value
-        io.resume_write(timed_out: true)
-      else
-        io.resume_write
-      end
+  def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::EventLoop::Event
+    io_ref = WeakRef.new(io)
+    Crystal::IoUring::Event.new(io_uring, :writable_fd, io.fd) do |res|
+      io_ref.value.try &.resume_write(timed_out: res == -Errno::ECANCELED.value)
     end
   end

   # Creates a read event for a file descriptor.
-  def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
-    Crystal::IoUringEvent.new(io_uring, :readable_fd, io.fd) do |res|
-      if res == -Errno::ECANCELED.value
-        io.resume_read(timed_out: true)
-      else
-        io.resume_read
-      end
+  def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::EventLoop::Event
+    io_ref = WeakRef.new(io)
+    Crystal::IoUring::Event.new(io_uring, :readable_fd, io.fd) do |res|
+      io_ref.value.try &.resume_read(timed_out: res == -Errno::ECANCELED.value)
     end
   end
 end
14 changes: 6 additions & 8 deletions src/crystal/system/unix/event_loop_libevent.cr
@@ -1,10 +1,8 @@
{% skip_file if flag?(:linux) && !flag?(:interpreted) && flag?(:force_iouring) %}

require "./event_libevent"

# :nodoc:
class Crystal::LibEventEventLoop < Crystal::EventLoop
private getter(event_base) { Crystal::LibEventEvent::Base.new }
class Crystal::LibEvent::EventLoop < Crystal::EventLoop
private getter(event_base) { Crystal::LibEvent::Event::Base.new }

{% unless flag?(:preview_mt) %}
# Reinitializes the event loop after a fork.
@@ -19,14 +17,14 @@ class Crystal::LibEventEventLoop < Crystal::EventLoop
   end

   # Create a new resume event for a fiber.
-  def create_resume_event(fiber : Fiber) : Crystal::Event
+  def create_resume_event(fiber : Fiber) : Crystal::EventLoop::Event
     event_base.new_event(-1, LibEvent2::EventFlags::None, fiber) do |s, flags, data|
       Crystal::Scheduler.enqueue data.as(Fiber)
     end
   end

   # Creates a timeout_event.
-  def create_timeout_event(fiber) : Crystal::Event
+  def create_timeout_event(fiber) : Crystal::EventLoop::Event
     event_base.new_event(-1, LibEvent2::EventFlags::None, fiber) do |s, flags, data|
       f = data.as(Fiber)
       if (select_action = f.timeout_select_action)
@@ -39,7 +37,7 @@ class Crystal::LibEventEventLoop < Crystal::EventLoop
   end

   # Creates a write event for a file descriptor.
-  def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
+  def create_fd_write_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::EventLoop::Event
     flags = LibEvent2::EventFlags::Write
     flags |= LibEvent2::EventFlags::Persist | LibEvent2::EventFlags::ET if edge_triggered

@@ -54,7 +52,7 @@ class Crystal::LibEventEventLoop < Crystal::EventLoop
   end

   # Creates a read event for a file descriptor.
-  def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::Event
+  def create_fd_read_event(io : IO::Evented, edge_triggered : Bool = false) : Crystal::EventLoop::Event
     flags = LibEvent2::EventFlags::Read
     flags |= LibEvent2::EventFlags::Persist | LibEvent2::EventFlags::ET if edge_triggered