Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

capi: switch to fd-based API #34

Merged
merged 9 commits into from
Jul 18, 2024
Merged

capi: switch to fd-based API #34

merged 9 commits into from
Jul 18, 2024

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Jul 18, 2024

Using leaked Boxs with quite complicated semantics for handling
errors and freeing doesn't make much sense when we are mostly operating
on file descriptors. Switching to a kernel-like API where fds are
returned as numerical values and errors are negative values is a nicer
interface for C programs (and file descriptors are easier to write
bindings for -- especially for Go since *os.File is quite magical).

However, a key question is how to handle errors. Having per-thread
errors (a-la errno) makes it far annoying to write bindings for
green-thread-based languages like Go. So, return a random negative
number that is used as a reference in a global mapping of error numbers
to underlying errors. The error is removed from the map when the C user
gets the error information. This could cause issues if a C user never
checked their error messages, but that can easily be avoided by not
doing that.

There are some possible lock contention issues and ideally we would use
a hashmap with internal key-based locking (which would eliminate lock
contention entirely) but since this is only for the error path we can
look at optimising this later.

In short, the locking has gone from

Arc / RwLock / GC
-- FFI Boundary --
RwLock
  inner: Option<T>
Mutex
  last_error: HashMap<ThreadId, Option<Error>>

to

fd: no locking in libpathrs
err: global Mutex<HashMap<c_int, Error>>

We also have to drop the configuration API because we no longer have
opaque objects that can can have config settings attached to them. We
will need to re-add this in the future (most likely in the form of a
pathrs_..._c family of functions that take a config struct as an
argument) but for now this should be okay.

Implements #33
Suggested-by: Christian Brauner [email protected]
Signed-off-by: Aleksa Sarai [email protected]

@cyphar cyphar force-pushed the rework-capi-fd branch 2 times, most recently from d7c3bd4 to 2d09a91 Compare July 18, 2024 06:29
@cyphar cyphar mentioned this pull request Jul 18, 2024
2 tasks
There's no benefit to having this special-casing for regular files.
mknodat does what we want and we don't have to deal with allocating an
fd on top.

Signed-off-by: Aleksa Sarai <[email protected]>
Using leaked Box<T>s with quite complicated semantics for handling
errors and freeing doesn't make much sense when we are mostly operating
on file descriptors. Switching to a kernel-like API where fds are
returned as numerical values and errors are negative values is a nicer
interface for C programs (and file descriptors are easier to write
bindings for -- especially for Go since *os.File is quite magical).

However, a key question is how to handle errors. Having per-thread
errors (a-la errno) makes it far annoying to write bindings for
green-thread-based languages like Go. So, return a random negative
number that is used as a reference in a global mapping of error numbers
to underlying errors. The error is removed from the map when the C user
gets the error information. This could cause issues if a C user never
checked their error messages, but that can easily be avoided by not
doing that.

There are some possible lock contention issues and ideally we would use
a hashmap with internal key-based locking (which would eliminate lock
contention entirely) but since this is only for the error path we can
look at optimising this later.

In short, the locking has gone from

  Arc / RwLock / GC [1]
  -- FFI Boundary --
    RwLock
      inner: Option<T>
    Mutex
      last_error: HashMap<ThreadId, Option<Error>>

to

  fd: no locking in libpathrs
  err: global Mutex<HashMap<c_int, Error>>

We also have to drop the configuration API because we no longer have
opaque objects that can can have config settings attached to them. We
will need to re-add this in the future (most likely in the form of a
pathrs_..._c family of functions that take a config struct as an
argument) but for now this should be okay.

Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
Some users might not want O_EXCL, and since we aren't going to return an
O_PATH descriptor we might as well let them configure everything.

Signed-off-by: Aleksa Sarai <[email protected]>
The new API is so different that it requires a complete rewrite and
removes almost all of the cruft that was needed to avoid double-closes
and thread-migration with Go.

As this is a complete rewrite, the copyright line of the original author
of the bindings is removed (arguably the bindings were completely
changed with the last rewrite but in this case they were actually
completely rewritten).

Also, change the import path to the slightly nicer

  import "github.com/openSUSE/libpathrs/go-pathrs"

We don't have proper pkg-config definitions nor install scripts yet, so
the Go binding still needs to do some manual fiddling with CFLAGS and
LDFLAGS for now.

Signed-off-by: Aleksa Sarai <[email protected]>
The relative-path import tricks doesn't work now that go-pathrs is a
module, not to mention that the example is actually copy-pastable if we
move the relative-path setting into a replace directive in go.mod.

Signed-off-by: Aleksa Sarai <[email protected]>
We can't use fdopen() for the new file descriptor API for two reasons:

 1. fdopen() doesn't work for directory file descriptors.
 2. io.IOBase file descriptors are always closed when garbage collected,
    but some users may want to leak the file in order to just use the
    file descriptor directly.

So we need to have our own WrappedFd that does GC-based closing by
default but lets you leak the file descriptor. When constructing a
WrappedFd we clone the underlying file descriptor if the source was an
io.IOBase to ensure the GC won't close the file from underneath us.

The constructors for Root and Handle are now a little more python-like
(you can pass any file-like thing and they will handle it properly, and
you can still pass a string path to Root to open a new root).

Signed-off-by: Aleksa Sarai <[email protected]>
The Python examples don't need any adjustment for the new API, but it
would be nice to not get exceptions when dealing with directories. This
also shows the need for Handle.reopen_raw() in the Python API.

Signed-off-by: Aleksa Sarai <[email protected]>
Backtraces are nice for debugging, but including them in the C API
overcomplicates the API for very little benefit.

The original intention was to be able to splice our Rust backtraces into
backtraces for binding languages (like Python and Go). However, it turns
out this is not really possible for those langauges and it seems
unlikely that anybody will actually make use of the backtrace
information we provide.

This also removes a lot of the annoying code we had for converting the
C representation of backtraces into something the binding language cares
about.

In the worst case, users that need to access backtraces for debugging
can always write a Rust version of the reproducer. libpathrs is a small
enough library that this should be possible for almost all reproducers
(not that backtraces would be necessary in most cases anyway).

Signed-off-by: Aleksa Sarai <[email protected]>
These methods were only ever used internally, but just to be safe make
sure they're properly marked as unsafe.

Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar cyphar merged commit e1b3176 into openSUSE:main Jul 18, 2024
5 checks passed
@cyphar cyphar deleted the rework-capi-fd branch July 18, 2024 08:38
@cyphar cyphar restored the rework-capi-fd branch July 18, 2024 08:38
@cyphar cyphar deleted the rework-capi-fd branch July 18, 2024 08:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant