Skip to content

Openness to a low-level getdents64 directory iterator? #451

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
SUPERCILEX opened this issue Nov 16, 2022 · 4 comments · Fixed by #457
Closed

Openness to a low-level getdents64 directory iterator? #451

SUPERCILEX opened this issue Nov 16, 2022 · 4 comments · Fixed by #457

Comments

@SUPERCILEX
Copy link
Contributor

The stdlib and nix use libc's readdir to do directory iteration while rustix uses getdents, all of which are slow. readdir is C, so it includes locking to support multithreading (which I'm pretty sure isn't necessary in rust due to the ownership model). nix is faster than the stdlib because it doesn't allocate anything while iterating whereas the stdlib allocates path names. rustix suffers from the same allocation problem, but is quite a bit faster than nix or the stdlib because it uses getdents.

A zero-compromise getdents implementation is still ~15% faster than rustix:

Benchmark 1: ./dents /tmp/ftzz-test
  Time (mean ± σ):     166.8 ms ±   1.4 ms    [User: 5.2 ms, System: 161.5 ms]
  Range (min … max):   165.4 ms … 169.8 ms    17 runs
 
Benchmark 2: ./nix /tmp/ftzz-test
  Time (mean ± σ):     201.7 ms ±   1.6 ms    [User: 41.4 ms, System: 160.3 ms]
  Range (min … max):   199.7 ms … 205.6 ms    14 runs
 
Benchmark 3: ./stdlib /tmp/ftzz-test
  Time (mean ± σ):     209.4 ms ±   1.2 ms    [User: 45.6 ms, System: 163.7 ms]
  Range (min … max):   207.8 ms … 212.6 ms    14 runs
 
Benchmark 4: ./rustix /tmp/ftzz-test
  Time (mean ± σ):     191.3 ms ±   2.1 ms    [User: 25.0 ms, System: 166.3 ms]
  Range (min … max):   186.6 ms … 194.9 ms    15 runs
 
Summary
  './dents /tmp/ftzz-test' ran
    1.15 ± 0.02 times faster than './rustix /tmp/ftzz-test'
    1.21 ± 0.01 times faster than './nix /tmp/ftzz-test'
    1.25 ± 0.01 times faster than './stdlib /tmp/ftzz-test'

I have an implementation open for nix: nix-rust/nix#1856. The API requires a buffer and makes you handle resizing if you need that. The API is currently safe, but I need to see what happens if you seek a directory and if that's worth an unsafe (rustix's existing Dir should have the same problem of potentially garbage offsets since you can dup fds). I can explain more around the reasoning of the API if there's interest.

Would you guys also be interested? If so, I'll look into opening up a PR in the next few days.

@SUPERCILEX
Copy link
Contributor Author

Oh, I should also mention that while the zero-compromise implementation is ~15% faster overall, it's actually a whopping 5x faster in userspace where we can control the code.

@sunfishcode
Copy link
Member

Yes, we're interested!

One consideration here is that one of the main users of rustix's DirEntry is cap_std::fs::DirEntry, and it wants a DirEntry with no lifetime parameters, because it's following std::fs::DirEntry which has no lifetime parameters. When I looked into this a while ago, I wasn't able to find a reasonable way to implement this higher-level API on top of a lower-level API without using owned name strings. If you can find a way to do it, that'd be great, but otherwise, perhaps rustix could have two directory iteration APIs, one with owned name strings and one without.

The other is that in the linked API, the main example shows a fixed-sized buffer with size 2048. I'd prefer to avoid an API where users have magic buffer sizes in their code. If we end up with two directory iteration APIs, perhaps the lower-level one can drop support for resizing and just take a &[u8; DIR_BUF_LEN] instead of a &[u8], where DIR_BUF_LEN is some constant that the API defines, which is known to be large enough for any filename on the host.

@SUPERCILEX
Copy link
Contributor Author

rustix could have two directory iteration APIs, one with owned name strings and one without.

Yup, that's pretty much the approach I took. In theory, the higher level API should be able to be built on top of one that uses lifetimes by to_owned-ing everything when necessary.

perhaps the lower-level one can drop support for resizing and just take a &[u8; DIR_BUF_LEN]

I think providing a suggested constant for the buffer size is a great idea since we can do some benchmarking and find the optimal one. That said, we'd still want to accept slices so as not to prevent users from doing buffer resizing if they so desire (also slices work great with Vec::spare_capacity_mut or a small-ish stack buffer).

@SUPERCILEX
Copy link
Contributor Author

I'll get to this hopefully in the next few days or over the weekend. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants