
Speed up close_fds with the new close_range() Linux/FreeBSD syscall #189

Open
nh2 opened this issue Aug 12, 2020 · 2 comments · May be fixed by #255

nh2 (Member) commented Aug 12, 2020

Background:

As written in

close_fds :: Bool, -- ^ Close all file descriptors except stdin, stdout and stderr in the new process (on Windows, only works if std_in, std_out, and std_err are all Inherit). This implementation will call close an every fd from 3 to the maximum of open files, which can be slow for high maximum of open files.

This implementation calls close() on every fd from 3 up to the maximum number of open files, which can be slow when that maximum is high.

The new close_range() syscall solves this, closing them all in one go. According to the LWN link, it is very fast, and you can pass it MAXINT as the upper bound.

The code that needs to be augmented (with CPP):

if (close_fds) {
    int i;
    if (max_fd == 0) {
#if HAVE_SYSCONF
        max_fd = sysconf(_SC_OPEN_MAX);
        if (max_fd == -1) {
            max_fd = 256;
        }
#else
        max_fd = 256;
#endif
    }
    // XXX Not the pipe
    for (i = 3; i < max_fd; i++) {
        if (i != forkCommunicationFds[1]) {
            close(i);
        }
    }
}
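The augmented code could look something like the following sketch (this is illustrative, not the actual patch in #255): try close_range() via the raw syscall number when the headers provide it, and fall back to the existing close() loop otherwise. The helper name close_all_fds_from and its parameters are made up for this example; a real patch could pass UINT_MAX as the upper bound instead of max_fd, and a single-threaded caller could add the CLOSE_RANGE_UNSHARE flag as suggested in the next comment.

```c
#define _GNU_SOURCE
#include <unistd.h>
#ifdef __linux__
#include <sys/syscall.h>
#endif

/* Close every fd in [lowfd, max_fd) except keep_fd (the fork
 * communication pipe). Illustrative sketch, not the library's API. */
static void close_all_fds_from(long lowfd, long keep_fd, long max_fd)
{
#if defined(__linux__) && defined(SYS_close_range)
    /* Two ranged calls replace the per-fd loop: [lowfd, keep_fd - 1]
     * and [keep_fd + 1, max_fd - 1]. A real patch could use UINT_MAX
     * as the last argument of the second call. */
    int ok = 1;
    if (keep_fd > lowfd)
        ok = syscall(SYS_close_range, (unsigned int)lowfd,
                     (unsigned int)(keep_fd - 1), 0) == 0;
    if (ok && keep_fd + 1 < max_fd)
        ok = syscall(SYS_close_range, (unsigned int)(keep_fd + 1),
                     (unsigned int)(max_fd - 1), 0) == 0;
    if (ok)
        return;  /* done, no per-fd loop needed */
    /* e.g. ENOSYS on kernels older than 5.9: fall through to the loop */
#endif
    for (long i = lowfd; i < max_fd; i++)
        if (i != keep_fd)
            close((int)i);
}
```

On kernels without the syscall the first ranged call fails with ENOSYS and the old loop runs, so behavior is unchanged there.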

brauner commented Aug 14, 2020

Since you're closing all fds, you could call it with the CLOSE_RANGE_UNSHARE flag, i.e.

close_range(4, UINT_MAX, CLOSE_RANGE_UNSHARE)

The kernel will detect that you're closing all file descriptors, make a copy of only the first few file descriptors, and won't need to do any actual work closing all the others. Obviously, if you do this in a threaded environment, then you can't use it if you want to close the fds for all threads. :)

thomasjm (Contributor) commented

I've just come upon a really pathological behavior in this code, which happens when the call to sysconf(_SC_OPEN_MAX) returns a huge number.

On my Ubuntu machine, getconf OPEN_MAX returns 1048576. Fine: my machine can make a million superfluous close() calls without a noticeable delay.

But then I found a system (a Kind Kubernetes environment on NixOS) where that value is 1073741816! Now a call to createProcess takes 3.5 minutes and pegs every CPU on my machine the entire time while the loop counts to a billion. (Interestingly, it pegs all CPUs on GHC 9.0 and only a single CPU on GHC 9.2.)

So I'd request two things:

  1. Please let's use close_range on supported systems (apparently it became available in Linux 5.9).
  2. While researching this I learned the "normal" way to close file descriptors is to read /proc/self/fd to find the descriptors that are actually open, and only fall back to the sysconf call and loop if that fails. Taking this step when close_range is not available would be much better.
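The /proc-based fallback in point 2 could be sketched like this (the function name and signature are illustrative, not the process library's API): enumerate the process's own fd directory and close only descriptors that actually exist, returning -1 so the caller can fall back to the sysconf loop when /proc is unavailable.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Close every open fd >= lowfd except keep_fd, by listing
 * /proc/self/fd instead of looping up to sysconf(_SC_OPEN_MAX).
 * Returns 0 on success, -1 if /proc/self/fd can't be opened
 * (e.g. /proc not mounted), in which case the caller should
 * fall back to the existing loop. Illustrative sketch only. */
static int close_fds_via_proc(long lowfd, long keep_fd)
{
    DIR *d = opendir("/proc/self/fd");
    if (d == NULL)
        return -1;
    int dir_fd = dirfd(d);  /* don't close the directory stream's own fd */
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        char *end;
        long fd = strtol(e->d_name, &end, 10);
        if (*end != '\0')  /* skips "." and ".." */
            continue;
        if (fd >= lowfd && fd != keep_fd && fd != dir_fd)
            close((int)fd);
    }
    closedir(d);
    return 0;
}
```

This does work proportional to the number of open fds rather than to OPEN_MAX, which is what makes it tolerable on systems reporting a limit of a billion.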
