This issue was moved to a discussion.

Cannot run under NixOS #1482

Closed
jonahbron opened this issue Dec 8, 2023 · 14 comments

Comments

@jonahbron

Running workerd-linux-64 on Linux should be possible with just glibc installed, according to the README. However, when I try to execute it, I get this error:

node_modules/@cloudflare/workerd-linux-64/bin$ ls -la
total 86052
-rwxr-xr-x 2 jonah users 88115856 Dec  8 05:51 workerd
node_modules/@cloudflare/workerd-linux-64/bin$ ./workerd
bash: line 1: ./workerd: cannot execute: required file not found

This is running in a Nix shell with nixpkgs.glibc present (and verified by running the binaries it provides)

@jonahbron
Author

This has also been reported by someone else in the workers-sdk repo:

cloudflare/workers-sdk#4483

@mikea
Collaborator

mikea commented Dec 8, 2023

What does ldd workerd say for you?

@jonahbron
Author

$ ldd node_modules/.bin/workerd
        linux-vdso.so.1 (0x00007ffcbb7c1000)
        libdl.so.2 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libdl.so.2 (0x00007f324fb34000)
        libpthread.so.0 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libpthread.so.0 (0x00007f324fb2f000)
        libm.so.6 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libm.so.6 (0x00007f324fa4f000)
        libc.so.6 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libc.so.6 (0x00007f324f867000)
        /lib64/ld-linux-x86-64.so.2 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib64/ld-linux-x86-64.so.2 (0x00007f3254111000)
        librt.so.1 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/librt.so.1 (0x00007f324f862000)

@kentonv
Member

kentonv commented Dec 13, 2023

Hmm, it sure looks like the libraries are there. I am not sure what would cause bash to produce the error "cannot execute: required file not found". But when I Googled this error a bit, I came across this Stack Exchange question, which implies that non-Nix binaries cannot be executed directly on NixOS. Could this be the problem?

https://unix.stackexchange.com/questions/522822/different-methods-to-run-a-non-nixos-executable-on-nixos

@jonahbron
Author

In that answer, under solution one (manual), part of what's happening is that a not-found library is being resolved, while in our case all of the libraries are present and have known paths.

@jonahbron
Author

Following along further with that SE answer, I'm trying to run it with steam-run but encountering another issue first lol

nix-community/NixOS-WSL#328

@ckiee

ckiee commented Dec 16, 2023

The problem can be located with the file command:

$ file ………/bin/workerd
node_modules/@cloudflare/workerd-linux-64/bin/workerd: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[xxHash]=bb2f4709f5fab1f7, not stripped

/lib64/ld-linux-x86-64.so.2 is not valid on NixOS. Workaround: wrap workerd with a shell script:

#!/bin/sh
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt # https://github.com/cloudflare/workers-sdk/issues/3264
exec steam-run /path/to/workerd.orig "$@"

Cloudflare should compile workerd as a static binary, as it has no interesting dependencies anyway.
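
The "required file not found" wording is easier to place once you know that execve(2) reports ENOENT when the interpreter named inside a file is missing, even though the file itself exists. Below is a minimal sketch that reproduces the same class of failure with a script whose shebang points at a made-up path. The path /nonexistent/interpreter and the function name are purely illustrative; an ELF binary whose loader path is missing fails in the same way.

```shell
#!/bin/sh
# Build an executable file whose interpreter does not exist, then try to run it.
# /nonexistent/interpreter is a hypothetical path chosen for this demo.
demo_missing_interpreter() {
  demo_dir=$(mktemp -d)
  printf '#!/nonexistent/interpreter\necho "never reached"\n' > "$demo_dir/prog"
  chmod +x "$demo_dir/prog"

  # The file itself is present and executable...
  if [ -x "$demo_dir/prog" ]; then file_state="present"; else file_state="absent"; fi

  # ...yet running it fails, because the kernel cannot open the interpreter.
  if "$demo_dir/prog" 2>/dev/null; then run_state="ran"; else run_state="failed"; fi

  echo "file: $file_state, exec: $run_state"
  rm -rf "$demo_dir"
}

demo_missing_interpreter
```

Dropping the 2>/dev/null shows the shell's own wording for the failure, which varies between shells and versions.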

@PsychoLlama

@ckiee your comment led me to figure out what happened. Thank you!

> /lib64/ld-linux-x86-64.so.2 is not valid on NixOS

Checking the interpreter hard-coded into hello gives me almost exactly the same interpreter path that ldd ./workerd resolves:

$ file $(which hello)
# (output truncated)
# /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/ld-linux-x86-64.so.2

vs

$ ldd ./workerd
# (output truncated)
# /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib64/ld-linux-x86-64.so.2

The only difference between those two interpreter paths is /lib64 vs /lib, but /lib64 is just a symlink back to /lib, so both resolve to the same file. I wasn't sure how execve felt about symlinks in the interpreter path, so I tried patching the ELF:

# swaps `/lib64` to the resolved `/lib`
patchelf --set-interpreter /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/ld-linux-x86-64.so.2 ./node_modules/@cloudflare/workerd-linux-64/bin/workerd

The binary becomes executable 🎉

I'm able to run wrangler dev now and serve requests.
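
As a sketch of how the interpreter path could be pulled out of ldd output instead of being copied by hand: the helper below (resolved_interp is a made-up name) prints the right-hand side of the line whose left-hand side is an absolute ld-linux path. It assumes the "requested => resolved" shape shown in this thread; on distributions where the loader really lives at /lib64, ldd prints that line without "=>" and the helper prints nothing.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the resolved ELF interpreter path.
# Matches lines such as:
#   /lib64/ld-linux-x86-64.so.2 => /nix/store/...-glibc-2.38-27/lib64/ld-linux-x86-64.so.2 (0x...)
resolved_interp() {
  awk '$1 ~ /^\/.*ld-linux.*\.so/ && $2 == "=>" { print $3; exit }'
}

# Sample input taken from the ldd output earlier in this thread.
printf '%s\n' \
  '        libc.so.6 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libc.so.6 (0x00007f324f867000)' \
  '        /lib64/ld-linux-x86-64.so.2 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib64/ld-linux-x86-64.so.2 (0x00007f3254111000)' \
  | resolved_interp

# Usage sketch (commented out because it rewrites the binary in place):
# interp=$(ldd ./workerd | resolved_interp)
# [ -n "$interp" ] && patchelf --set-interpreter "$interp" ./workerd
```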

@kentonv
Member

kentonv commented Dec 18, 2023

> Cloudflare should compile workerd as a static binary, as it has no interesting dependencies anyway.

Unfortunately, linking statically against glibc does not work, because glibc uses dlopen() to dynamically load some of its own functionality, e.g. for DNS resolution. The dlopen()ed components must be from exactly the same version of glibc. But if you statically link glibc and then try to run the binary on some other distribution, the dynamically loaded components will almost certainly not match. This is a well-known problem with static linking on Linux. It's frustrating, but it's not something we can do much about.

Therefore, we statically link everything except glibc. But it seems that NixOS isn't happy with such binaries, unless you use an extra wrapper.

If there's something we could do that would make things easier on Nix without harming usability on other distros, we're happy to accept PRs. Unfortunately our team doesn't have the bandwidth to work on Nix support directly.

@jonahbron
Author

@PsychoLlama Would you mind clarifying the solution a bit? Is patchelf a command that you run once to fix the linking, then workerd starts working properly? Where do I get the interpreter path for my machine, does that come from the file command or somewhere else?

@PsychoLlama

TL;DR: I only need to patch it once. I got the expected path from ldd ./workerd.

Longer answer

Sure, I'll tell you what I can, but I think I'm still missing parts of the picture. If my first post was correct then patching the interpreter to use /lib/ld-linux-x86-64.so.2 should have fixed the problem, but it did not. I have to provide the full nix store path or it does not work.

I also tried overriding the glibc package to copy the /lib64 files instead of symlinking them, as my original assumption was that symlinks were the problem. That would've been easily committed to a flake file. Nope, no luck.

I no longer believe symlinking is the problem. I don't know why it fails. I'm still a noob when it comes to native stuff like this, so I might be missing something obvious.

EDIT: While writing this response, it occurred to me that /lib64 is not part of the search path. It's a hard-coded path. Running ldd in an Ubuntu container shows that the path actually exists on disk there. Maybe libraries can be looked up through a search path, but the interpreter must be a fixed path? I checked every command on my OS and they all provide a fixed path for the ELF interpreter... I dunno. I'm out of my depth 🤷
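
That guess lines up with how ELF loading is documented: the interpreter is recorded as a single path in the PT_INTERP program header, and the kernel opens exactly that path rather than consulting a library search path. Here is a rough sketch for checking it, assuming binutils' readelf is available; requested_interp and interp_status are made-up helper names.

```shell
#!/bin/sh
# Extract the interpreter path from `readelf -l` output on stdin.
# readelf prints it as: [Requesting program interpreter: /path/to/ld.so]
requested_interp() {
  sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p'
}

# Report whether that exact path exists on this machine.
interp_status() {
  if [ -e "$1" ]; then
    echo "present: $1"
  else
    echo "missing: $1"
  fi
}

# Usage sketch, assuming readelf is on PATH:
# interp_status "$(readelf -l ./workerd | requested_interp)"
```

If this prints "missing" for /lib64/ld-linux-x86-64.so.2, that would be consistent with the "required file not found" error above.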

> Is patchelf a command that you run once to fix the linking [...]?

Yes, I only need to run it once after the first npm install, or any time the executable changes (e.g. updating the wrangler package).

> Where do I get the interpreter path for my machine [...]?

I pulled it from ldd ./workerd. It prints the linked interpreter, but it doesn't actually seem to use it.

# ...
libc.so.6 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib/libc.so.6
/lib64/ld-linux-x86-64.so.2 => /nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib64/ld-linux-x86-64.so.2
librt.so.1 => /nix/store/qn3ggz5sf3hkjs2c797x.........

I just took the path that it printed and used that in the patchelf command. Then, because I'm afraid I'd forget what happened the next time around, I plugged it into a shell hook in my flake file:

{
  description = "Development environment";

  outputs = { self, nixpkgs }:
    let inherit (nixpkgs) lib;

    in {
      devShell = lib.genAttrs lib.systems.flakeExposed (system:
        let pkgs = import nixpkgs { inherit system; };
        in pkgs.mkShell {
          nativeBuildInputs = [ pkgs.nodejs ];

          # See: https://github.com/cloudflare/workerd/issues/1482
          shellHook = ''
            __patchTarget="./node_modules/@cloudflare/workerd-linux-64/bin/workerd"
            if [[ -f "$__patchTarget" ]]; then
              ${pkgs.patchelf}/bin/patchelf --set-interpreter ${pkgs.glibc}/lib/ld-linux-x86-64.so.2 "$__patchTarget"
            fi
          '';
        });
    };
}

Now every time I enter my dev shell it will auto-correct the workerd interpreter path. You may need to change __patchTarget if your package manager installs it somewhere else.
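
One possible refinement, offered as a sketch rather than as part of the flake above: skip the rewrite when the interpreter is already correct, so entering the shell does not touch the binary every time. ensure_interp is a made-up helper name; it relies only on patchelf's --print-interpreter and --set-interpreter options and assumes patchelf is on PATH.

```shell
#!/bin/sh
# Set the ELF interpreter of $1 to $2, but only when it differs.
ensure_interp() {
  target="$1"
  wanted="$2"
  [ -f "$target" ] || return 0   # nothing installed yet, nothing to patch
  current=$(patchelf --print-interpreter "$target" 2>/dev/null) || current=""
  if [ "$current" != "$wanted" ]; then
    patchelf --set-interpreter "$wanted" "$target"
  fi
}

# Usage sketch inside shellHook (the interpolation is expanded by Nix, not by the shell):
# ensure_interp "./node_modules/@cloudflare/workerd-linux-64/bin/workerd" \
#   "${pkgs.glibc}/lib/ld-linux-x86-64.so.2"
```

The function body avoids the ${...} form on purpose, so it can be pasted into a Nix ''...'' string without extra escaping.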

@PsychoLlama

@jonahbron Ha, while poking around, I discovered a simpler alternative: remove wrangler from your package.json and install it through pkgs.wrangler instead.

I think through troubleshooting this issue, I rediscovered why the fixup phase of stdenv exists 😅

@jonahbron
Author

jonahbron commented Dec 29, 2023

Oh my god, you're a lifesaver @PsychoLlama! Installing with pkgs.nodePackages.wrangler worked great. Thank you, thank you. The only hitch I ran into was that I had the flake in a parent directory of the package.json, and it didn't like not having a package.json around.

error: builder for '/nix/store/lvb0wayv3z0qly8jkql3ya0rb0nrffym-wrangler-3.16.0.drv' failed with exit code 1;
       last 10 log lines:
       > npm ERR!               ^
       > npm ERR!
       > npm ERR! Error: Failed to install package "@cloudflare/workerd-linux-64"
       > npm ERR!     at checkAndPreparePackage (/nix/store/rpywngw6pvm8nd4iny8iidsh2nmdqrwp-wrangler-3.16.0/lib/node_modules/wrangler/node_modules/workerd/install.js:242:15)
       > npm ERR!     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
       > npm ERR!
       > npm ERR! Node.js v18.18.2
       >
       > npm ERR! A complete log of this run can be found in: /build/.npm/_logs/2023-12-29T07_01_26_669Z-debug-0.log

Moving the flake down into the child directory fixed that issue. The only drawback is that I have to invoke wrangler directly (I can't have it in a package.json script), because scripts search node_modules/.bin first.
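
The lookup order behind that drawback can be reproduced in isolation. npm puts node_modules/.bin at the front of PATH when it runs a package.json script, so a copy there shadows any other wrangler on PATH. The sketch below uses two fake wrangler scripts in a throwaway directory; the directory names and which_wrangler_wins are stand-ins, not real install locations.

```shell
#!/bin/sh
# Two fake "wrangler" commands: one standing in for node_modules/.bin,
# one standing in for the Nix-provided copy.
which_wrangler_wins() {
  root=$(mktemp -d)
  mkdir -p "$root/node_modules/.bin" "$root/nix-profile/bin"
  printf '#!/bin/sh\necho "npm copy"\n' > "$root/node_modules/.bin/wrangler"
  printf '#!/bin/sh\necho "nix copy"\n' > "$root/nix-profile/bin/wrangler"
  chmod +x "$root/node_modules/.bin/wrangler" "$root/nix-profile/bin/wrangler"

  # With node_modules/.bin first on PATH, the npm copy is the one found.
  ( PATH="$root/node_modules/.bin:$root/nix-profile/bin:$PATH"; export PATH; wrangler )

  # Calling the wanted copy by its full path sidesteps the lookup.
  "$root/nix-profile/bin/wrangler"

  rm -rf "$root"
}

which_wrangler_wins
```

If a stale node_modules/.bin/wrangler is what gets in the way, removing it or calling the Nix copy by full path inside the script may be enough. That is a guess about this setup, not something confirmed in the thread.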

@jasnell
Member

jasnell commented Jan 2, 2024

I'm going to move this issue to a discussion as it does not appear as if this is actually an issue to resolve in workerd itself.

@cloudflare cloudflare locked and limited conversation to collaborators Jan 2, 2024
@jasnell jasnell converted this issue into discussion #1515 Jan 2, 2024
