Add which function for finding executables in PATH #2440

0xzhzh · 2024-10-25T06:48:56Z

Closes #2109.

I considered calling it executable_exists but that seemed limiting---which also returns the path of the executable, not just whether it exists. Then one can implement executable_exists(name) using which(name) != "".

I wasn't sure whether to add more tests. Most of the interesting test cases are the responsibility of the which crate to get right (e.g., an executable existing in multiple directories in PATH, an executable existing in PATH but not having executable permissions, etc.). At the end of the day this function is just a wrapper around which::which, so there is not much more to test here than what should already be tested by which::which.

Also not sure if documentation was added in the right place.

Closes casey#2109 (but with a function name that is shorter and more familiar)

casey · 2024-10-30T23:43:02Z

Thanks for the PR! I think this is definitely useful.

One thought is that perhaps this function should actually fail, as in, produce an error and terminate execution, if the binary does not exist? It's very common that if a binary isn't available, you can't do anything useful, and so you just want to give up. In that case, you would be required to check that which wasn't producing the empty string before calling any command it returns.

Tests which are just testing the functionality of which aren't necessary. In general, I try to avoid testing dependencies, and just assume they work (assuming that they're relatively popular, seem well maintained, etc).

Some thoughts:

Should we use which or which_global? Is the only difference that which("foo/bar") will consider the current directory, and which_global("foo/bar") will not? Is there only a difference when the path contains / or ./?
Should we use which or which_re? I can actually see which_re being very useful, since you could do things like which("g?make") to find make or gmake, and regular expression special characters are vanishingly uncommon in binary names, so it doesn't seem like it would be annoying if you didn't want regular expressions.
If we make it error if it can't find the executable, should we give it another name? require(BIN)?
Should we provide both versions, like which(BIN) and require(BIN)? They both seem useful, so maybe?

0xzhzh · 2024-10-31T02:11:02Z

One thought is that perhaps this function should actually fail, as in, produce an error and terminate execution, if the binary does not exist?

The use cases I have in mind don't involve throwing an error. For instance, the example from #2109 suggests using nala and falling back to apt; I was thinking of using rg and falling back to grep.

If the user wants to give up, they could write, for instance:

git := if which("git") == "" { error("git is not installed") } else { which("git") }

An alternative is to allow which(cmd) to fail, and then have an alternate version like which(cmd, fallback).

Tests which are just testing the functionality of which aren't necessary. In general, I try to avoid testing dependencies, and just assume they work (assuming that they're relatively popular, seem well maintained, etc).

That makes sense. Should I remove the tests I currently have?

Should we use which or which_global? Is the only difference that which("foo/bar") will consider the current directory, and which_global("foo/bar") will not? Is there only a difference when the path contains / or ./?

I think that's right, though I haven't tested it myself to confirm. I chose to use which because it seems to have strictly more functionality than which_global (i.e., it can resolve relative paths), but I'm not entirely sure if that's necessary.

Should we use which or which_re? I can actually see which_re being very useful, since you could do things like which("g?make") to find make or gmake, and regular expression special characters are vanishingly uncommon in binary names, so it doesn't seem like it would be annoying if you didn't want regular expressions.

Yeah which_re does seem very useful, although one notable exception is g++ and clang++. Perhaps both should be provided.

If we make it error if it can't find the executable, should we give it another name? require(BIN)?

Should we provide both versions, like which(BIN) and require(BIN)? They both seem useful, so maybe?

I like the word choice of require, because it makes it clear what happens if the command is missing. But then again, which makes it clear what the return value is when the command is present; it's unclear what require should return. i.e.,

x := require("ls")

@test:
    echo {{x}}  # what does this print?

I guess it could just return the full path like which, but I'm not sure how much additional value that adds, at the expense of adding to the built-in functions' real estate. Another argument for not adding require now: we can always add it later.

Design-wise, I'm starting to lean toward having which(cmd) and which(cmd, fallback_string) (or maybe even an arbitrary number of fallbacks which(cmd, fallback_cmd1, fallback_cmd2, ..., fallback_string)), because it seems to be a balance of being concise and expressive for a variety of use cases. But ultimately it's up to you, and I'm happy to adjust the PR according to whatever design you think makes the most sense.

casey · 2024-10-31T03:35:09Z

The use cases I have in mind don't involve throwing an error. For instance, the example from #2109 suggests using nala and falling back to apt; I was thinking of using rg and falling back to grep.

Gotcha, that's good to know. In that case I think returning the empty string is ideal. I'm actually thinking about adding Python-style and and or, spelled && and || in just, since the grammar won't permit using an identifier as an operator. LHS || RHS returns LHS if it is non-empty, and RHS otherwise, so with || you could do:

git := which('git') || error(…)

Or, if you have a fallback:

grep := which('rg') || which('grep') || error(…)

If we added require, which I agree we can add later:

grep := which('rg') || require('grep')

(require would behave like error() if a binary wasn't found, so execution would stop with an error message, and it would not return a value)
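The proposed || behavior can be sketched in Rust, the language just is written in. This only illustrates the semantics described above and is not just's implementation; or_nonempty, fake_which, and pick_grep are hypothetical names, and fake_which pretends that only grep is installed.

```rust
/// Returns `lhs` if it is non-empty; otherwise evaluates and returns `rhs`.
/// The right-hand side is lazy, so an error there only fires when needed.
fn or_nonempty<E>(
    lhs: String,
    rhs: impl FnOnce() -> Result<String, E>,
) -> Result<String, E> {
    if lhs.is_empty() {
        rhs()
    } else {
        Ok(lhs)
    }
}

/// Hypothetical stand-in for `which()`: a path if "installed", else "".
fn fake_which(name: &str) -> String {
    match name {
        "grep" => "/usr/bin/grep".to_string(),
        _ => String::new(),
    }
}

/// Mirrors `which('rg') || which('grep') || error(…)`.
fn pick_grep() -> Result<String, String> {
    let first = or_nonempty(fake_which("rg"), || Ok::<_, String>(fake_which("grep")))?;
    or_nonempty(first, || Err("no grep found".to_string()))
}
```

Because the right-hand side is only evaluated when the left-hand side is empty, the error branch is never reached when either lookup succeeds.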

That makes sense. Should I remove the tests I currently have?

I think the tests you have are reasonable, and we should have at least one test, not to test the dependency, but test the function implementation, i.e. that we're calling the right dependency.

I think that's right, though I haven't tested it myself to confirm. I chose to use which because it seems to have strictly more functionality than which_global (i.e., it can resolve relative paths), but I'm not entirely sure if that's necessary.

I think this is good, and we should use which, since users can just not pass relative paths if they don't want to consider the working directory.

Yeah which_re does seem very useful, although one notable exception is g++ and clang++. Perhaps both should be provided.

Ooo, good call. Yah, g++ and clang++ are common enough that we definitely shouldn't make which_re the default. We can add which_re later, if needed. (And the fallback with a hypothetical || operator makes it easy to do which("gmake") || which("make"), which is probably better than which("g?make") since the precedence is not ambiguous.)

I guess it could just return the full path like which, but I'm not sure how much additional value that adds, at the expense of adding to the built-in functions' real estate. Another argument for not adding require now: we can always add it later.

Yah, I agree. And which(…) || error(…) isn't bad, and lets the user specify an error message.

Design-wise, I'm starting to lean toward having which(cmd) and which(cmd, fallback_string) (or maybe even an arbitrary number of fallbacks which(cmd, fallback_cmd1, fallback_cmd2, ..., fallback_string)), because it seems to be a balance of being concise and expressive for a variety of use cases. But ultimately it's up to you, and I'm happy to adjust the PR according to whatever design you think makes the most sense.

I think we should just do which(cmd) for now. Fallbacks could be done with which(…) || which(…), and we could add the multi-argument version later, since it would be backwards compatible.

casey

Random comments.

README.md

src/function.rs

tests/lib.rs

casey · 2024-10-31T21:56:33Z

I took a look at the which crate, and I think we should actually just write our own. The which crate makes a number of choices that I'm not entirely comfortable with, like swallowing I/O errors, appending Windows executable extensions, reading an environment variable called PATHEXT to control whether or not an extension is added, and the like. I would really prefer to surface I/O errors, and not do anything fancy with extensions. Let me know if you are up for this!

0xzhzh · 2024-11-04T04:40:32Z

I took a look at the which crate, and I think we should actually just write our own. ... I would really prefer surface I/O errors, and not do anything fancy with extensions. Let me know if you are up for this!

Yeah, I agree. I'm happy to give it a shot, though I'm not familiar with how executables work on Windows, nor what the conventions are, so I'll start based on what the which crate does and then ask for your advice.

I'm actually thinking about adding Python-style and and or, spelled && and || in just, since the grammar won't permit using an identifier as an operator...

Yeah I saw you had mentioned that in another issue somewhere, and I really like that proposal. Besides, it would be nice to clarify what the semantics of conditional expressions and "Booleans" are---introducing those operators would act as a forcing function for that clarification.

0xzhzh · 2024-11-04T08:58:53Z

I just pushed a draft implementation, but I haven't written any new tests (for which coverage is probably more important now). This implementation doesn't use the which crate anymore, but delegates to is_executable to determine whether a path refers to an executable file. Looking forward to your thoughts and feedback.

Please note that is_executable does still swallow I/O errors. However, I think this actually makes some sense, because I don't expect which to complain if I have PATH=/bin1:/bin2 and /bin1/cmd is an invalid path. In fact, neither sh nor which (on my system) seems to care whether /bin1 is an unreadable path or a broken symlink, which I checked with the script in the folded details block below.

The following script runs /tmp/<path>/bin/cmd as expected, without reporting any errors.

#!/usr/bin/env bash

set -e

tmpd="$(mktemp -d)"
pushd "$tmpd" >/dev/null

# Add an ordinary directory with an ordinary executable to PATH
mkdir bin
printf '#!/bin/sh\necho "this is expected"\n' > bin/cmd
chmod 755 bin/cmd
NEWPATH="$(pwd)/bin"

# Add an unreadable empty directory to PATH
mkdir unreadable-dir
chmod 000 unreadable-dir
NEWPATH="$(pwd)/unreadable-dir:$NEWPATH"

# Add a broken symlink to PATH
ln -s nowhere broken-dir
NEWPATH="$(pwd)/broken-dir:$NEWPATH"

# Add a directory with an unreadable file to PATH
mkdir unreadable
printf '#!/bin/sh\necho "this is unexpected"\n' > unreadable/cmd
chmod 000 unreadable/cmd
NEWPATH="$(pwd)/unreadable:$NEWPATH"

# Add a directory with a broken symlink to PATH
mkdir broken
ln -s nowhere broken/cmd
NEWPATH="$(pwd)/broken:$NEWPATH"

sh="$(which sh)"
which="$(sh -c 'which which')"

printf "Executing \`cmd'...\n\t"
PATH="$NEWPATH" "$sh" -c "cmd"
# executes ./bin/cmd

printf "Running \`which cmd'...\n\t"
PATH="$NEWPATH" "$sh" -c "$which cmd"
# prints the absolute path of ./bin/cmd

popd >/dev/null
rm -rf "$tmpd"

The shadowing use case you discuss in this comment is interesting, but I think it can be pretty hard to distinguish an unmounted directory or unreadable file from an unrelated directory in PATH. E.g., $(HOME)/.cargo/bin is going to be early in my PATH but I won't have ls in there.

That said, I think it was still worthwhile to rewrite a simplified version of the which crate for just, for the following reasons:

My implementation does not swallow errors related to paths with invalid unicode, consistent with the rest of the builtin functions. It will also complain if PATH is not set.
I noticed that the which crate performs tilde_expansion for paths in PATH, which I don't think it should (and besides, it only expands ~ and does not handle ~username). I'll file a separate issue there about this when I get the chance.
When given a relative path, or when PATH contains a relative path, the which crate does not convert those to absolute paths. This is consistent with the behavior of the which command, but I personally think just is more robust resolving relative paths relative to justfile_directory() (and complaining if justfile doesn't have a parent directory).
My implementation trades off a dependency on which for a dependency on is_executable, which is much smaller and was simpler for me to understand (ok I'm not entirely sure about what it's doing for Windows, but that's more so because I just don't know Windows at all).
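A rough sketch of the kind of lookup these points describe, using only the standard library. It is not the code in this PR. It assumes Unix, substitutes a plain permission-bit check for the is_executable crate, and takes the PATH value and the base directory as parameters; the function names are made up for illustration.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Unix-only stand-in for the `is_executable` crate: a regular file with any
/// execute bit set. I/O errors (missing file, unreadable directory, broken
/// symlink) count as "not executable" rather than being reported.
#[cfg(unix)]
fn looks_executable(path: &Path) -> bool {
    use std::os::unix::fs::PermissionsExt;
    path.metadata()
        .map(|m| m.is_file() && (m.permissions().mode() & 0o111) != 0)
        .unwrap_or(false)
}

/// Searches each entry of `path_var` for `name`, resolving relative entries
/// against `base`. Returns the first executable candidate, or `None`.
#[cfg(unix)]
fn find_executable(path_var: &OsStr, base: &Path, name: &str) -> Option<PathBuf> {
    env::split_paths(path_var)
        // `Path::join` returns `dir` unchanged when `dir` is absolute.
        .map(|dir| base.join(dir).join(name))
        .find(|candidate| looks_executable(candidate))
}
```

A caller wanting which()-style behavior would map None to the empty string, while a require()-style caller would turn it into an error.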

Something else I wanted to respond to:

reading an environment variable called PATHEXT to control whether or not an extension is added

I'm only just learning about this myself, but PATHEXT seems to be a standard thing: https://superuser.com/questions/1027078/what-is-the-default-value-of-the-pathext-environment-variable-for-windows. So we should probably respect that, and that is what is_executable does as well.
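For context, PATHEXT holds a semicolon-separated list of extensions, such as .COM;.EXE;.BAT. A minimal sketch of checking a file name against such a list is below. The function name is hypothetical, the list is passed in rather than read from the environment, and this is an assumption about the general approach, not the is_executable crate's actual code.

```rust
use std::path::Path;

/// Returns true if `file`'s extension appears in `pathext`, a `;`-separated
/// list such as ".COM;.EXE;.BAT". The comparison ignores ASCII case.
fn has_pathext_extension(file: &Path, pathext: &str) -> bool {
    let ext = match file.extension().and_then(|e| e.to_str()) {
        Some(ext) => ext,
        None => return false, // no extension, or not valid Unicode
    };
    pathext
        .split(';')
        // Entries are written with a leading dot; skip any that are not.
        .filter_map(|entry| entry.strip_prefix('.'))
        .any(|entry| entry.eq_ignore_ascii_case(ext))
}
```

With the list above, build.bat would match while notes.txt and an extensionless just would not.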

0xzhzh · 2024-12-01T23:05:23Z

@casey I finally got around to updating the tests to try out the internal which implementation.

I've only run this on my local macOS machine; I'm wondering if you could run those tests in CI to see if my implementation works on other OSes (Windows especially)?

casey

Finally took another look at this, sorry for the delay!

I set the tests to run, and it looks like they all pass.

I think delegating to is_executable is fine, and that what you wrote about swallowing I/O errors is reasonable, given that it's what the external which command does.

Good to know that PATHEXT is a standard Windows thing, I wasn't familiar with it at all.

I didn't do a super in-depth review, my only comment is about trying to avoid a dependency on the either crate.

We could also consider adding a variant of which(), in a follow-up PR (doesn't have to be by you!) which fails if the binary isn't found. I think it's pretty common to need a bunch of commands available, and not want to run if they aren't present.

src/function.rs

tests/lib.rs

0xzhzh · 2024-12-13T23:19:04Z

Finally took another look at this, sorry for the delay!

No worries! Thanks for running the CI tests, I'm glad that everything works on Windows too.

Good to know that PATHEXT is a standard windows thing, I wasn't familiar with it at all.

In all honesty I'm not familiar with this either... I will ask some of my friends from the Windows universe whether we are handling this correctly.

We could also consider adding a variant of which(), in a follow-up PR (doesn't have to be by you!) which fails if the binary isn't found. I think it's pretty common to need a bunch of commands available, and not want to run if they aren't present.

Yeah, maybe something like requires()? I think that's a good idea, but it's also something that could be accomplished via #1059 if that ever gets worked on (though I understand that that's a whole bag of wormholes in and of itself). I'm happy to follow up this PR with one that implements requires().

casey

Nice! Left a bunch of comments, check them out.

src/function.rs

tests/lib.rs

tests/which_exec.rs

src/function.rs

- use Path::new(s) instead of PathBuf::from(s)
- revert early handling of empty path
- use PathBuf::join() instead of PathBuf::push()

0xzhzh · 2025-01-17T06:47:16Z

What do you think about changing this PR to add require(…), which prints out an error message on failure, instead of which(…). Almost exactly the same logic, but since require(…) doesn't return anything on failure, there aren't any issues relating to the value of true and false to think about.

Sure, happy to add that. Just to make sure I understand: is your thinking here to make require() stable, and keep the return value of which() unstable until you finalize the design of string arrays as values?

Sorry to keep requesting changes, I really want to get this in, but I'm very paranoid about anything related to backwards compatibility and painting myself into a corner 😅

No worries, I totally understand!

casey · 2025-01-17T14:38:31Z

Sure, happy to add that. Just to make sure I understand: is your thinking here to make require() stable, and keep the return value of which() unstable until you finalize the design of string arrays as values?

I was thinking we wouldn't add which(), and would only add require(), but adding which() as unstable is also an option, it's up to you. If we do, you'll need to add a new UnstableFeature enum variant.

laniakea64 · 2025-01-17T18:28:42Z

adding which() as unstable is also an option,

Please do, and please make it (initially) return the empty string when program is not found.

A real example from one of my actual justfiles showing the sort of construct which() would be useful for:

fdfind := `which fd 2>/dev/null || which fdfind`

could be written in pure just as

fdfind := which('fd') || require('fdfind')

casey · 2025-01-17T19:26:07Z

I'm not at all opposed to adding which() as unstable, but we could consider doing it in a follow-up PR, to get this one in and keep it small. Either way is fine with me.

0xzhzh · 2025-01-17T21:28:23Z

I'm not at all opposed to adding which() as unstable, but we could consider doing it in a follow-up PR, to get this one in and keep it small. Either way is fine with me.

I made which() unstable, by adding to unstable_features during parsing (similar to what you do for the logical operators). Does that work?
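The gating idea can be sketched roughly as follows. This is not just's code: the enum variant, the function names, and the error text are all made up for illustration. The parser records which unstable features a justfile uses, and a later check rejects them unless unstable mode is enabled.

```rust
use std::collections::BTreeSet;

/// Hypothetical feature list; just's real enum may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum UnstableFeature {
    WhichFunction,
}

/// Called by a (hypothetical) parser whenever it sees a function call.
fn record_call(name: &str, used: &mut BTreeSet<UnstableFeature>) {
    if name == "which" {
        used.insert(UnstableFeature::WhichFunction);
    }
}

/// Rejects the first recorded feature unless unstable mode is enabled.
fn check_unstable(used: &BTreeSet<UnstableFeature>, unstable: bool) -> Result<(), String> {
    match used.iter().next() {
        Some(feature) if !unstable => Err(format!("{feature:?} is unstable")),
        _ => Ok(()),
    }
}
```

In this sketch a justfile that only calls require() records nothing and passes the check either way.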

nogweii · 2025-01-20T16:09:15Z

I'm definitely excited to see require() added, as that would also resolve my earlier request.

Would it stop execution, or would it only log an error and keep moving on?

casey · 2025-01-22T01:09:20Z

Nice! This looks good, although you'll need to add the --unstable argument or set unstable for the tests. Can you also add a test that doesn't add --unstable, and checks that we get the correct error message?

casey · 2025-01-22T01:09:50Z

Also it looks like formatting failed, that should be fixable by running cargo fmt.

Blacksmoke16 · 2025-01-22T04:11:48Z

The new require function should also get some docs/tests.

Blacksmoke16 · 2025-01-22T04:33:14Z

Sorry for the double post, but I'm looking forward to this feature and think I found a bug?

I have a justfile like:

KCOV := which('kcov')

@test:
    just _test-{{ if KCOV != '' { 'with-coverage' } else { 'without-coverage' } }}

@_test-with-coverage:
    echo 'with kcov'

@_test-without-coverage:
    echo 'no kcov'

And running this results in:

$ ~/dev/just/target/debug/just --unstable test
error: Call to unknown function `which`
 ——▶ justfile:1:9
  │
1 │ KCOV := which('kcov')
  │         ^^^^^
error: Recipe `test` failed on line 4 with exit code 1

However if I update the test task to be:

@test:
    echo {{ if KCOV != '' { 'with-coverage' } else { 'without-coverage' } }}

I get:

$ ~/dev/just/target/debug/just --unstable test
with-coverage

So seems to be this one specific context that results in the error?

0xzhzh · 2025-01-22T07:21:49Z

@Blacksmoke16 I think the recursive just invocation also needs to be given the --unstable flag. Alternatively, you can set the JUST_UNSTABLE environment variable, which should automatically propagate down (e.g., JUST_UNSTABLE=1 ~/dev/just/target/debug/just --unstable test).

0xzhzh · 2025-01-22T07:40:22Z

@casey Sorry for missing those! I was in a rush when I made the which() function unstable and forgot to test it...

I just pushed some updates to fix the issues. Let me know if there's anything else to address!

0xzhzh · 2025-01-22T07:45:28Z

@nogweii :

Would it stop execution, or would it only log an error and keep moving on?

I ran the following test to see:

$ cat justfile
before_cmd := `touch before_cmd.ran`
failure := `touch before_expr.ran` + require("does-not-exist") + `touch after_expr.ran`
after_cmd := `touch after_cmd.ran`

target:
  touch target.ran

$ rm -f *.ran ; ./just ; ls *.ran
error: Call to function `require` failed: could not find required executable: `does-not-exist`
 ——▶ justfile:2:38
  │
2 │ failure := `touch before_expr.ran` + require("does-not-exist") + `touch after_expr.ran`
  │                                      ^^^^^^^
 after_cmd.ran   before_cmd.ran   before_expr.ran

It seems like all variable-binding expressions evaluate, regardless of failures, whereas expressions stop evaluating as soon as a subexpression fails. I'm not sure if this execution order is specified somewhere, though I haven't really looked very hard. Perhaps @casey knows?

casey · 2025-01-22T18:16:32Z

@0xzhzh The order of evaluation of assignments in a justfile is not linear. just actually evaluates assignments in alphabetical order, since it holds them in a sorted b-tree. However, if any assignments depend on other assignments, the order changes. E.g., if foo depends on bar, bar will be evaluated before foo.

If an assignment fails, then just should stop evaluating assignments. (Try naming them a b and c.)

The order of evaluation of arguments to + and / wasn't defined, but I just merged #2593, which forces them to evaluate from left to right.
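To make that ordering concrete, here is a small sketch. It is not just's evaluator: the deps map (assignment name to the names it depends on) and both function names are assumptions, and cycles are not reported. Assignments live in a BTreeMap, so iteration is alphabetical, and each assignment's dependencies are evaluated before the assignment itself.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Evaluates `name` after its dependencies, recording the order.
fn visit<'a>(
    name: &'a str,
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    done: &mut BTreeSet<&'a str>,
    order: &mut Vec<&'a str>,
) {
    if !done.insert(name) {
        return; // already evaluated (or currently being evaluated)
    }
    for &dep in deps.get(name).into_iter().flatten() {
        visit(dep, deps, done, order);
    }
    order.push(name);
}

/// Returns the order in which the assignments would be evaluated.
fn evaluation_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Vec<&'a str> {
    let mut done = BTreeSet::new();
    let mut order = Vec::new();
    // BTreeMap keys iterate in sorted order, i.e. alphabetically here.
    for &name in deps.keys() {
        visit(name, deps, &mut done, &mut order);
    }
    order
}
```

For assignments apple, bar, baz (depending on foo), and foo (depending on bar), this yields apple, bar, foo, baz: alphabetical, except that foo is pulled ahead of baz.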

casey

LGTM! Tweaked the docs a little, and moved them into their own section.

Add which function for finding executables in PATH

6fb2400

Closes casey#2109 (but with a function name that is shorter and more familiar)

casey mentioned this pull request Oct 30, 2024

[Feature idea] Add support for common operators in conditional expressions (and maybe elsewhere?) #2411

Open

casey requested changes Oct 31, 2024

Change version of which function to master branch

f1980b5

Use internal implementation of which()

0c6f5e8

0xzhzh added 2 commits December 1, 2024 17:47

Add tests for internal implementation of which()

389b2ae

Remove stray addition to justfile

2a535c0

0xzhzh requested a review from casey December 1, 2024 23:00

casey requested changes Dec 12, 2024

Remove dependency on either

34f2ea6

0xzhzh requested a review from casey December 20, 2024 07:33

Merge remote-tracking branch 'origin/master' into add-which

740c0ae

casey requested changes Dec 20, 2024

0xzhzh added 8 commits December 28, 2024 09:14

Handle empty command string up front

c89e182

Resolve relative paths relative to working directory

aad3831

Clean up implementation of which()

5aa3c07

- use Path::new(s) instead of PathBuf::from(s)
- revert early handling of empty path
- use PathBuf::join() instead of PathBuf::push()

Rename which_exec -> which_function

e7e9d56

Use single quotes to avoid r# strings

6a98c39

Remove use of temptree! macro

1f40ec3

Add some tests for relative paths

d642b1d

Merge branch 'master' into add-which

ea958df

0xzhzh requested a review from casey December 30, 2024 13:35

Add require

0beadec

0xzhzh requested a review from casey January 17, 2025 06:51

Make which() function unstable

f5c5809

Merge remote-tracking branch 'origin/master' into add-which

2ac4966

0xzhzh added 3 commits January 21, 2025 23:22

Format

a8e8ea4

Fix tests + add unstable test

87a315b

Add documentation

f4501dd

casey added 5 commits January 22, 2025 10:20

Merge remote-tracking branch 'origin/master' into add-which

e63fb7d

Remove a few unnecessary uses of format!()

2e17660

Expected stdout defaults to empty string

86753cb

Change readme section

42c4e5a

Adapt

6ccf803

casey enabled auto-merge (squash) January 22, 2025 18:43

casey approved these changes Jan 22, 2025

casey added 2 commits January 22, 2025 10:48

Fix Windows test

1f9abb5

Try to fix Windows tests

2f40780
