Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for REPLs #2723

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from
Draft

Conversation

philsc
Copy link
Contributor

@philsc philsc commented Apr 1, 2025

This patch adds three different ways to invoke a REPL for a given
target, each with slightly unique use cases.

Deployed binaries: Sometimes it's really useful to start the REPL
for a binary that's already deployed in a docker container. You can do
this with the RULES_PYTHON_BOOTSTRAP_REPL environment variable. For
example:

$ RULES_PYTHON_BOOTSTRAP_REPL=1 bazel run --//python/config_settings:bootstrap_impl=script //tools:wheelmaker
Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import tools.wheelmaker
>>>

py_{library,binary,test} targets: These targets will now
auto-generate additional <name>.repl targets.

$ bazel run --//python/config_settings:bootstrap_impl=script //python/runfiles:runfiles.repl
Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import python.runfiles
>>>

Arbitrary PyInfo providers: Spawn a REPL for any target that
provides PyInfo like this:

$ bazel run --//python/config_settings:bootstrap_impl=script //python/bin:repl --//python/bin:repl_dep=//tools:wheelmaker
Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import tools.wheelmaker
>>>

This patch adds three different ways to invoke a REPL for a given
target, each with slightly unique use cases.

Deployed binaries: Sometimes it's really useful to start the REPL
for a binary that's already deployed in a docker container. You can do
this with the `RULES_PYTHON_BOOTSTRAP_REPL` environment variable. For
example:

    $ RULES_PYTHON_BOOTSTRAP_REPL=1 bazel run --//python/config_settings:bootstrap_impl=script //tools:wheelmaker
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import tools.wheelmaker
    >>>

`py_{library,binary,test}` targets: These targets will now
auto-generate additional `<name>.repl` targets.

    $ bazel run --//python/config_settings:bootstrap_impl=script //python/runfiles:runfiles.repl
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import python.runfiles
    >>>

Arbitrary `PyInfo` providers: Spawn a REPL for any target that
provides `PyInfo` like this:

    $ bazel run --//python/config_settings:bootstrap_impl=script //python/bin:repl --//python/bin:repl_dep=//tools:wheelmaker
    Python 3.11.1 (main, Jan 16 2023, 22:41:20) [Clang 15.0.7 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> import tools.wheelmaker
    >>>
@arrdem
Copy link
Contributor

arrdem commented Apr 1, 2025

Broadly in favor of this, had a similar macro feature aligned with the verbs pattern at a former employer.

I don't love having magical behavior that runs ipython if it's on the path, IMO an ipython shell or any other notebook-like thing is a different kind of target than a vanilla shell and we shouldn't overload the one with try-import behavior which is generally an antipattern. IMO the kind of shell to run should be explicitly selected either by having separate targets or by having a feature flag we can select() on.

While REPL is the more familiar term to me from the Lisp world, the Python ecosystem appears to prefer the words "shell" or "console". My previous macros emitted .shell and .ishell verbs which I would suggest is slightly better aligned.

@aignas
Copy link
Collaborator

aignas commented Apr 1, 2025

What is the difference between 2 and 3? Could we just get away with 3?

@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

I don't love having magical behavior that runs ipython if it's on the path, IMO an ipython shell or any other notebook-like thing is a different kind of target than a vanilla shell

I generally agree, but everywhere I've implemented this (3 places), one of the first questions is "how do I get ipython?". So I've just given up on that front and made these automatically use that. I can be convinced to drop it, but I suspect without it, folks are just going to implement it themselves which reduces the value of having it in a shared place.

IMO the kind of shell to run should be explicitly selected either by having separate targets or by having a feature flag we can select() on.

Perhaps we could add an additional label_flag to let the user specify a .py file that invokes the actual shell (to replace code.interact()) in addition to the dep that pulls in ipython.

@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

What is the difference between 2 and 3? Could we just get away with 3?

The difference is convenience. I have found that folks picked up the "bazel run this library with .repl at the end" a whole lot easier and they even taught it to other folks.

@philsc philsc requested a review from rickeylev April 4, 2025 05:42
@philsc
Copy link
Contributor Author

philsc commented Apr 4, 2025

While REPL is the more familiar term to me from the Lisp world, the Python ecosystem appears to prefer the words "shell" or "console". My previous macros emitted .shell and .ishell verbs which I would suggest is slightly better aligned.

Hmm. Perhaps I'm old 😛

REPL does appear in the docs: https://docs.python.org/3/tutorial/appendix.html#tut-interac

There are two variants of the interactive REPL. The classic basic interpreter is supported on all platforms with minimal line control capabilities.

And there's even an environment variable for configuration:
https://docs.python.org/3/using/cmdline.html#envvar-PYTHON_BASIC_REPL

I'm happy to change it to ".shell" or similar if you all want, but I don't think REPL is un-Pythonic so to speak.

@aignas
Copy link
Collaborator

aignas commented Apr 8, 2025

One extra question about what is the difference between your //python/bin:repl target and the existing //python/bin:python target? I have a usecase where I need a handle to the python interpreter with all of the deps available and importable. The current //python/bin:python only gives me a clean interpreter but the deps are not be found in my PYTHONPATH.

Should we have the approach where we:

  1. Fix the //python/bin:python to have the deps?
  2. If not, have a //python/bin:repl target to start with that just unblocks us from having a working interpreter.
  3. Add the env var to the bootstrap (which could be super useful but the API might need to be discussed further.
  4. Discuss the macro documentation, etc later.

@philsc
Copy link
Contributor Author

philsc commented Apr 8, 2025

One extra question about what is the difference between your //python/bin:repl target and the existing //python/bin:python target? I have a usecase where I need a handle to the python interpreter with all of the deps available and importable. The current //python/bin:python only gives me a clean interpreter but the deps are not be found in my PYTHONPATH.

Can you elaborate on this a bit? Is the use case to do something like subprocess.run([sys.executable, "path/to/file.py"])?

The goal with this PR here is to create a REPL in an environment identical to what you would see when running your code in a py_binary. I.e. all the things that the first and second stage bootstrap script do. The three that I can think of:

  1. venv creation
  2. safe path handling
  3. setting up runfiles env vars

You don't get that with only the interpreter, even if it could import the deps.

I think if we can figure out how to move all those things into a pre-generated venv, then that could work. But it's not obvious to me that we can do that at this time.

Should we have the approach where we:

1. Fix the `//python/bin:python` to have the deps?

2. If not, have a `//python/bin:repl` target to start with that just unblocks us from having a working interpreter.

3. Add the `env var` to the bootstrap (which could be super useful but the API might need to be discussed further.

4. Discuss the macro documentation, etc later.

Removed the env var in the bootstrap script and the macro stuff for future PRs.

@aignas
Copy link
Collaborator

aignas commented Apr 9, 2025

The concrete usecase is that I am using dagster and dagster allows users to include user code deployments via specifying in a YAML directory a list of venv with a path to the python interpreter.

Right now I am hacking this around by having a py_library with all the deps and then a py_binary that has the main as the following file:

import subprocess
import sys

if __name__ == "__main__":
    subprocess.run([sys.executable] + sys.argv[1:])  # noqa: S603

This basically re-execs the interpreter from the environment with any parameters that may be passed to the target. If I run this target I get the REPL. And I can do import dagster in the said vanila python REPL.

If on the other hand I pass the py_library to the @rules_python//python/bin:python target via a the label flag and try doing import dagster it fails because the deps are not wired in.

My question here is:

  • Why do we have @rules_python//python/bin:python in addition to this new REPL target?
  • If there is a reason to have both, where should the user use one over the other?

@groodt
Copy link
Collaborator

groodt commented Apr 9, 2025

My 2c here. I would prefer to have this in user-space. We have a similar need at work, and we added a small macro to create an ipython py_console_script_binary with the dependencies. (It occurs to me with the new main_module support from #2671 there might also be an additional way to do it... )

This deals with most of the concerns listed above?

  • venv creation ✅ - I presume you really mean the runfiles or PYTHONPATH? But broadly, when the rules support site-packages layout, you can get this for free with a user-space solution
  • safe path handling ✅ - A user-space solution is launched with the same SAFEPATH handling
  • setting up runfiles env vars ✅ - A user-space solution sets up runfiles

Im worried about adding more complexity into the toolchains and bootstrapping areas at the moment.

If your main motivation is convenience for users, it's unclear why you couldn't create some user facing macros in your repository (automated or not via gazelle) that wrap the native py_library and py_binary targets with an ipython repl and targetnames of your choosing?

@aignas
Copy link
Collaborator

aignas commented Apr 9, 2025

I am curious if having the main_module with a select statement something like:

py_binary(
    name = "my_py_binary",
    srcs = ...,
    deps = ...,
    main_module = select({
        "@rules_python//python/config_settings:launch_repl": "$(PYTHON3)",
        "//conditions:default": "",
    }),
)

I agree with @groodt here that running IPython should be solved in the user space, but launching vanilla python should be a part of the core rules IMHO, but I am not sure exactly how.

I personally liked the already existing target @rules_python//python/bin:python but currently it is not setting up the PYTHONPATH yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants