Description
Sometimes you want to use a custom wheel that can't be represented by Python packaging constraints, or you simply want to use a particular whl at a particular location for your own testing, development, or other reasons.
The particular case I have in mind is PyTorch and building PyTorch from source. Doing this presents some problems:
- There are (at least) 5 different distributions of PyTorch for different accelerators (CUDA 11.8, CUDA 12.6, CUDA 12.8, ROCm 6.3, and CPU). Unfortunately, environment markers can't represent these conditions, so it's not possible to express which of these `torch` should resolve to in a requirements or pylock file.
- The above have public URLs, but using a local file is also desirable in some cases:
  - In JAX: some tests build a wheel (using Bazel) and then use that wheel in other tests (also run by Bazel).
  - In PyTorch/XLA: they want to build PyTorch manually, then use that build for the `torch` dependency.
  - I've seen various Slack posts from people building torch (or other C++-heavy ML projects) manually.
Local files would also be helpful for our own testing -- we could generate exactly what we needed without incurring the overhead of remote fetching.
To make this work, we basically need to mix additional settings into the hub's routing aliases. Ultimately, we want to generate something like this in the hub:
```starlark
# File: @pypi//torch:BUILD.bazel
alias(
    name = "torch",
    actual = select({
        "@user//:is_torch_1": "@pypi_torch_cuda_11.8//:pkg",
        "@user//:is_torch_2": "@pypi_torch_cpu//:pkg",
        "//conditions:default": ":_default",
    }),
)

alias(
    name = "_default",
    actual = <select that is generated today>,
)
```
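For illustration (the `train` target below is made up), consumers would keep depending on the hub label exactly as they do today; the select() above resolves to the concrete wheel repo at analysis time:

```starlark
load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "train",
    srcs = ["train.py"],
    # Still the plain hub label; the override routing is invisible here.
    deps = ["@pypi//torch"],
)
```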
I'm not entirely sure how to end up there, though. I'm thinking of a `pip.override()` API that takes the conditions and their destinations:
```starlark
pip.parse(
    hub_name = "my_pypi",
    requirements = "requirements.txt",
)

pip.override(
    hub_name = "my_pypi",
    package = "torch",
    config_setting = ["@user//:is_torch_cuda_11.8"],
    urls = ["https://torch.com/torch-cuda-11.8.whl"],
)

pip.override(
    hub_name = "my_pypi",
    package = "torch",
    config_setting = ["@user//:is_torch_cpu"],
    wheel = "@user//:torch-cpu.whl",
)
```
Under the hood, each `wheel`/`urls` value turns into a whl_library-compatible repo (i.e., one that downloads and extracts the wheel). The config_settings are fed into whatever generates the hub's select() routing.
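Roughly, each such repo only needs to fetch the wheel and lay it out; a very rough sketch of that shape (`whl_archive` and its attributes are made up here, and the real whl_library does considerably more: metadata, entry points, dependency wiring):

```starlark
# Rough sketch only -- not an existing rule.
def _whl_archive_impl(rctx):
    # Wheels are zip archives, so an explicit type is needed for .whl URLs.
    rctx.download_and_extract(url = rctx.attr.urls, type = "zip")
    rctx.file("BUILD.bazel", """\
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "pkg",
    srcs = glob(["**/*.py"]),
    data = glob(["**/*"], exclude = ["**/*.py"]),
    imports = ["."],
    visibility = ["//visibility:public"],
)
""")

whl_archive = repository_rule(
    implementation = _whl_archive_impl,
    attrs = {"urls": attr.string_list()},
)
```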
Alternative: don't do the repo creation part. Just plumb through the config condition and the repo name. Forcing users to create the repo doesn't feel ideal. We'd probably want to provide some sort of helper for that (but not whl_library directly -- its API is full of internal details).
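For comparison, a hedged sketch of what that alternative surface might look like, where the user points the override at a target in a repo they manage themselves (`target` is an invented attribute name):

```starlark
# Hypothetical alternative: pip.override does no repo creation itself.
pip.override(
    hub_name = "my_pypi",
    package = "torch",
    config_setting = ["@user//:is_torch_cpu"],
    target = "@my_prebuilt_torch//:pkg",  # user-managed, whl_library-compatible repo
)
```

The helper hinted at above would then be whatever produces `@my_prebuilt_torch` in a whl_library-compatible layout without exposing whl_library's internal attributes.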
There are two other pieces of the system where I'm not sure how the interaction will work:
(1) `experimental_index_url`. IIUC, this works by traversing the simpleapi metadata to find a whl that satisfies the requirement. If we are providing our own wheels separately, how do those fit into that process?
For example, maybe the simpleapi doesn't find a compatible wheel, but that's expected because we're providing our own wheel?
Or: if we know we're going to use our own wheel, then traversing through the index (for that package) is wasted effort.
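To make the question concrete, one purely hypothetical shape this could take is an attribute on the override that tells the hub not to traverse the index for that package (`skip_index` is not an existing attribute; it only illustrates the idea):

```starlark
pip.parse(
    hub_name = "my_pypi",
    requirements = "requirements.txt",
    experimental_index_url = "https://pypi.org/simple",
)

pip.override(
    hub_name = "my_pypi",
    package = "torch",
    config_setting = ["@user//:is_torch_cpu"],
    wheel = "@user//:torch-cpu.whl",
    skip_index = True,  # hypothetical: never resolve torch via the index
)
```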
(2) To support conditions that Python packaging can't express, an idea we had was having multiple requirements.txt files with a select() layer that chooses between them, e.g.:
```starlark
pip.parse(
    hub_name = "bla",
    requirements = "requirements-cpu.txt",
    condition = "@//:is_accelerator_cpu",
)

pip.parse(
    hub_name = "bla",
    requirements = "requirements-cuda.txt",
    condition = "@//:is_accelerator_cuda",
)
```
and `@bla//somepkg` routes to `@bla_X_somepkg` or `@bla_Y_somepkg`.
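In other words, the hub would generate something like this (the per-sub-hub repo names here are assumptions):

```starlark
# Sketch of the hub-level routing for approach (2).
alias(
    name = "somepkg",
    actual = select({
        "@//:is_accelerator_cpu": "@bla_cpu_somepkg//:pkg",
        "@//:is_accelerator_cuda": "@bla_cuda_somepkg//:pkg",
    }),
)
```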
Which looks pretty similar to my proposal above, just at a different level of granularity.
cc @aignas