
How to do metadata-only typecheck build of a rust_library? #428


Open
dtolnay opened this issue Oct 12, 2020 · 16 comments

@dtolnay
Contributor

dtolnay commented Oct 12, 2020

It looks like rules_rust unconditionally passes --emit=dep-info,link. This means Bazel builds cannot benefit from the biggest rustc speedup since 1.0.0 (https://blog.rust-lang.org/2017/03/16/Rust-1.16.html#whats-in-1160-stable). 😿

args.add("--emit=dep-info,link")

Is there any idiomatic way to perform non-code-generating builds which type-check only, equivalent to Cargo's cargo check or cargo rustc --profile=check, rustc's rustc --emit=metadata, or Buck's #check flavored builds (buck build :lib#check; https://buck.build/concept/flavors.html#analysis_http)?
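
For reference, here is a rough Starlark sketch of the distinction being asked about, in the style of the args.add call quoted above; the _emit_args helper and its check_only parameter are illustrative, not part of rules_rust:

    # Illustrative only: how the --emit flags would differ between a full
    # build and a check-only build. rules_rust currently hard-codes the
    # --emit=dep-info,link form for every compile action.
    def _emit_args(args, check_only):
        if check_only:
            # Type-check only: produces an .rmeta and skips LLVM codegen
            # and linking (the mode cargo check / --emit=metadata uses).
            args.add("--emit=dep-info,metadata")
        else:
            # Full build: produces the .rlib/.so and runs the linker.
            args.add("--emit=dep-info,link")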

@mfarrugi
Collaborator

mfarrugi commented Oct 12, 2020 via email

@dtolnay
Contributor Author

dtolnay commented Oct 12, 2020

or turn rust_library into a macro that creates a 'check' target in addition to the build target.

This sounds bad if it means ... builds will build both of those. Would there be a way to not do that? I'd expect a way to run a whole ... build in the "check" compilation mode or in the non-check mode.

@mfarrugi
Collaborator

mfarrugi commented Oct 12, 2020 via email

@mfarrugi
Collaborator

mfarrugi commented Oct 12, 2020 via email

@dtolnay
Contributor Author

dtolnay commented Oct 12, 2020

For comparison, the behavior of Buck is that buck build ... will do codegen of any rust_library that is pulled in transitively by any binary included in the build, and do a check-only build of any rust_library that is included in the ... but not pulled in by any binary. That seems like the best behavior. How sure are we that there is no way to reproduce that in Bazel, or that we wouldn't want to?

@mfarrugi
Collaborator

mfarrugi commented Oct 13, 2020 via email

@dtolnay
Contributor Author

dtolnay commented Oct 13, 2020

What is the use-case for having libraries with no dependents?

Well, if I have a library in lib/ which is used by a binary in bin/, and I am iterating on the library today, then bazel build lib/... shouldn't need to do LLVM codegen and linking.

This is why cargo check has been so important. It's not that people are making libraries that are never to be used by anything. It's that sometimes people work on the library itself.

@mfarrugi
Collaborator

mfarrugi commented Oct 13, 2020 via email

@mfarrugi
Collaborator

mfarrugi commented Oct 13, 2020 via email

@dtolnay
Contributor Author

dtolnay commented Oct 13, 2020

Oh sure, but I meant for the ... case. In that scenario, it seems like it would work to have a separate check target.

For the ... case, I'll share one use case from https://github.com/dtolnay/cxx -- in that repo we have two ways to plug cxx's C++ code generator into a build depending on your build system: one based on a CLI command which would be called e.g. from genrule/run_binary in Buck/Bazel, and one based on a library which would be called from a Cargo build script. I develop locally mainly with Bazel, and when doing bazel build ... locally or in CI I want the whole repo typechecked, including my library (for which I maintain a Bazel target but which is only called from a Cargo-only build.rs).

dae added a commit to ankitects/rules_rust that referenced this issue Oct 20, 2020
I have a crate that uses pyo3 to create a Python extension module.
On macOS, special link flags are required:

    rustc_flags = selects.with_or({
        (
            "@io_bazel_rules_rust//rust/platform:x86_64-apple-darwin",
        ): [
            "-Clink-arg=-undefined",
            "-Clink-arg=dynamic_lookup",
        ],
        "//conditions:default": [],
    }),

Without them, the linker on macOS fails, as the Python API is not
available until runtime.

rules_rust's clippy implementation was passing --emit=dep-info,link
to the clippy invocation, causing a clippy run to fail with linking
errors. This patch changes the invocation to use
--emit=dep-info,metadata instead, which is what cargo uses. It also
shaves a bit of time off the check, as linking no longer needs to happen.

Tangentially related to bazelbuild#428 and bazelbuild#421 - currently the clippy aspect
seems to be falling back on a non-worker compile, so it's still
noticeably slower than running cargo clippy directly when minor
changes have been made.
@hlopko
Member

hlopko commented Feb 9, 2021

Hi all, so I believe what David suggests is doable - we'll have to register 2 actions: one doing the check and producing a throw-away metadata file (this file will be put into the DefaultInfo provider so it's built when the target is requested on the bazel command line), and the other doing the link (its output file will be put into the CrateInfo provider so it can be used by other Rust rules).
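
A minimal Starlark sketch of that shape, with hypothetical _rustc, crate_root, and srcs attributes and with no dependency handling (not the actual rules_rust implementation):

    # Hypothetical sketch of the two-action idea: a cheap metadata-only
    # action whose output goes into DefaultInfo (so building the target on
    # the command line only type-checks it), plus a full codegen action
    # whose output would be routed to dependents via CrateInfo.
    def _rust_library_impl(ctx):
        rmeta = ctx.actions.declare_file(ctx.label.name + ".rmeta")
        rlib = ctx.actions.declare_file("lib" + ctx.label.name + ".rlib")
        common = ["--crate-type=lib", ctx.file.crate_root.path]

        # Check-only action producing the throw-away metadata file.
        ctx.actions.run(
            executable = ctx.executable._rustc,
            arguments = common + ["--emit=metadata", "-o", rmeta.path],
            inputs = ctx.files.srcs,
            outputs = [rmeta],
            mnemonic = "RustcMetadata",
        )

        # Full compile producing the rlib that dependent crates consume.
        ctx.actions.run(
            executable = ctx.executable._rustc,
            arguments = common + ["--emit=link", "-o", rlib.path],
            inputs = ctx.files.srcs,
            outputs = [rlib],
            mnemonic = "Rustc",
        )

        # The rlib would additionally be wrapped in CrateInfo for dependents.
        return [DefaultInfo(files = depset([rmeta]))]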

Related, I'll try to run some experiments with pipelining. Metadata produced by --emit=metadata cannot be used for downstream compilations; the only supported configuration AFAIK is --emit=metadata,link. So for pipelining we will have to have 3 actions:

  1. --emit=metadata for check-only
  2. --emit=metadata,link but kill the process the moment metadata are produced :)
  3. --emit=link

David, since we're touching this, do you know if 2) is what Buck does (assuming Buck has pipelining enabled)?

@dtolnay
Contributor Author

dtolnay commented Feb 9, 2021

I believe it does not pipeline the Rust builds. We've talked about it, and 2 is what we would do, but it is not implemented yet.

@mfarrugi
Collaborator

mfarrugi commented Feb 9, 2021 via email

@hlopko
Member

hlopko commented Feb 9, 2021

Dependent rust_library should only need the metadata to build and/or typecheck as well.

Are you sure? I tried it in a tiny example where the dependent target was actually a rust_binary, so maybe that's the reason, but I couldn't successfully compile using only metadata from --emit=metadata. Using metadata from --emit=metadata,mir was enough to fix the build. I have not tried using --emit=metadata output from dependencies to generate the dependent target's own --emit=metadata output.

Let's come up with simpler terminology :)

  • thin metadata = produced by --emit=metadata, only useful for type checking
  • fat metadata = produced by --emit=metadata,mir or link, useful for full compilations

Doing what Cargo does in Bazel is borderline infeasible. I'd like to see the impact on build speed of pipelining when using remote execution, and of approach 2), which duplicates work.

@ddeville
Contributor

ddeville commented Dec 27, 2021

I'm a heavy user of cargo check, but the biggest pain point of using bazel vs cargo for us today is actually around clippy (where cargo clippy mostly uses --emit=metadata, "mostly" as I'll explain below).

We have a CI job on diffs that runs cargo clippy and that we expect to complete fairly quickly to get feedback, which means it has to be much faster than the full build. When using bazel, the clippy aspect does use --emit=metadata for the top-level target, but it doesn't alter the dependencies: all the dependencies are still being built with --emit=link. For a project containing hundreds of crates, linting the top-level target is basically game over; it's pretty much a full build at this point and is at least 3x slower than cargo clippy (for our use case).
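
To illustrate why that happens, here is a bare-bones aspect skeleton (names are placeholders, not the actual rules_rust clippy aspect): an aspect only adds new actions alongside each target it visits, while each dependency's regular --emit=link compile action is left in place and still runs whenever its output is needed as an input.

    # Illustrative aspect skeleton: each visited target gets an extra
    # "check" output, but its normal compile action is untouched.
    def _check_aspect_impl(target, ctx):
        marker = ctx.actions.declare_file(target.label.name + ".check_ok")
        ctx.actions.run_shell(
            outputs = [marker],
            # Placeholder command; a real aspect would invoke clippy-driver
            # or rustc --emit=metadata here.
            command = "touch {}".format(marker.path),
        )
        return [OutputGroupInfo(check = depset([marker]))]

    check_aspect = aspect(
        implementation = _check_aspect_impl,
        attr_aspects = ["deps"],
    )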

I've been thinking a bit about how this could be addressed (and potentially address the thin metadata typecheck use case too), but nothing seems great so far. Here's what I've considered:

  • update the clippy aspect to change dependencies to also use --emit=metadata, but that's not helpful since the parallel dependency graph is still using --emit=link, so it ends up building that anyway. I'm sure there might be a way for aspects to somehow not depend on the original dependency, but I'm far from familiar enough with them to even make a guess :)
  • add a check_only config that the rules could check in order to decide whether to emit an rlib, an rmeta, or nothing for things that don't get linked (a rough sketch of such a flag follows this list). This way one could run clippy with that flag and all targets in the dependency graph would be built as metadata only. I've actually gotten somewhat far (see ddeville@5f4b0f2 for a POC) and got it to work nicely for simplish projects, but got stuck when trying to deal with build-script binaries. Build script targets (and proc macros fwiw) have to be built fully since the library depends on their output, even when the library is built with metadata only. That by itself is not too tricky to deal with, but the fact that build-script binaries themselves can depend on libraries in the tree means that we cannot just blindly use --emit=metadata for everything. Knowing which libraries will end up being used to run a build script, and thus need to generate an .rlib, is where I eventually got stuck.
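
As mentioned in the second bullet, here is a rough sketch of what such a check_only flag could look like as a Starlark build setting; the flag label, helper name, and attribute wiring are hypothetical and not taken from the linked POC:

    # Hypothetical .bzl sketch: a boolean build setting that a rust rule
    # could consult to choose between emitting an .rlib or only an .rmeta.
    load("@bazel_skylib//rules:common_settings.bzl", "BuildSettingInfo")

    def _emit_flag(ctx):
        # ctx.attr._check_only would be a hidden label attribute pointing
        # at a bool_flag target, e.g. //rust/settings:check_only.
        if ctx.attr._check_only[BuildSettingInfo].value:
            return "--emit=dep-info,metadata"  # type-check only
        return "--emit=dep-info,link"  # normal full build

A check-only build of the whole tree would then be something like bazel build --//rust/settings:check_only //..., so every target in the dependency graph gets the same configuration (modulo the build-script problem described above).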

Registering 2 actions sounds interesting, although I don't think it would help with the clippy use case (or for those libraries that happen to depend on a bunch of others in a world where their build artifacts haven't been cached yet).

Has there been any progress / new thinking around this? Would love to help.

@googleson78
Contributor

This is now somewhat possible, see #1649
