Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Forward" cross-crate PDG construction #153

Closed
wants to merge 98 commits into from
Closed

"Forward" cross-crate PDG construction #153

wants to merge 98 commits into from

Conversation

JustusAdam
Copy link
Collaborator

@JustusAdam JustusAdam commented May 28, 2024

What Changed?

Implements a composable per-crate analysis which generates PDGs that span across crates in the "forward" direction, meaning that the partial PDG from a function defined by a dependency composes with its callers.

This further changes internal handling of functions to remove FnResolution. Now we always assume that all generics can be substituted, an invariant we guarantee by instantiating unknown generic arguments always as dyn <constraints>.

Why Does It Need To?

Non-trivial Rust projects often distribute the business logic across multiple crates, for instance a models, api and utils crate (Lemmy) or a library and separate applications for web and desktop (Atomic Data).

To ensure Paralegal can see all relevant data interactions it's analysis must cover all relevant crates.

Caveats

There is a "backwards" direction where a trait method implemented in the caller crate composes with functions in the dependency parameterized over the trait. This is future work.

The tracking of generic arguments during construction (via the monos map) is wasteful and could likely be optimized and calculated on the fly.

Checklist

  • Above description has been filled out so that upon quash merge we have a
    good record of what changed.
  • New functions, methods, types are documented. Old documentation is updated
    if necessary
  • Documentation in Notion has been updated
  • Tests for new behaviors are provided
    • New test suites (if any) ave been added to the CI tests (in
      .github/workflows/rust.yml) either as compiler test or integration test.
      Or justification for their omission from CI has been provided in this PR
      description.

@JustusAdam JustusAdam marked this pull request as draft July 23, 2024 18:03
JustusAdam added a commit that referenced this pull request Jul 29, 2024
## What Changed?

Adds the capability to load bodies and borrowcheck facts from non-local
crates to facilitate analysis across the boundaries of the local crate.
The design of this implementation is based on a similar mechanism in a
purity analyzer by @artemagvanian.

Adds the `--include` command line argument to specify names of crates
that should be loadable in addition to the analysis target.

The basic mechanism is that, in every selected crate (either main target
or `--include`) we run a visitor after expansion that uses
`get_body_with_borrowck_facts`, sanitizes that body by removing
crate-local data and writes it plus relevant lifetime information to
disk using `Encodable`. Downstream crates can locate the written data
and load it back in using `Decodable`. A nicety of this approach is that
we always do this, even or the main target crate, which means there is
only one code path instead of a separate local and remote one.

I the process of implementing this I realize that our points-to analysis
actually doesn't need the whole `BodyWithBorrockFacts` but only
`input_facts.subset_base` (and the body itself). I parameterized the
points-to analysis, such that it can accept a reduced input, and we only
store the relevant data.

Also fully supports marker annotations in foreign crates.

The implicit body cache `MIR_BODIES` we used before is now replaced by
an explicit `BodyCache`.

A large part of the changes is just changing `LocalDefId` to `DefId` in
call strings and the like as well as ensuring that all places that used
to call `rustc_utils`' `get_body_with_borrowck_facts` is replaced with
loading from the `BodyCache`.

### Caveats

The statistics about how many functions were seen etc are disabled for
now (always 0).

## Why Does It Need To?

Analyzing only the local crate is a severe limitation of the tool which
would be lifted by this PR.

There is a concurrent attempt (#153) to address the same issue. This is
a simpler, but potentially less scalable approach. In addition this
approach has full support for "both directions" of cross crate. The
"forward direction" extends the PDG from the local crate (main analysis
target) such that it also models functions in dependencies. The
"backwards" direction ensures that function in the dependency, which are
parameterized by traits where the impl is in the local trait, that impl
composes with the function in the dependency.

## Checklist

- [x] Above description has been filled out so that upon quash merge we
have a
  good record of what changed.
- [x] New functions, methods, types are documented. Old documentation is
updated
  if necessary
- [ ] Documentation in Notion has been updated
- [x] Tests for new behaviors are provided
  - [x] New test suites (if any) ave been added to the CI tests (in
`.github/workflows/rust.yml`) either as compiler test or integration
test.
*Or* justification for their omission from CI has been provided in this
PR
    description.
@JustusAdam JustusAdam closed this Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants