Skip to content

Pointer static analysis #536

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 38 commits into from
Jul 29, 2022
Merged

Pointer static analysis #536

merged 38 commits into from
Jul 29, 2022

Conversation

spernsteiner
Copy link
Collaborator

This is an initial prototype of a static analysis for turning raw pointers into safe references. It computes a new (hopefully safer) type for each pointer-typed local variable in each function, and determines where to insert casts or other conversions to make the resulting program typecheck. It can produce & and &mut references, slices, and has partial support for using &Cell<T> for shared mutable references.

The analysis consists of basic dataflow analysis to find pointers that are later written or passed to offset (pointer arithmetic), and a separate pass that invokes the Polonius borrow checker to find places where it's legal to use &mut references. It computes flags for each pointer, indicating whether the pointer may be read, written, or offset, whether it's unique (&mut-compatible), and some other properties. Based on these flags, it chooses new safe reference types and generates casts.

To run it on an example:

cargo run --bin c2rust-analyze -- tests/filecheck/insertion_sort.rs \
    -L "$(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib" \
    --crate-type rlib

This will print a bunch of debug output, including a table of type and expression rewrites the analysis inferred for each function.

To run the test suite, use cargo test. The tests rely on LLVM's FileCheck tool - if FileCheck is installed at a non-default path (such as /usr/bin/FileCheck-11 instead of /usr/bin/FileCheck), then set FILECHECK=path/to/FileCheck before running the tests.

Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to merge this as is, as it's quite a lot of code, but fully separate at the moment, and then work on integrating it with the rest of the codebase and reviewing it more.

For integration into the codebase, here are some things we need to do:

  • Add c2rust-analyze to the workspace.
  • Gate c2rust-analyze behind a feature like c2rust-dynamic-instrumentation currently is, as I think it's important to keep cargo +stable build working. Note that we should do this for c2rust-pdg, too.
    • This isn't immediately necessary. Only when we add pdg and analyze as subcommands to c2rust.
  • Use rustc_private_link for the logic in build.rs as it's mostly the same. This will also allow you to avoid having to pass the -L .../rustlib/.../lib at runtime.
    • The only difference is the use of $HOST instead of $TARGET. Why is $HOST used?
  • Get cargo test to run fine in CI, so make sure FileCheck is installed.
  • Get cargo test to work out-of-the-box as much as possible. FileCheck-14 on $PATH should just work, just like llvm-config-14 on $PATH just works. llvm-config --bindir may be able to work for this.
  • Fix all the compile warnings, which is mostly a lot of unused stuff. A large portion of this can be fixed with cargo fix.
  • Fix all the clippy warnings.
  • cargo fmt.

There are a couple of other little issues I commented on as well, some of which I repeated above, too.

Also, one other thing: what are facts?

@@ -0,0 +1,11 @@
```sh
rustc --crate-type rlib instrument_support.rs
cargo run -- fib.rs -L ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See rustc_private_link::SysRoot::link_rustc_private for how we automated this -L ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/ part in build.rss. It might be able to avoid having to specify this here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the readme (which was horribly out of date) but left the big -L path in there. I'm not sure what's the right solution for setting the default library paths (e.g. do we care about cross-compiling?), but dynamic_instrumentation has the same issue, and I figure whenever it becomes a real problem, we can come up with a general solution and apply it to both at once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we're just adding ~/.rustup/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib to rpath, and that seems to work (and get rid of needing -L or setting LD_LIBRARY_PATH). Cross-compiling is trickier, though. I think we should ignore cross-compiling for now, and when we get to it, cross-compile only as a cargo subcommand, which should set the right LD_LIBRARY_PATH for us.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you're describing is a different situation. There are three different times when we care about the sysroot / lib / rustlib dirs:

  1. When building c2rust-analyze / c2rust-instrument, the linker needs the rustlib path in order to link against the various rustc_* libraries
  2. When running c2rust-analyze / c2rust-instrument, the dynamic loader needs the lib path in order to find librustc_driver.so
  3. When compiling another program using c2rust-analyze / c2rust-instrument, the tool needs the rustlib path in order to find libcore/libstd, which are dependencies of the program being compiled.

build.rs calling rustc_private_link solves (1); setting the rpath solves (2); this flag in the readme covers (3).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's what it's being used for. Doesn't cargo find libstd by itself normally?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustc can find libstd on its own if everything is installed normally. I haven't checked, but I would guess that cargo (outside -Z build-std mode) just lets rustc find libstd as it usually does. In this case, we don't have working cargo integration for c2rust-analyze at the moment, and our rustc wrapper (the c2rust-analyze binary) is not installed in the usual way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For working cargo integration, which I think we definitely want, see #554 for a better way to do it. We may be able to share some of that code.

let lib_dir = env::var("C2RUST_TARGET_LIB_DIR").unwrap();
let lib_dir = &lib_dir;

let filecheck_bin = env::var("FILECHECK").unwrap_or_else(|_| "FileCheck".into());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to add FileCheck to CI and the Docker containers/set FILECHECK to the installed one. We should get this working out-of-the-box if possible, so that cargo test just works.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the docker setup defined? At least on Debian, FileCheck is in a separate package llvm-tools, which will need to be added to the dependencies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In scripts/provision_{arch,deb,dnf,mac,yum}.sh.

target/ is already covered by a rule from the parent .gitignore.
@kkysen
Copy link
Contributor

kkysen commented Jul 28, 2022

@spernsteiner, can we get this merged? It seems just the FileCheck CI integration is left, but we can do that separately, as it'll need a Docker container push anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants