Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paper Submission (OSDI 25) Sprint Improvements #170

Merged
merged 47 commits into from
Dec 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
675eca6
Readded line counting
JustusAdam Nov 28, 2024
9f40e6d
Debug weird spans
JustusAdam Nov 28, 2024
1b9e867
Warn that shortest path is buggy
JustusAdam Nov 28, 2024
8793b1c
make span encoding more efficient
JustusAdam Nov 29, 2024
02b999c
Workaround for check-cfg error
JustusAdam Nov 29, 2024
f0c9180
Use single artifact and gzip it
JustusAdam Nov 30, 2024
259604f
Deleted the wrong file
JustusAdam Nov 30, 2024
7f8a383
Need to overwrite
JustusAdam Nov 30, 2024
8a30c9e
Workaround actually unnecessary
JustusAdam Nov 30, 2024
711e479
Measure total time spent in just flow analyzer
JustusAdam Dec 1, 2024
dc23529
Improve span emission
JustusAdam Dec 2, 2024
67e89fc
make sure targets and their async closures are in analzed spans
JustusAdam Dec 2, 2024
ffd90be
Move analysis earlier and load local bodies lazily
JustusAdam Dec 4, 2024
7949b7c
Add mir emission stats
JustusAdam Dec 4, 2024
c18c092
Optional compression and more efficient body fetching
JustusAdam Dec 4, 2024
2f1c24c
Merge branch 'counting-returns' of github.com:brownsys/paralegal into…
JustusAdam Dec 4, 2024
7eef598
Fix duplicate async fiunction emission
JustusAdam Dec 4, 2024
fa394ec
Fix policy for reachable markers
JustusAdam Dec 4, 2024
ecd4b8c
Fix async trait detection predicate
JustusAdam Dec 4, 2024
bcab329
Testing if marked functions are detected when stubbed
JustusAdam Dec 5, 2024
e6d95b8
Nailed down the blocking marker error
JustusAdam Dec 5, 2024
cac3e5b
Implement handling for FnPtrShims
JustusAdam Dec 5, 2024
26f53ad
Extended syntax for external markers
JustusAdam Dec 5, 2024
b5949de
Better error mesage for refinement on types
JustusAdam Dec 5, 2024
783319b
Rename context
JustusAdam Dec 6, 2024
1f73b2e
Allow flows to return the reached sink
JustusAdam Dec 6, 2024
6a41cc9
Test cases for async_trait and tracing::instrument
JustusAdam Dec 6, 2024
4454b48
Cross-procedure control flow and control flow predicate fix
JustusAdam Dec 7, 2024
07cc49b
Fix influencers and influencees, run and debug test suite
JustusAdam Dec 7, 2024
93c6139
Handle chims in marker reachabiliy
JustusAdam Dec 7, 2024
f491a09
Collect timings for dumped crates
JustusAdam Dec 7, 2024
b60ac57
Measure serialization
JustusAdam Dec 8, 2024
834f331
Optional cache disabling
JustusAdam Dec 8, 2024
967e0f2
Measure tycheck and dep time
JustusAdam Dec 8, 2024
081d98f
make determine_call_handling a bit lighter
JustusAdam Dec 9, 2024
8c5e1e1
Use buffered readers and writers
JustusAdam Dec 9, 2024
aeaeab2
Cleanup and remove unused features
JustusAdam Dec 18, 2024
cf61181
Merge branch 'main' into counting-returns
JustusAdam Dec 18, 2024
f832de5
Fix test utils
JustusAdam Dec 18, 2024
15888c1
Formatting and clippy
JustusAdam Dec 18, 2024
4a91946
Output stats also when analyzing
JustusAdam Dec 18, 2024
ca18d53
Count the time correctly
JustusAdam Dec 18, 2024
e6914fc
Fix documentation
JustusAdam Dec 18, 2024
f7706ff
Allow unused field
JustusAdam Dec 18, 2024
6db5b45
Apparently the github action now requires force here?
JustusAdam Dec 19, 2024
e284abc
Fix some typos
JustusAdam Dec 19, 2024
5f425c7
Argument order
JustusAdam Dec 19, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ target/

/rust-dfpp-patch
flow-graph.json
flow-graph.stat.json

# Local cargo configuration
.cargo/config.toml
Expand Down
4 changes: 3 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ rustc_plugin = "=0.7.4-nightly-2023-08-25"
[workspace.dependencies.flowistry]
# path = "../flowistry/crates/flowistry"
git = "https://github.com/brownsys/flowistry"
rev = "08c4ad9587b3251a8f7c64aa60be31404e6e04c0"
rev = "57627ed24802cb76c745118bbc46f83f402a1e88"
default-features = false


Expand Down
6 changes: 3 additions & 3 deletions Makefile.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
# - `fix-all` tries to fix all the formatting and code style issues. This is
# most likely the command you want to run.
# - `check-all` runs all the formatting and code-style checks we also run in CI
# This performs all teh same operations as `fix-ll` but complains instead of
# This performs all the same operations as `fix-ll` but complains instead of
# solving them.
# - `pdg-tests` which runs our current complete test suite
#
Expand All @@ -20,7 +20,7 @@ default_to_workspace = false

[tasks.install]
command = "cargo"
args = ["install", "--locked", "--path", "crates/paralegal-flow"]
args = ["install", "--locked", "-f", "--path", "crates/paralegal-flow"]

[tasks.ci-tests]
description = "The suite of tests we run in CI"
Expand All @@ -35,7 +35,7 @@ description = """\
Attempt to perform any formatting and code structure related fixes.

Note: This uses the clippy-fix command under the hood. This can introduce
breaking changes and so this tscript only passes `--allow-staged` to the
breaking changes and so this script only passes `--allow-staged` to the
fix command. What this means for you is that you need to stage (or commit)
your changes in git for the fix command to work. This is a precaution that
allows you to inspect, test and potentially reject clippy's changes before
Expand Down
3 changes: 0 additions & 3 deletions crates/flowistry_pdg_construction/src/async_support.rs
Original file line number Diff line number Diff line change
Expand Up @@ -138,9 +138,6 @@ fn match_pin_box_dyn_ty(lang_items: &rustc_hir::LanguageItems, t: ty::Ty) -> boo
let ty::TyKind::Dynamic(pred, _, ty::DynKind::Dyn) = t_a.boxed_ty().kind() else {
return false;
};
if pred.len() != 2 {
return false;
}
pred.iter().any(|p| {
let ty::ExistentialPredicate::Trait(t) = p.skip_binder() else {
return false;
Expand Down
119 changes: 79 additions & 40 deletions crates/flowistry_pdg_construction/src/body_cache.rs
Original file line number Diff line number Diff line change
@@ -1,11 +1,16 @@
use std::path::PathBuf;
use std::{
cell::RefCell,
path::PathBuf,
time::{Duration, Instant},
};

use flowistry::mir::FlowistryInput;

use polonius_engine::FactTypes;
use rustc_borrowck::consumers::{ConsumerOptions, RustcFacts};
use rustc_hash::FxHashMap;
use rustc_hir::{
def_id::{CrateNum, DefId, LocalDefId, LOCAL_CRATE},
def_id::{CrateNum, DefId, DefIndex, LocalDefId, LOCAL_CRATE},
intravisit::{self},
};
use rustc_macros::{Decodable, Encodable, TyDecodable, TyEncodable};
Expand Down Expand Up @@ -36,33 +41,40 @@ impl<'tcx> CachedBody<'tcx> {
let mut body_with_facts = rustc_borrowck::consumers::get_body_with_borrowck_facts(
tcx,
local_def_id,
ConsumerOptions::PoloniusInputFacts,
ConsumerOptions::RegionInferenceContext,
);

clean_undecodable_data_from_body(&mut body_with_facts.body);

Self {
body: body_with_facts.body,
input_facts: FlowistryFacts {
subset_base: body_with_facts.input_facts.unwrap().subset_base,
subset_base: body_with_facts
.region_inference_context
.outlives_constraints()
.map(|constraint| (constraint.sup, constraint.sub))
.collect(),
},
}
}
}

impl<'tcx> FlowistryInput<'tcx> for &'tcx CachedBody<'tcx> {
impl<'tcx> FlowistryInput<'tcx, 'tcx> for &'tcx CachedBody<'tcx> {
fn body(self) -> &'tcx Body<'tcx> {
&self.body
}

fn input_facts_subset_base(
self,
) -> &'tcx [(
<RustcFacts as FactTypes>::Origin,
<RustcFacts as FactTypes>::Origin,
<RustcFacts as FactTypes>::Point,
)] {
&self.input_facts.subset_base
) -> Box<
dyn Iterator<
Item = (
<RustcFacts as FactTypes>::Origin,
<RustcFacts as FactTypes>::Origin,
),
> + 'tcx,
> {
Box::new(self.input_facts.subset_base.iter().copied())
}
}

Expand All @@ -73,48 +85,70 @@ pub struct FlowistryFacts {
pub subset_base: Vec<(
<RustcFacts as FactTypes>::Origin,
<RustcFacts as FactTypes>::Origin,
<RustcFacts as FactTypes>::Point,
)>,
}

pub type LocationIndex = <RustcFacts as FactTypes>::Point;

type BodyMap<'tcx> = FxHashMap<DefIndex, CachedBody<'tcx>>;

/// Allows loading bodies from previosly written artifacts.
///
/// Ensure this cache outlives any flowistry analysis that is performed on the
/// bodies it returns or risk UB.
pub struct BodyCache<'tcx> {
tcx: TyCtxt<'tcx>,
cache: Cache<DefId, CachedBody<'tcx>>,
cache: Cache<CrateNum, BodyMap<'tcx>>,
local_cache: Cache<DefIndex, CachedBody<'tcx>>,
timer: RefCell<Duration>,
}

impl<'tcx> BodyCache<'tcx> {
pub fn new(tcx: TyCtxt<'tcx>) -> Self {
Self {
tcx,
cache: Default::default(),
local_cache: Default::default(),
timer: RefCell::new(Duration::ZERO),
}
}

pub fn timer(&self) -> Duration {
*self.timer.borrow()
}

/// Serve the body from the cache or read it from the disk.
///
/// Returns `None` if the policy forbids loading from this crate.
pub fn get(&self, key: DefId) -> &'tcx CachedBody<'tcx> {
let cbody = self.cache.get(key, |_| load_body_and_facts(self.tcx, key));
let body = if let Some(local) = key.as_local() {
self.local_cache.get(local.local_def_index, |_| {
let start = Instant::now();
let res = CachedBody::retrieve(self.tcx, local);
*self.timer.borrow_mut() += start.elapsed();
res
})
} else {
self.cache
.get(key.krate, |_| load_body_and_facts(self.tcx, key.krate))
.get(&key.index)
.expect("Invariant broken, body for this is should exist")
};

// SAFETY: Theoretically this struct may not outlive the body, but
// to simplify lifetimes flowistry uses 'tcx anywhere. But if we
// actually try to provide that we're risking race conditions
// (because it needs global variables like MIR_BODIES).
//
// So until we fix flowistry's lifetimes this is good enough.
unsafe { std::mem::transmute(cbody) }
unsafe { std::mem::transmute(body) }
}
}

/// A visitor to collect all bodies in the crate and write them to disk.
struct DumpingVisitor<'tcx> {
tcx: TyCtxt<'tcx>,
target_dir: PathBuf,
targets: Vec<LocalDefId>,
}

/// Some data in a [Body] is not cross-crate compatible. Usually because it
Expand Down Expand Up @@ -150,20 +184,7 @@ impl<'tcx> intravisit::Visitor<'tcx> for DumpingVisitor<'tcx> {
_: rustc_span::Span,
local_def_id: rustc_hir::def_id::LocalDefId,
) {
let to_write = CachedBody::retrieve(self.tcx, local_def_id);

let dir = &self.target_dir;
let path = dir.join(
self.tcx
.def_path(local_def_id.to_def_id())
.to_filename_friendly_no_crate(),
);

if !dir.exists() {
std::fs::create_dir(dir).unwrap();
}

encode_to_file(self.tcx, path, &to_write);
self.targets.push(local_def_id);

intravisit::walk_fn(
self,
Expand All @@ -179,14 +200,30 @@ impl<'tcx> intravisit::Visitor<'tcx> for DumpingVisitor<'tcx> {
/// calculating the necessary borrowcheck facts to store for later points-to
/// analysis.
///
/// Ensure this gets called early in the compiler before the unoptimmized mir
/// Ensure this gets called early in the compiler before the unoptimized mir
/// bodies are stolen.
pub fn dump_mir_and_borrowck_facts(tcx: TyCtxt) {
pub fn dump_mir_and_borrowck_facts<'tcx>(tcx: TyCtxt<'tcx>) -> (Duration, Duration) {
let mut vis = DumpingVisitor {
tcx,
target_dir: intermediate_out_dir(tcx, INTERMEDIATE_ARTIFACT_EXT),
targets: vec![],
};
tcx.hir().visit_all_item_likes_in_crate(&mut vis);

let tc_start = Instant::now();
let bodies: BodyMap<'tcx> = vis
.targets
.iter()
.map(|local_def_id| {
let to_write = CachedBody::retrieve(tcx, *local_def_id);

(local_def_id.local_def_index, to_write)
})
.collect();
let tc_time = tc_start.elapsed();
let dump_time = Instant::now();
let path = intermediate_out_dir(tcx, INTERMEDIATE_ARTIFACT_EXT);
encode_to_file(tcx, path, &bodies);
(tc_time, dump_time.elapsed())
}

const INTERMEDIATE_ARTIFACT_EXT: &str = "bwbf";
Expand All @@ -206,16 +243,18 @@ pub fn local_or_remote_paths(krate: CrateNum, tcx: TyCtxt, ext: &str) -> Vec<Pat
}

/// Try to load a [`CachedBody`] for this id.
fn load_body_and_facts(tcx: TyCtxt<'_>, def_id: DefId) -> CachedBody<'_> {
let paths = local_or_remote_paths(def_id.krate, tcx, INTERMEDIATE_ARTIFACT_EXT);
fn load_body_and_facts(tcx: TyCtxt<'_>, krate: CrateNum) -> BodyMap<'_> {
let paths = local_or_remote_paths(krate, tcx, INTERMEDIATE_ARTIFACT_EXT);
for path in &paths {
let path = path.join(tcx.def_path(def_id).to_filename_friendly_no_crate());
if let Ok(data) = decode_from_file(tcx, path) {
return data;
};
if !path.exists() {
continue;
}

let data = decode_from_file(tcx, path).unwrap();
return data;
}

panic!("No facts for {def_id:?} found at any path tried: {paths:?}");
panic!("No facts for {krate:?} found at any path tried: {paths:?}");
}

/// Create the name of the file in which to store intermediate artifacts.
Expand Down
31 changes: 23 additions & 8 deletions crates/flowistry_pdg_construction/src/calling_convention.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,11 @@ use rustc_middle::{
ty::TyCtxt,
};

use crate::{async_support::AsyncInfo, local_analysis::CallKind, utils};
use crate::{
async_support::AsyncInfo,
local_analysis::CallKind,
utils::{self, ShimType},
};

/// Describes how the formal parameters of a given function call relate to the
/// actual parameters.
Expand All @@ -21,7 +25,7 @@ pub enum CallingConvention<'tcx> {
/// tuple that contains the actual argument to the call of the closure
/// function.
Indirect {
once_shim: bool,
shim: Option<ShimType>,
closure_arg: Operand<'tcx>,
tupled_arguments: Operand<'tcx>,
},
Expand Down Expand Up @@ -72,8 +76,8 @@ impl<'tcx> CallingConvention<'tcx> {
match kind {
CallKind::AsyncPoll(poll) => CallingConvention::Async(poll.generator_data),
CallKind::Direct => CallingConvention::Direct(args.into()),
CallKind::Indirect { once_shim } => CallingConvention::Indirect {
once_shim: *once_shim,
CallKind::Indirect { shim: once_shim } => CallingConvention::Indirect {
shim: *once_shim,
closure_arg: args[0].clone(),
tupled_arguments: args[1].clone(),
},
Expand Down Expand Up @@ -184,13 +188,24 @@ impl<'a, 'tcx> PlaceTranslator<'a, 'tcx> {
// Map closure captures to the first argument.
// Map formal parameters to the second argument.
CallingConvention::Indirect {
once_shim,
shim,
closure_arg,
tupled_arguments,
} => {
if child.local.as_usize() == 1 {
// Accounting fot FnPtrShim
//
// The shim gets an extra first argument (the function pointer)
// but we replace it with the function iself which doesn't have
// that argument, so we need to adjust the indices
let local = if matches!(shim, Some(ShimType::FnPtr)) && child.local != RETURN_PLACE
{
child.local + 1
} else {
child.local
};
if local.as_usize() == 1 {
// Accounting for shims
let next_idx = if *once_shim {
let next_idx = if matches!(shim, Some(ShimType::Once)) {
// If this is a once shim then the signature of the
// function and its call don't match fully. (We are
// calling a closure that takes it's `self` by reference
Expand All @@ -214,7 +229,7 @@ impl<'a, 'tcx> PlaceTranslator<'a, 'tcx> {
} else {
let tuple_arg = tupled_arguments.place()?;
let _projection = child.projection.to_vec();
let field = FieldIdx::from_usize(child.local.as_usize() - 2);
let field = FieldIdx::from_usize(local.as_usize() - 2);
let field_ty = tuple_arg
.ty(self.parent_body, self.tcx)
.field_ty(self.tcx, field);
Expand Down
Loading
Loading