Implement serialization protection via magic (#169)
## What Changed?

This adds a u64 magic number to the beginning of the output artifact
from `paralegal-flow` which is verified upon loading the artifact.

The magic number is a hash of the modification dates of all Rust source
files (`.rs`) under `src/` in the `paralegal-spdg` crate.
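
As a rough illustration of the idea, the following sketch folds a list of modification times into a single `u64` with `DefaultHasher`. The helper name `hash_mtimes` and the timestamp values are made up for this example; the real build script walks the directory tree and reads the times from file metadata.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: folds modification times (seconds since the
/// Unix epoch) into one u64, in the order they are given.
fn hash_mtimes(mtimes_secs: &[u64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for secs in mtimes_secs {
        secs.hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    // Illustrative timestamps only; touching one file changes one entry.
    let before = hash_mtimes(&[1_734_500_000, 1_734_500_100]);
    let after = hash_mtimes(&[1_734_500_000, 1_734_500_101]);
    println!("before = {before:x}, after = {after:x}");
}
```

Because the inputs are timestamps rather than file contents, touching a file without editing it is enough to produce a different value.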

## Why Does It Need To?

At present nothing ensures that the format expected by a policy
application matches the format written by the flow analyzer.
Specifically the analyzer may be compiled against an older or newer
version of `paralegal-spdg`. In such cases the derived `serde`
implementation for the output artifact may differ. `bincode`, the
serialization library we use, does not offer protection against this
case, leading to out-of-memory errors and memory leaks in
deserialization.
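
To see how a layout mismatch can turn into an out-of-memory error, consider this minimal sketch. It assumes an encoding in which a sequence is preceded by its element count as a little-endian `u64`, which is believed to match the default behavior of `bincode` 1.x but is stated here as an assumption. `read_len_prefix` is a hypothetical helper and not part of `bincode` or `paralegal-spdg`.

```rust
/// Hypothetical helper: interprets the first 8 bytes as a little-endian
/// u64 length prefix, the way a reader expecting a sequence would.
fn read_len_prefix(bytes: &[u8]) -> Option<u64> {
    let mut head = [0u8; 8];
    head.copy_from_slice(bytes.get(..8)?);
    Some(u64::from_le_bytes(head))
}

fn main() {
    // A writer using an "old" layout emits two plain u32 fields ...
    let mut bytes = Vec::new();
    bytes.extend_from_slice(&7u32.to_le_bytes());
    bytes.extend_from_slice(&0xFFFF_FFFFu32.to_le_bytes());

    // ... while a reader using a "new" layout expects a sequence here and
    // reads the same 8 bytes as an element count.
    let len = read_len_prefix(&bytes).unwrap();
    println!("reader would try to allocate space for {len} elements");
}
```

A reader that trusts such a count may attempt a huge allocation before it ever notices that the data is malformed.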

The magic hash added detects this version mismatch and reports it.
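
The check can be sketched in isolation as follows. This is a simplified stand-in and not the code from this commit: `MAGIC`, `write_with_magic` and `read_with_magic` are hypothetical names, the payload is raw bytes rather than a serialized `ProgramDescription`, and errors use `std::io::Error` instead of `anyhow`.

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical stand-in for the magic number generated at build time.
const MAGIC: u64 = 0x5EED_CAFE_0000_0001;

/// Writes the 8-byte little-endian magic, then the payload.
fn write_with_magic<W: Write>(mut out: W, magic: u64, payload: &[u8]) -> io::Result<()> {
    out.write_all(&magic.to_le_bytes())?;
    out.write_all(payload)
}

/// Reads and verifies the magic before touching the payload.
fn read_with_magic<R: Read>(mut input: R, expected: u64) -> io::Result<Vec<u8>> {
    let mut buf = [0u8; 8];
    input.read_exact(&mut buf)?;
    let found = u64::from_le_bytes(buf);
    if found != expected {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("magic number mismatch: expected {expected:x}, got {found:x}"),
        ));
    }
    let mut payload = Vec::new();
    input.read_to_end(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    let mut file = Vec::new();
    write_with_magic(&mut file, MAGIC, b"payload")?;

    // Same magic on both sides: the payload is returned unchanged.
    let payload = read_with_magic(Cursor::new(&file), MAGIC)?;
    assert_eq!(payload, b"payload");

    // A reader built with a different magic rejects the file up front.
    assert!(read_with_magic(Cursor::new(&file), MAGIC + 1).is_err());
    Ok(())
}
```

Verifying the header first means a mismatched artifact is rejected before any of the payload is handed to the deserializer.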

## Checklist

- [x] Above description has been filled out so that upon squash merge we
  have a good record of what changed.
- [x] New functions, methods, types are documented. Old documentation is
  updated if necessary
- [ ] Documentation in Notion has been updated
- [ ] Tests for new behaviors are provided
  - [ ] New test suites (if any) have been added to the CI tests (in
    `.github/workflows/rust.yml`) either as compiler test or integration
    test. *Or* justification for their omission from CI has been provided
    in this PR description.
JustusAdam authored Dec 18, 2024
1 parent c52fc5c commit 52a64bd
Showing 3 changed files with 97 additions and 2 deletions.
3 changes: 3 additions & 0 deletions crates/paralegal-spdg/Cargo.toml
@@ -30,3 +30,6 @@ dot = { git = "https://github.com/JustusAdam/dot-rust", rev = "ff2b42ceda98c639c
 serde_json = { version = "1" }
 bincode = { version = "1.1.3", optional = true }
 anyhow = { workspace = true }
+
+[build-dependencies]
+anyhow = "1"
70 changes: 70 additions & 0 deletions crates/paralegal-spdg/build.rs
@@ -0,0 +1,70 @@
+use anyhow::Context;
+use anyhow::Result;
+use std::collections::hash_map::DefaultHasher;
+use std::env;
+use std::fs;
+use std::hash::{Hash, Hasher};
+use std::path::{Path, PathBuf};
+use std::time::SystemTime;
+
+fn rustup_toolchain_path() -> PathBuf {
+    let rustup_home = env::var("RUSTUP_HOME").unwrap();
+    let rustup_tc = env::var("RUSTUP_TOOLCHAIN").unwrap();
+    [&rustup_home, "toolchains", &rustup_tc]
+        .into_iter()
+        .collect()
+}
+
+fn get_rustup_lib_path() -> PathBuf {
+    let mut rustup_lib = rustup_toolchain_path();
+    rustup_lib.push("lib");
+    rustup_lib
+}
+
+/// Helper for calculating a hash of all (modification dates of) Rust files in this crate.
+fn visit_dirs(dir: &Path, hasher: &mut DefaultHasher) -> Result<()> {
+    if !dir.is_dir() {
+        return Ok(());
+    }
+    for entry in fs::read_dir(dir)? {
+        let entry = entry?;
+        let path = entry.path();
+        if path.is_dir() {
+            visit_dirs(&path, hasher)?;
+        } else if path.extension().map_or(false, |ext| ext == "rs") {
+            let metadata = entry.metadata()?;
+            let modified = metadata.modified()?;
+
+            let duration = modified.duration_since(SystemTime::UNIX_EPOCH)?;
+            duration.as_secs().hash(hasher);
+            // Tell Cargo to rerun if this source file changes
+            println!("cargo:rerun-if-changed={}", path.display());
+        }
+    }
+    Ok(())
+}
+
+/// Calculate a hash of all (modification dates of) Rust files in this crate.
+fn calculate_source_hash() -> u64 {
+    let mut hasher = DefaultHasher::new();
+
+    // Start from the src directory
+    visit_dirs(Path::new("src"), &mut hasher)
+        .with_context(|| "Calculating source hash")
+        .unwrap();
+
+    hasher.finish()
+}
+
+fn main() {
+    let magic = calculate_source_hash();
+
+    // Emit the hash as an environment variable
+    println!("cargo:rustc-env=SER_MAGIC={:0}", magic);
+
+    // Original linux-specific code
+    if cfg!(target_os = "linux") {
+        let rustup_lib = get_rustup_lib_path();
+        println!("cargo:rustc-link-search=native={}", rustup_lib.display());
+    }
+}
26 changes: 24 additions & 2 deletions crates/paralegal-spdg/src/ser.rs
@@ -3,7 +3,11 @@
 //!
 use anyhow::{Context, Ok, Result};
 use cfg_if::cfg_if;
-use std::{fs::File, path::Path};
+use std::{
+    fs::File,
+    io::{Read, Write},
+    path::Path,
+};

 use crate::ProgramDescription;

@@ -15,11 +19,20 @@ cfg_if! {
     }
 }

+/// A magic hash number used to verify version compatibility of output
+/// artifacts. Used in reading and writing the [`ProgramDescription`]. See
+/// build.rs for how this number is created.
+fn ser_magic() -> u64 {
+    const SER_MAGIC: &str = env!("SER_MAGIC");
+    SER_MAGIC.parse().unwrap()
+}
+
 impl ProgramDescription {
     /// Write `self` using the configured serialization format
     pub fn canonical_write(&self, path: impl AsRef<Path>) -> Result<()> {
         let path = path.as_ref();
         let mut out_file = File::create(path)?;
+        out_file.write_all(&ser_magic().to_le_bytes())?;
         cfg_if! {
             if #[cfg(feature = "binenc")] {
                 let write = bincode::serialize_into(
@@ -47,7 +60,16 @@ impl ProgramDescription {
     /// Read `self` using the configured serialization format
     pub fn canonical_read(path: impl AsRef<Path>) -> Result<Self> {
         let path = path.as_ref();
-        let in_file = File::open(path)?;
+        let mut in_file = File::open(path)?;
+        let magic = {
+            let mut buf = [0u8; 8];
+            in_file.read_exact(&mut buf).context("Reading magic")?;
+            u64::from_le_bytes(buf)
+        };
+        let ser_magic = ser_magic();
+        if magic != ser_magic {
+            anyhow::bail!("Magic number mismatch: Expected {ser_magic:x}, got {magic:x}. Likely this application was compiled against a different version of the paralegal-spdg library than used by the flow analyzer.");
+        }
         cfg_if! {
             if #[cfg(feature = "binenc")] {
                 let read = bincode::deserialize_from(