diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 3f620c18afa7..cc0039d851b7 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -222,6 +222,9 @@
   - [Mutable Static Variables](unsafe-rust/mutable-static.md)
   - [Unions](unsafe-rust/unions.md)
   - [Unsafe Functions](unsafe-rust/unsafe-functions.md)
+    - [Unsafe Rust Functions](unsafe-rust/unsafe-functions/rust.md)
+    - [Unsafe External Functions](unsafe-rust/unsafe-functions/extern-c.md)
+    - [Calling Unsafe Functions](unsafe-rust/unsafe-functions/calling.md)
   - [Unsafe Traits](unsafe-rust/unsafe-traits.md)
   - [Exercise: FFI Wrapper](unsafe-rust/exercise.md)
   - [Solution](unsafe-rust/solution.md)
diff --git a/src/unsafe-rust/unsafe-functions.md b/src/unsafe-rust/unsafe-functions.md
index 95bfa52607a8..1025cee284b5 100644
--- a/src/unsafe-rust/unsafe-functions.md
+++ b/src/unsafe-rust/unsafe-functions.md
@@ -1,100 +1,19 @@
 ---
-minutes: 5
+minutes: 15
 ---

 # Unsafe Functions

-## Calling Unsafe Functions
-
 A function or method can be marked `unsafe` if it has extra preconditions you
-must uphold to avoid undefined behaviour:
-
-```rust,editable
-extern "C" {
-    fn abs(input: i32) -> i32;
-}
-
-fn main() {
-    let emojis = "🗻∈🌏";
-
-    // SAFETY: The indices are in the correct order, within the bounds of the
-    // string slice, and lie on UTF-8 sequence boundaries.
-    unsafe {
-        println!("emoji: {}", emojis.get_unchecked(0..4));
-        println!("emoji: {}", emojis.get_unchecked(4..7));
-        println!("emoji: {}", emojis.get_unchecked(7..11));
-    }
-
-    println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..7) }));
-
-    // SAFETY: `abs` doesn't deal with pointers and doesn't have any safety
-    // requirements.
-    unsafe {
-        println!("Absolute value of -3 according to C: {}", abs(-3));
-    }
-
-    // Not upholding the UTF-8 encoding requirement breaks memory safety!
-    // println!("emoji: {}", unsafe { emojis.get_unchecked(0..3) });
-    // println!("char count: {}", count_chars(unsafe {
-    // emojis.get_unchecked(0..3) }));
-}
-
-fn count_chars(s: &str) -> usize {
-    s.chars().count()
-}
-```
-
-## Writing Unsafe Functions
+must uphold to avoid undefined behaviour.

-You can mark your own functions as `unsafe` if they require particular
-conditions to avoid undefined behaviour.
+There are two main categories:

-```rust,editable
-/// Swaps the values pointed to by the given pointers.
-///
-/// # Safety
-///
-/// The pointers must be valid and properly aligned.
-unsafe fn swap(a: *mut u8, b: *mut u8) {
-    let temp = *a;
-    *a = *b;
-    *b = temp;
-}
-
-fn main() {
-    let mut a = 42;
-    let mut b = 66;
-
-    // SAFETY: ...
-    unsafe {
-        swap(&mut a, &mut b);
-    }
-
-    println!("a = {}, b = {}", a, b);
-}
-```
+- Rust functions declared unsafe with `unsafe fn`.
+- Foreign functions in `extern "C"` blocks.
-## Calling Unsafe Functions
-
-`get_unchecked`, like most `_unchecked` functions, is unsafe, because it can
-create UB if the range is incorrect. `abs` is unsafe for a different reason: it
-is an external function (FFI). Calling external functions is usually only a
-problem when those functions do things with pointers which might violate Rust's
-memory model, but in general any C function might have undefined behaviour under
-any arbitrary circumstances.
-
-The `"C"` in this example is the ABI;
-[other ABIs are available too](https://doc.rust-lang.org/reference/items/external-blocks.html).
-
-## Writing Unsafe Functions
-
-We wouldn't actually use pointers for a `swap` function - it can be done safely
-with references.
-
-Note that unsafe code is allowed within an unsafe function without an `unsafe`
-block. We can prohibit this with `#[deny(unsafe_op_in_unsafe_fn)]`. Try adding
-it and see what happens. This will likely change in a future Rust edition.
+We will look at the two kinds of unsafe functions next.
diff --git a/src/unsafe-rust/unsafe-functions/calling.md b/src/unsafe-rust/unsafe-functions/calling.md
new file mode 100644
index 000000000000..a1fd2b5aa635
--- /dev/null
+++ b/src/unsafe-rust/unsafe-functions/calling.md
@@ -0,0 +1,48 @@
+# Calling Unsafe Functions
+
+Failing to uphold the safety requirements breaks memory safety!
+
+```rust,editable
+#[derive(Debug)]
+#[repr(C)]
+struct KeyPair {
+    pk: [u16; 4], // 8 bytes
+    sk: [u16; 4], // 8 bytes
+}
+
+const PK_BYTE_LEN: usize = 8;
+
+fn log_public_key(pk_ptr: *const u16) {
+    let pk: &[u16] = unsafe { std::slice::from_raw_parts(pk_ptr, PK_BYTE_LEN) };
+    println!("{pk:?}");
+}
+
+fn main() {
+    let key_pair = KeyPair { pk: [1, 2, 3, 4], sk: [0, 0, 42, 0] };
+    log_public_key(key_pair.pk.as_ptr());
+}
+```
+
+Always include a safety comment for each `unsafe` block. It must explain why the
+code is actually safe. This example is missing a safety comment and is unsound.
+
+
+
+Key points:
+
+- The second argument to `slice::from_raw_parts` is the number of _elements_,
+  not bytes! This example demonstrates unexpected behaviour by reading past
+  the end of one array and into another.
+- This is not actually undefined behaviour, as `KeyPair` has a defined
+  representation (due to `repr(C)`) and no padding, so the contents of the
+  second array are also valid to read through the same pointer.
+- `log_public_key` should be unsafe, because `pk_ptr` must meet certain
+  prerequisites to avoid undefined behaviour. A safe function which can cause
+  undefined behaviour is said to be `unsound`. What should its safety
+  documentation say?
+- The standard library contains many low-level unsafe functions. Prefer the safe
+  alternatives when possible!
+- If you use an unsafe function as an optimization, make sure to add a benchmark
+  to demonstrate the gain.
+
+
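Review sketch, not part of the patch: one possible answer to the "What should its safety documentation say?" question in the notes above. It assumes the fix is to mark the function `unsafe`, count elements instead of bytes, and spell out the pointer requirements. The names `PK_LEN` and `copy_public_key` are hypothetical and do not appear in the patch.

```rust
#[derive(Debug)]
#[repr(C)]
struct KeyPair {
    pk: [u16; 4], // 8 bytes
    #[allow(dead_code)] // The secret key is intentionally never read here.
    sk: [u16; 4], // 8 bytes
}

/// Number of `u16` elements (not bytes) in a public key.
/// Hypothetical replacement for `PK_BYTE_LEN`.
const PK_LEN: usize = 4;

/// Copies the public key that `pk_ptr` points to.
///
/// # Safety
///
/// `pk_ptr` must be non-null, aligned for `u16`, and point to at least
/// `PK_LEN` initialized `u16` values that are not mutated during the call.
unsafe fn copy_public_key(pk_ptr: *const u16) -> [u16; PK_LEN] {
    // SAFETY: The caller upholds the requirements listed in the docs above.
    let pk: &[u16] = unsafe { std::slice::from_raw_parts(pk_ptr, PK_LEN) };
    let mut out = [0u16; PK_LEN];
    out.copy_from_slice(pk);
    out
}

fn main() {
    let key_pair = KeyPair { pk: [1, 2, 3, 4], sk: [0, 0, 42, 0] };

    // SAFETY: The pointer comes from a live `[u16; 4]`, so it is non-null,
    // aligned, and valid for `PK_LEN` elements while `key_pair` is in scope.
    let pk = unsafe { copy_public_key(key_pair.pk.as_ptr()) };
    println!("{pk:?}");
}
```

With the element count corrected, the read stays inside `pk` and never reaches `sk`. Passing `&[u16; 4]` instead of a raw pointer would remove the need for `unsafe` altogether.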
diff --git a/src/unsafe-rust/unsafe-functions/extern-c.md b/src/unsafe-rust/unsafe-functions/extern-c.md
new file mode 100644
index 000000000000..36b145a33369
--- /dev/null
+++ b/src/unsafe-rust/unsafe-functions/extern-c.md
@@ -0,0 +1,44 @@
+# Unsafe External Functions
+
+Functions in a foreign language may also be unsafe:
+
+```rust,editable
+use std::ffi::c_char;
+
+unsafe extern "C" {
+    // `abs` doesn't deal with pointers and doesn't have any safety requirements.
+    safe fn abs(input: i32) -> i32;
+
+    /// # Safety
+    ///
+    /// `s` must be a pointer to a NUL-terminated C string which is valid and
+    /// not modified for the duration of this function call.
+    unsafe fn strlen(s: *const c_char) -> usize;
+}
+
+fn main() {
+    println!("Absolute value of -3 according to C: {}", abs(-3));
+
+    unsafe {
+        // SAFETY: We pass a pointer to a C string literal which is valid for
+        // the duration of the program.
+        println!("String length: {}", strlen(c"String".as_ptr()));
+    }
+}
+```
+
+
+
+- Rust used to consider all extern functions unsafe, but this changed in Rust
+  1.82 with `unsafe extern` blocks.
+- `abs` must be explicitly marked as `safe` because it is an external function
+  (FFI). Calling external functions is usually only a problem when those
+  functions do things with pointers which might violate Rust's memory model,
+  but in general any C function might have undefined behaviour under any
+  arbitrary circumstances.
+- The `"C"` in this example is the ABI;
+  [other ABIs are available too](https://doc.rust-lang.org/reference/items/external-blocks.html).
+- Note that there is no verification that the Rust function signature matches
+  that of the function definition -- that's up to you!
+
+
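Review sketch, not part of the patch: the notes above stress that the caller is responsible for the preconditions of an external function. One common way to contain that responsibility is a small safe wrapper whose argument type already guarantees the precondition. The wrapper name `c_string_len` is hypothetical. The sketch assumes Rust 1.82 or later (for `unsafe extern`) and a target whose C library provides `strlen`.

```rust
use std::ffi::{c_char, CStr};

unsafe extern "C" {
    /// # Safety
    ///
    /// `s` must be a pointer to a NUL-terminated C string which is valid and
    /// not modified for the duration of this function call.
    unsafe fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper around `strlen`.
///
/// A `&CStr` always refers to a valid, NUL-terminated byte sequence, so the
/// wrapper has no preconditions left for its callers to uphold.
fn c_string_len(s: &CStr) -> usize {
    // SAFETY: `s.as_ptr()` points to a NUL-terminated string that stays valid
    // and unmodified while the shared borrow `s` is alive.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"String\0")
        .expect("literal contains exactly one trailing NUL");
    println!("String length: {}", c_string_len(s));
}
```

The wrapper does not address the signature-mismatch point: the declaration of `strlen` still has to match the C definition, and the compiler cannot check that.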
diff --git a/src/unsafe-rust/unsafe-functions/rust.md b/src/unsafe-rust/unsafe-functions/rust.md
new file mode 100644
index 000000000000..7d6bb01bbccc
--- /dev/null
+++ b/src/unsafe-rust/unsafe-functions/rust.md
@@ -0,0 +1,43 @@
+# Unsafe Rust Functions
+
+You can mark your own functions as `unsafe` if they require particular
+preconditions to avoid undefined behaviour.
+
+```rust,editable
+/// Swaps the values pointed to by the given pointers.
+///
+/// # Safety
+///
+/// The pointers must be valid, properly aligned, and not otherwise accessed for
+/// the duration of the function call.
+unsafe fn swap(a: *mut u8, b: *mut u8) {
+    let temp = *a;
+    *a = *b;
+    *b = temp;
+}
+
+fn main() {
+    let mut a = 42;
+    let mut b = 66;
+
+    // SAFETY: The pointers must be valid, aligned and unique because they came
+    // from references.
+    unsafe {
+        swap(&mut a, &mut b);
+    }
+
+    println!("a = {}, b = {}", a, b);
+}
+```
+
+
+
+We wouldn't actually use pointers for a `swap` function --- it can be done
+safely with references.
+
+Note that unsafe code is allowed within an unsafe function without an `unsafe`
+block. We can prohibit this with `#[deny(unsafe_op_in_unsafe_fn)]`. Try adding
+it and see what happens. This will
+[change in the 2024 Rust edition](https://github.com/rust-lang/rust/issues/120535).
+
+
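Review sketch, not part of the patch: roughly what the `swap` example could look like once the "try adding it" step from the notes is carried out. It assumes the lint is applied as an attribute on the function itself. With the lint denied, the pointer dereferences need their own `unsafe` block and safety comment. The final call to `std::mem::swap` illustrates the remark that references make the raw-pointer version unnecessary.

```rust
/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid, properly aligned, and not otherwise accessed for
/// the duration of the function call.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn swap(a: *mut u8, b: *mut u8) {
    // SAFETY: The caller guarantees that both pointers are valid, aligned,
    // and not otherwise accessed during this call.
    unsafe {
        let temp = *a;
        *a = *b;
        *b = temp;
    }
}

fn main() {
    let mut a: u8 = 42;
    let mut b: u8 = 66;

    // SAFETY: Both pointers come from distinct mutable references, so they
    // are valid, aligned, and unique for the duration of the call.
    unsafe {
        swap(&mut a, &mut b);
    }
    println!("a = {a}, b = {b}");

    // The safe alternative from the standard library needs no `unsafe` at all.
    std::mem::swap(&mut a, &mut b);
    println!("a = {a}, b = {b}");
}
```

Keeping the inner `unsafe` block narrow makes it clear which operations rely on the caller's guarantees and which are ordinary safe code.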