From 50f250d4e30acfb200649cc5907931630b7ac7c8 Mon Sep 17 00:00:00 2001
From: Connor Horman
Date: Tue, 10 Dec 2024 11:36:48 -0500
Subject: [PATCH] Change definition of memory.encoding in response to PR comments

---
 src/memory-model.md | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/memory-model.md b/src/memory-model.md
index adac052fe..4f7c9f1e1 100644
--- a/src/memory-model.md
+++ b/src/memory-model.md
@@ -15,31 +15,52 @@ The most basic unit of memory in Rust is a byte. All values in Rust are computed
 > While bytes in Rust are typically lowered to hardware bytes, they may contain additional values,
 > such as being uninitialized, or storing part of a pointer.
 
+r[memory.byte.contents]
+Each byte may have one of the following values:
+
 r[memory.byte.init]
-Each byte may be initialized, and contain a value of type `u8`, as well as an optional pointer fragment. When present, the pointer fragment carries [provenance][type.pointer.provenance] information.
+* An initialized byte containing a `u8` value and optional [provenance][type.pointer.provenance],
 
 r[memory.byte.uninit]
-Each byte may be uninitialized.
+* An uninitialized byte.
 
 > [!NOTE]
 > Uninitialized bytes do not have a value and do not have a pointer fragment.
 
+> [!NOTE]
+> The above list is not yet guaranteed to be exhaustive.
+
 ## Value Encoding
 
 r[memory.encoding]
 
 r[memory.encoding.intro]
-Each type in Rust has 0 or more values, which can have operations performed on them
+Each type in Rust has 0 or more values, which can have operations performed on them. Values are represented in memory by encoding them
 
 > [!NOTE]
 > `0u8`, `1337i16`, and `Foo{bar: "baz"}` are all values
 
 r[memory.encoding.op]
-Each value of a type can be encoded into a sequence of bytes, and decoded from a sequence of bytes, which has a length equal to the size of the type.
-The operation to encode or decode a value is determined by the representation of the type.
+Each type defines a pair of properties which, together, define the representation of values of the type. The *encode* operation takes a value of the type and converts it into a sequence of bytes equal in length to the size of the type, and the *decode* operation takes such a sequence of bytes and optionally converts it into a value. Encoding occurs when a value is written to memory, and decoding occurs when a value is read from memory.
+
+> [!NOTE]
+> Only certain byte sequences may decode into a value of a given type. For example, a byte sequence consisting of all zeroes does not decode to a value of a reference type.
+
+r[memory.encoding.representation]
+A sequence of bytes is said to represent a value of a type, if the decode operation for that type produces that value from that sequence of bytes. The representation of a type is the partial relation between byte sequences and values those sequences represent.
 
 > [!NOTE]
 > Representation is related to, but is not the same property as, the layout of the type.
 
+r[memory.encoding.symmetric]
+The result of encoding a given value of a type is a sequence of bytes that represents that value.
+
+> [!NOTE]
+> This means that a value can be copied into memory and copied out and the result is the same value.
+> The reverse is not necessarily true, a sequence of bytes read as a value then written to another location (called a typed copy) will not necessarily yield the same sequence of bytes. For example, a typed copy of a struct type will leave the padding bytes of that struct uninitialized.
+
 r[memory.encoding.decode]
-If a value of type `T` is decoded from a sequence of bytes that does not correspond to a defined value, the behavior is undefined. If a value of type `T` is decoded from a sequence of bytes that contain pointer fragments, which are not used to represent the value, the pointer fragments are ignored.
+If a value of type `T` is decoded from a sequence of bytes that does not represent any value, the behavior is undefined.
+
+> [!NOTE]
+> For example, it is undefined behavior to read a `0x02` byte as `bool`.
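
Illustrative note (not part of the patch): the encode/decode wording above can be sketched in safe Rust. The functions `encode_bool`, `decode_bool`, and `roundtrip_u32` are hypothetical names introduced only for this sketch. The model treats every byte as an initialized `u8` and ignores uninitialized bytes and provenance, so it is a simplification of the proposed text rather than a definition of it. In real Rust, reading a non-representing byte sequence as a `bool` is undefined behavior; the sketch returns `None` instead so that it stays safe to run.

```rust
/// Hypothetical model of the *encode* operation for `bool`.
/// The result has length 1, matching the size of `bool`.
pub fn encode_bool(value: bool) -> [u8; 1] {
    // `false` encodes as 0x00 and `true` as 0x01.
    [value as u8]
}

/// Hypothetical model of the *decode* operation for `bool`.
/// Decoding is partial: only 0x00 and 0x01 represent a value.
/// Assumption: `None` stands in for "represents no value", which
/// would be undefined behavior in an actual typed read.
pub fn decode_bool(bytes: [u8; 1]) -> Option<bool> {
    match bytes[0] {
        0x00 => Some(false),
        0x01 => Some(true),
        _ => None,
    }
}

/// Encoding a value and decoding the result yields the same value.
/// For `u32`, every 4-byte sequence represents some value, so the
/// standard library's decode (`from_ne_bytes`) is total.
pub fn roundtrip_u32(value: u32) -> u32 {
    u32::from_ne_bytes(value.to_ne_bytes())
}
```

Under these assumptions, `decode_bool(encode_bool(v))` is `Some(v)` for both values of `v`, matching `memory.encoding.symmetric`, while `decode_bool([0x02])` is `None`, mirroring the `0x02` example in the final note.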