NOTE: This specification is primarily defined in the context of Rust, but aims to be implementable across different programming languages.
- Variant: A specific constructor or case of an enum type.
- Variant Payload: The associated data of a specific enum variant.
- Discriminant: A unique identifier for an enum variant, typically represented as an integer.
- Basic Types: Primitive types that have a direct, well-defined binary representation.
By default, this serialization format uses little-endian byte order for basic numeric types. This means multi-byte values are encoded with their least significant byte first.
Endianness can be configured with the following methods, allowing for big-endian serialization when required:
- Multi-byte values (integers, floats) are affected by endianness
- Single-byte values (u8, i8) are not affected
- Struct and collection serialization order is not changed by endianness
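The effect of byte order on a single value and on a small composite can be sketched with the standard library alone. This is an illustration only: encode_u32, encode_pair and the big_endian flag are hypothetical names standing in for the configured endianness, not part of bincode's API.

```rust
/// Encodes a u32 as 4 bytes in the requested byte order.
/// `big_endian` is a hypothetical flag standing in for the configured endianness.
fn encode_u32(value: u32, big_endian: bool) -> [u8; 4] {
    if big_endian {
        value.to_be_bytes()
    } else {
        value.to_le_bytes()
    }
}

/// Encodes a (u8, u32) pair. The fields stay in declaration order;
/// only the bytes inside the multi-byte field are reordered.
fn encode_pair(pair: (u8, u32), big_endian: bool) -> Vec<u8> {
    let mut out = vec![pair.0]; // single byte: unaffected by endianness
    out.extend_from_slice(&encode_u32(pair.1, big_endian));
    out
}
```

For the value 0x12345678, the little-endian form is 78 56 34 12 and the big-endian form is 12 34 56 78; in both cases the leading u8 of the pair keeps its position.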
- Encoded as a single byte
- false is represented by 0
- true is represented by 1
- During deserialization, values other than 0 and 1 will result in a DecodeError::InvalidBooleanValue error
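A minimal sketch of these boolean rules, written for readers porting the format to another codebase. BoolError is a hypothetical stand-in for the DecodeError::InvalidBooleanValue error named above, and the function names are assumptions of this example.

```rust
/// Hypothetical error type for this sketch; it mirrors the role of
/// DecodeError::InvalidBooleanValue but is not bincode's own type.
#[derive(Debug, PartialEq)]
enum BoolError {
    InvalidBooleanValue(u8),
}

/// false becomes 0, true becomes 1.
fn encode_bool(value: bool) -> u8 {
    if value { 1 } else { 0 }
}

/// Only 0 and 1 are accepted; every other byte is rejected.
fn decode_bool(byte: u8) -> Result<bool, BoolError> {
    match byte {
        0 => Ok(false),
        1 => Ok(true),
        other => Err(BoolError::InvalidBooleanValue(other)),
    }
}
```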
- Encoded based on the configured IntEncoding
- Signed integers use two's complement representation
- Floating point types use the IEEE 754-2008 standard
  - f32: 4 bytes (binary32)
  - f64: 8 bytes (binary64)
- Subnormal numbers are preserved
  - Also known as denormalized numbers
  - Maintain their exact bit representation
- NaN values are preserved
  - Both quiet and signaling NaN are kept as-is
  - Bit pattern of NaN is maintained exactly
- No normalization or transformation of special values occurs
- Serialization and deserialization do not alter the bit-level representation
- Consistent with IEEE 754-2008 standard for floating-point arithmetic
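The bit-preservation rule can be illustrated with the standard library's byte conversions. This sketch assumes the default little-endian order; encode_f32 and decode_f32 are hypothetical helper names, not bincode functions.

```rust
/// Writes the IEEE 754 binary32 bit pattern as 4 little-endian bytes.
/// No normalization is applied, so NaN payloads and subnormals pass through.
fn encode_f32(value: f32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Rebuilds the float from exactly the bytes that were written.
fn decode_f32(bytes: [u8; 4]) -> f32 {
    f32::from_le_bytes(bytes)
}
```

Comparing the result of to_bits() before and after a round trip is a convenient way to check that a NaN payload or a subnormal value survived unchanged, since NaN never compares equal to itself.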
- char is encoded as a 32-bit unsigned integer representing its Unicode Scalar Value
- Valid Unicode Scalar Value range:
  - 0x0000 to 0xD7FF (Basic Multilingual Plane, below the surrogates)
  - 0xE000 to 0x10FFFF (rest of the Basic Multilingual Plane and the supplementary planes)
- Surrogate code points (0xD800 to 0xDFFF) are not valid
- Invalid Unicode characters can be acquired via unsafe code. They are handled as follows:
  - during serialization: data is written as-is
  - during deserialization: a DecodeError::InvalidCharEncoding error is raised
- No additional metadata or encoding scheme beyond the raw code point value
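The validity check on decode can be sketched as follows, assuming the code point is stored as a fixed-width little-endian u32 as described above. CharError is a hypothetical stand-in for DecodeError::InvalidCharEncoding, and the helper names are assumptions of this example.

```rust
/// Hypothetical error type for this sketch; it plays the role of
/// DecodeError::InvalidCharEncoding but is not bincode's own type.
#[derive(Debug, PartialEq)]
enum CharError {
    InvalidCharEncoding(u32),
}

/// Writes the raw code point as a little-endian u32.
fn encode_char(c: char) -> [u8; 4] {
    (c as u32).to_le_bytes()
}

/// char::from_u32 returns None for surrogates (0xD800..=0xDFFF)
/// and for values above 0x10FFFF, which maps onto the error case.
fn decode_char(bytes: [u8; 4]) -> Result<char, CharError> {
    let code = u32::from_le_bytes(bytes);
    char::from_u32(code).ok_or(CharError::InvalidCharEncoding(code))
}
```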
Tuples are encoded with no additional bytes, and their fields are encoded in the specified order, e.g.
let tuple = (u32::MIN, i32::MAX); // 8 bytes
let encoded = bincode::encode_to_vec(tuple, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
0, 0, 0, 0, // 4 bytes for first type: u32
255, 255, 255, 127 // 4 bytes for second type: i32
]);
Bincode currently supports 2 different types of IntEncoding. With the default config, VarintEncoding is selected.
Encoding an unsigned integer u (of any type except u8/i8, which are always written as a single byte) works as follows:
- If u < 251, encode it as a single byte with that value.
- If 251 <= u < 2**16, encode it as a literal byte 251, followed by a u16 with value u.
- If 2**16 <= u < 2**32, encode it as a literal byte 252, followed by a u32 with value u.
- If 2**32 <= u < 2**64, encode it as a literal byte 253, followed by a u64 with value u.
- If 2**64 <= u < 2**128, encode it as a literal byte 254, followed by a u128 with value u.
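The size thresholds above can be expressed as a short function. This is a sketch under two assumptions: the trailing integer is written in little-endian order (the default), and encode_varint is a hypothetical name, not bincode's implementation.

```rust
/// Encodes an unsigned integer with the marker-byte scheme described above.
/// Assumes little-endian byte order for the trailing fixed-width integer.
fn encode_varint(u: u128) -> Vec<u8> {
    let mut out = Vec::new();
    if u < 251 {
        // Small values are their own encoding.
        out.push(u as u8);
    } else if u < (1u128 << 16) {
        out.push(251);
        out.extend_from_slice(&(u as u16).to_le_bytes());
    } else if u < (1u128 << 32) {
        out.push(252);
        out.extend_from_slice(&(u as u32).to_le_bytes());
    } else if u < (1u128 << 64) {
        out.push(253);
        out.extend_from_slice(&(u as u64).to_le_bytes());
    } else {
        out.push(254);
        out.extend_from_slice(&u.to_le_bytes());
    }
    out
}
```

With this scheme 250 occupies one byte, 251 occupies three bytes (the marker 251 plus a u16), and 65536 occupies five bytes (the marker 252 plus a u32).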
usize is encoded/decoded as a u64 and isize is encoded/decoded as an i64.
See the documentation of VarintEncoding for more information.
- Fixed size integers are encoded directly
- Enum discriminants are encoded as u32
- Lengths and usize are encoded as u64
See the documentation of FixintEncoding for more information.
Enums are encoded with their variant index first, optionally followed by the variant fields. The variant index is encoded according to the configured IntEncoding.
Both named and unnamed fields are serialized with their values only, and therefore encode to the same value.
#[derive(bincode::Encode)]
pub enum SomeEnum {
A,
B(u32),
C { value: u32 },
}
// SomeEnum::A
let encoded = bincode::encode_to_vec(SomeEnum::A, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
0, 0, 0, 0, // first variant, A
// no extra bytes because A has no fields
]);
// SomeEnum::B(0)
let encoded = bincode::encode_to_vec(SomeEnum::B(0), bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
1, 0, 0, 0, // second variant, B
0, 0, 0, 0 // B has 1 unnamed field, which is a u32, so 4 bytes
]);
// SomeEnum::C { value: 0u32 }
let encoded = bincode::encode_to_vec(SomeEnum::C { value: 0u32 }, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
2, 0, 0, 0, // third variant, C
0, 0, 0, 0 // C has 1 named field which is a u32, so 4 bytes
]);
Option<T> is always serialized using a single byte for the discriminant, even in Fixint encoding (which normally uses a u32 for the discriminant).
let data: Option<u32> = Some(123);
let encoded = bincode::encode_to_vec(data, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
1, 123, 0, 0, 0 // the Some(..) tag is the leading 1
]);
let data: Option<u32> = None;
let encoded = bincode::encode_to_vec(data, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
0 // the None tag is simply 0
]);
Collections are encoded with their length value first, followed by each entry of the collection. The length value is based on the configured IntEncoding.
- Length is always serialized first
- Entries are serialized in the order they are returned from the iterator implementation.
  - Iteration order depends on the collection type
    - Ordered collections (e.g., Vec): Iteration from lowest to highest index
    - Unordered collections (e.g., HashMap): Implementation-defined iteration order
- Duplicate keys are not checked in bincode, but may result in an error when decoding a container from a list of pairs.
- Serialized by iterating from lowest to highest index
- Length prefixed
- Each item serialized sequentially
let list = vec![0u8, 1u8, 2u8];
let encoded = bincode::encode_to_vec(list, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
3, 0, 0, 0, 0, 0, 0, 0, // length of 3u64
0, // entry 0
1, // entry 1
2, // entry 2
]);
- Serialized as a sequence of key-value pairs
- Iteration order is implementation-defined
- Each entry is a tuple of (key, value)
- Bincode will serialize the entries based on the iterator order.
- Deserialization is deterministic but the collection implementation might not guarantee the same order as serialization.
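The map layout (length, then key-value pairs in iterator order) can be sketched by hand. The example assumes a fixed-width little-endian u64 length, as in the legacy configuration used elsewhere in this document, and uses a BTreeMap so the iteration order is predictable; encode_map is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

/// Writes the entry count as a little-endian u64, then each (key, value)
/// pair in the order the map's iterator yields them.
fn encode_map(map: &BTreeMap<u8, u16>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(map.len() as u64).to_le_bytes());
    for (key, value) in map {
        out.push(*key); // key: a single byte
        out.extend_from_slice(&value.to_le_bytes()); // value: 2 bytes
    }
    out
}
```

A BTreeMap iterates in key order, so the same contents always produce the same bytes. With a HashMap the byte layout follows whatever order the iterator happens to produce, which is why two maps with equal contents may serialize differently.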
Note: Fixed-length arrays do not have their length encoded. See Arrays for details.
- Strings are encoded as UTF-8 byte sequences
- No null terminator is added
- No Byte Order Mark (BOM) is written
- Unicode non-characters are preserved
- Length is encoded first using the configured IntEncoding
- Raw UTF-8 bytes follow the length
- Supports the full range of valid UTF-8 sequences
- U+0000 and other code points can appear freely within the string
- During serialization, the string is encoded as a sequence of the given bytes.
- Rust strings are UTF-8 encoded by default, but this is not enforced by bincode
- No normalization or transformation of text
- If an invalid UTF-8 sequence is encountered during decoding, a DecodeError::Utf8 error is raised
let str = "Hello 🌍"; // Mixed ASCII and Unicode
let encoded = bincode::encode_to_vec(str, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
10, 0, 0, 0, 0, 0, 0, 0, // length of the string, 10 bytes
b'H', b'e', b'l', b'l', b'o', b' ', 0xF0, 0x9F, 0x8C, 0x8D // UTF-8 encoded string
]);
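The decoding side of the string rules can be sketched with the standard library. The example assumes a fixed-width little-endian u64 length prefix, matching the legacy configuration used in the example above. StrError and decode_string are hypothetical names; the Utf8 variant plays the role of DecodeError::Utf8.

```rust
/// Hypothetical error type for this sketch, not bincode's DecodeError.
#[derive(Debug, PartialEq)]
enum StrError {
    UnexpectedEnd,
    Utf8,
}

/// Reads an 8-byte little-endian length, then validates that many bytes as UTF-8.
fn decode_string(input: &[u8]) -> Result<String, StrError> {
    if input.len() < 8 {
        return Err(StrError::UnexpectedEnd);
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&input[..8]);
    let len = u64::from_le_bytes(len_bytes);
    let rest = &input[8..];
    // Reject lengths that run past the available input before casting.
    if len > rest.len() as u64 {
        return Err(StrError::UnexpectedEnd);
    }
    let payload = &rest[..len as usize];
    // No terminator or BOM handling: the payload is the string, byte for byte.
    String::from_utf8(payload.to_vec()).map_err(|_| StrError::Utf8)
}
```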
- Treated similarly to Vec<u8> in serialization
- See Collections for more information about length and entry encoding
Array length is never encoded.
Note that &[T] is encoded as a Collection.
let arr: [u8; 5] = [10, 20, 30, 40, 50];
let encoded = bincode::encode_to_vec(arr, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
10, 20, 30, 40, 50, // the bytes
]);
This applies to any type T that implements Encode/Decode
#[derive(bincode::Encode)]
struct Foo {
first: u8,
second: u8
}
let arr: [Foo; 2] = [
Foo {
first: 10,
second: 20,
},
Foo {
first: 30,
second: 40,
},
];
let encoded = bincode::encode_to_vec(&arr, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
10, 20, // First Foo
30, 40, // Second Foo
]);
Tuple fields are serialized in first-to-last declaration order, with no additional metadata.
- No length prefix is added
- Fields are encoded sequentially
- No padding or alignment adjustments are made
- Order of serialization is deterministic and matches the tuple's declaration order
Struct fields are serialized in first-to-last declaration order, with no metadata representing field names.
- No length prefix is added
- Fields are encoded sequentially
- No padding or alignment adjustments are made
- Order of serialization is deterministic and matches the struct's field declaration order
- Both named and unnamed fields are serialized identically
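The absence of padding can be made concrete with a hand-written encoder. This is a sketch only: Sample and encode_sample are hypothetical, and fixed-width little-endian integers are assumed.

```rust
/// Hypothetical struct used only for this illustration.
struct Sample {
    flag: u8,
    count: u32,
    id: u16,
}

/// Writes each field in declaration order with no padding in between.
fn encode_sample(s: &Sample) -> Vec<u8> {
    let mut out = Vec::new();
    out.push(s.flag); // 1 byte
    out.extend_from_slice(&s.count.to_le_bytes()); // 4 bytes
    out.extend_from_slice(&s.id.to_le_bytes()); // 2 bytes
    out
}
```

The encoded form is 7 bytes (1 + 4 + 2), regardless of how the compiler lays the struct out in memory.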
Enum variants are encoded with a discriminant followed by an optional variant payload.
- Discriminants are automatically assigned by the derive macro in declaration order
- First variant starts at 0
- Subsequent variants increment by 1
- Explicit discriminant indices are currently not supported
- Discriminant is always represented as a u32 during serialization. See Discriminant Representation for more details.
- Maintains the original enum variant semantics during encoding
- Tuple variants: Fields serialized in declaration order
- Struct variants: Fields serialized in declaration order
- Unit variants: No additional data encoded
- Always encoded as a u32
- Encoding method depends on the configured IntEncoding
  - VarintEncoding: Variable-length encoding
  - FixintEncoding: Fixed 4-byte representation
- Payload is serialized immediately after the discriminant
- No additional metadata about field names or types
- Payload structure matches the variant's definition
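The discriminant-then-payload layout can be sketched by hand for an enum shaped like the SomeEnum example earlier in this document. The sketch assumes FixintEncoding with little-endian order, so the discriminant is a fixed 4-byte u32; encode_some_enum is a hypothetical helper, not generated derive output.

```rust
/// Same shape as the SomeEnum example: a unit, a tuple and a struct variant.
enum SomeEnum {
    A,
    B(u32),
    C { value: u32 },
}

/// Writes the declaration-order index as a u32, then the payload if any.
fn encode_some_enum(e: &SomeEnum) -> Vec<u8> {
    let (discriminant, payload): (u32, Option<u32>) = match e {
        SomeEnum::A => (0, None),
        SomeEnum::B(v) => (1, Some(*v)),
        SomeEnum::C { value } => (2, Some(*value)),
    };
    let mut out = Vec::new();
    out.extend_from_slice(&discriminant.to_le_bytes());
    if let Some(v) = payload {
        // Field names are not written; B(7) and C { value: 7 } differ only
        // in the discriminant.
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```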