Serialization Specification

NOTE: This specification is primarily defined in the context of Rust, but aims to be implementable across different programming languages.

Definitions

Variant: A specific constructor or case of an enum type.
Variant Payload: The associated data of a specific enum variant.
Discriminant: A unique identifier for an enum variant, typically represented as an integer.
Basic Types: Primitive types that have a direct, well-defined binary representation.

Endianness

By default, this serialization format uses little-endian byte order for basic numeric types. This means multi-byte values are encoded with their least significant byte first.

Endianness can be configured with the following methods, allowing for big-endian serialization when required:

with_big_endian
with_little_endian

Byte Order Considerations

Multi-byte values (integers, floats) are affected by endianness
Single-byte values (u8, i8) are not affected
Struct and collection serialization order is not changed by endianness
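
As a rough illustration of the byte-order rules above, the sketch below uses only the Rust standard library. The helper name `encode_u32` is hypothetical and is not part of bincode's API; it only shows how the same value is laid out under each setting.

```rust
/// Hypothetical helper (not a bincode function): lay out a u32 in the
/// requested byte order.
fn encode_u32(value: u32, big_endian: bool) -> [u8; 4] {
    if big_endian {
        value.to_be_bytes() // most significant byte first
    } else {
        value.to_le_bytes() // least significant byte first (the default)
    }
}
```

With this helper, `0x12345678` becomes `[0x78, 0x56, 0x34, 0x12]` in little-endian and `[0x12, 0x34, 0x56, 0x78]` in big-endian order.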

Basic Types

Boolean Encoding

Encoded as a single byte
false is represented by 0
true is represented by 1
During deserialization, any value other than 0 or 1 results in a DecodeError::InvalidBooleanValue error
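
The boolean rules can be sketched as a small decoder. The `BoolError` type below is a hypothetical stand-in for DecodeError::InvalidBooleanValue, and `decode_bool` is an illustrative name rather than a bincode function.

```rust
/// Hypothetical stand-in for DecodeError::InvalidBooleanValue.
#[derive(Debug, PartialEq)]
enum BoolError {
    InvalidValue(u8),
}

/// Decode one byte as a bool, rejecting everything except 0 and 1.
fn decode_bool(byte: u8) -> Result<bool, BoolError> {
    match byte {
        0 => Ok(false),
        1 => Ok(true),
        other => Err(BoolError::InvalidValue(other)),
    }
}
```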

Numeric Types

Encoded based on the configured IntEncoding
Signed integers use 2's complement representation
Floating point types use IEEE 754-2008 standard
- f32: 4 bytes (binary32)
- f64: 8 bytes (binary64)

Floating Point Special Values

Subnormal numbers are preserved
- Also known as denormalized numbers
- Maintain their exact bit representation
NaN values are preserved
- Both quiet and signaling NaN are kept as-is
- Bit pattern of NaN is maintained exactly
No normalization or transformation of special values occurs
Serialization and deserialization do not alter the bit-level representation
Consistent with IEEE 754-2008 standard for floating-point arithmetic
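
A minimal sketch of what bit-level preservation means, using only standard-library conversions and assuming little-endian output. The function name `roundtrip_f32_bits` is hypothetical. Note that some platforms may alter signaling NaNs when a value passes through floating-point registers, so the usage note sticks to a quiet NaN and a subnormal.

```rust
/// Hypothetical helper: write an f32 as little-endian bytes, read it back,
/// and return the resulting bit pattern for comparison with the input.
fn roundtrip_f32_bits(bits: u32) -> u32 {
    let value = f32::from_bits(bits);
    let bytes = value.to_le_bytes(); // "serialize"
    f32::from_le_bytes(bytes).to_bits() // "deserialize" and inspect the bits
}
```

For example, the quiet NaN with payload `0x7FC0_1234` and the smallest subnormal `0x0000_0001` are expected to come back with identical bits.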

Character Encoding

char is encoded as a 32-bit unsigned integer representing its Unicode Scalar Value
Valid Unicode Scalar Value range:
- 0x0000 to 0xD7FF (Basic Multilingual Plane, below the surrogate range)
- 0xE000 to 0x10FFFF (remainder of the Basic Multilingual Plane and the Supplementary Planes)
Surrogate code points (0xD800 to 0xDFFF) are not valid
Invalid Unicode characters can be constructed via unsafe code. They are handled as follows:
- during serialization: data is written as-is
- during deserialization: a DecodeError::InvalidCharEncoding error is raised
No additional metadata or encoding scheme beyond the raw code point value
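
Taking the description above at face value, a char round trip could look like the sketch below. It assumes a fixed 4-byte little-endian layout; the helper names are hypothetical, and a real implementation may apply the configured integer encoding or differ in other details.

```rust
/// Hypothetical helper: write a char as its Unicode Scalar Value (u32, little-endian).
fn encode_char(c: char) -> [u8; 4] {
    (c as u32).to_le_bytes()
}

/// Hypothetical helper: read the u32 back and reject surrogates and values
/// above 0x10FFFF. `None` plays the role of DecodeError::InvalidCharEncoding.
fn decode_char(bytes: [u8; 4]) -> Option<char> {
    char::from_u32(u32::from_le_bytes(bytes))
}
```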

Tuples add no extra bytes: their fields are encoded in the order in which they appear, e.g.

let tuple = (u32::MIN, i32::MAX); // 8 bytes
let encoded = bincode::encode_to_vec(tuple, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    0,   0,   0,   0,  // 4 bytes for first type:  u32
    255, 255, 255, 127 // 4 bytes for second type: i32
]);

IntEncoding

Bincode currently supports 2 different types of IntEncoding. With the default config, VarintEncoding is selected.

VarintEncoding

Encoding an unsigned integer u (of any type except u8/i8) works as follows:

If u < 251, encode it as a single byte with that value.
If 251 <= u < 2**16, encode it as a literal byte 251, followed by a u16 with value u.
If 2**16 <= u < 2**32, encode it as a literal byte 252, followed by a u32 with value u.
If 2**32 <= u < 2**64, encode it as a literal byte 253, followed by a u64 with value u.
If 2**64 <= u < 2**128, encode it as a literal byte 254, followed by a u128 with value u.

usize is encoded/decoded as a u64, and isize is encoded/decoded as an i64.

See the documentation of VarintEncoding for more information.
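
The size thresholds listed above can be sketched as a standalone function. This is an illustrative reimplementation, not bincode's own code: the name `encode_varint` is hypothetical, the input is widened to u128 for simplicity, little-endian byte order is assumed, and signed integers are not covered.

```rust
/// Hypothetical sketch of the unsigned varint rules, assuming little-endian output.
fn encode_varint(u: u128) -> Vec<u8> {
    let mut out = Vec::new();
    if u < 251 {
        out.push(u as u8); // small values are stored directly
    } else if u < (1u128 << 16) {
        out.push(251); // marker: a u16 follows
        out.extend_from_slice(&(u as u16).to_le_bytes());
    } else if u < (1u128 << 32) {
        out.push(252); // marker: a u32 follows
        out.extend_from_slice(&(u as u32).to_le_bytes());
    } else if u < (1u128 << 64) {
        out.push(253); // marker: a u64 follows
        out.extend_from_slice(&(u as u64).to_le_bytes());
    } else {
        out.push(254); // marker: a u128 follows
        out.extend_from_slice(&u.to_le_bytes());
    }
    out
}
```

Under these assumptions, 250 encodes to the single byte `[250]`, while 251 needs three bytes: `[251, 251, 0]`.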

FixintEncoding

Fixed size integers are encoded directly
Enum discriminants are encoded as u32
Lengths and usize are encoded as u64

See the documentation of FixintEncoding for more information.

Enums

Enums are encoded with their variant index first, followed by the variant's fields, if it has any. The variant index is encoded according to the configured IntEncoding.

Both named and unnamed fields are serialized with their values only, and therefore encode to the same value.

#[derive(bincode::Encode)]
pub enum SomeEnum {
    A,
    B(u32),
    C { value: u32 },
}

// SomeEnum::A
let encoded = bincode::encode_to_vec(SomeEnum::A, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    0, 0, 0, 0, // first variant, A
    // no extra bytes because A has no fields
]);

// SomeEnum::B(0)
let encoded = bincode::encode_to_vec(SomeEnum::B(0), bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    1, 0, 0, 0, // second variant, B
    0, 0, 0, 0  // B has 1 unnamed field, which is a u32, so 4 bytes
]);

// SomeEnum::C { value: 0u32 }
let encoded = bincode::encode_to_vec(SomeEnum::C { value: 0u32 }, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    2, 0, 0, 0, // third variant, C
    0, 0, 0, 0  // C has 1 named field which is a u32, so 4 bytes
]);

Options

Option<T> is always serialized using a single byte for the discriminant, even in Fixint encoding (which normally uses a u32 for discriminant).

let data: Option<u32> = Some(123);
let encoded = bincode::encode_to_vec(data, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    1, 123, 0, 0, 0  // the Some(..) tag is the leading 1
]);

let data: Option<u32> = None;
let encoded = bincode::encode_to_vec(data, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    0 // the None tag is simply 0
]);

Collections

General Collection Serialization

Collections are encoded with their length value first, followed by each entry of the collection. The length value is based on the configured IntEncoding.

Serialization Considerations

Length is always serialized first
Entries are serialized in the order they are returned from the iterator implementation.
- Iteration order depends on the collection type
  - Ordered collections (e.g., Vec): Iteration from lowest to highest index
  - Unordered collections (e.g., HashMap): Implementation-defined iteration order
Duplicate keys are not checked by bincode, but may result in an error when decoding a container from a list of pairs.

Handling of Specific Collection Types

Linear Collections (`Vec`, Arrays, etc.)

Serialized by iterating from lowest to highest index
Length prefixed
Each item serialized sequentially

let list = vec![0u8, 1u8, 2u8];
let encoded = bincode::encode_to_vec(list, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    3, 0, 0, 0, 0, 0, 0, 0, // length of 3u64
    0, // entry 0
    1, // entry 1
    2, // entry 2
]);

Key-Value Collections (`HashMap`, etc.)

Serialized as a sequence of key-value pairs
Iteration order is implementation-defined
Each entry is a tuple of (key, value)
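
A minimal sketch of the key-value layout, assuming fixed-width little-endian lengths (as in the legacy configuration used elsewhere in this document). The function `encode_map` is hypothetical, and a `BTreeMap` is used only because its iteration order is sorted by key, which keeps the output predictable; a `HashMap` would produce the same entries in an implementation-defined order.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch: length prefix as u64, then each (key, value) pair
/// in the order the map's iterator yields them.
fn encode_map(map: &BTreeMap<u8, u8>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(map.len() as u64).to_le_bytes());
    for (key, value) in map {
        out.push(*key);
        out.push(*value);
    }
    out
}
```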

Special Collection Considerations

Bincode will serialize the entries based on the iterator order.
Deserialization is deterministic but the collection implementation might not guarantee the same order as serialization.

Note: Fixed-length arrays do not have their length encoded. See Arrays for details.

String and &str

Encoding Principles

Strings are encoded as UTF-8 byte sequences
No null terminator is added
No Byte Order Mark (BOM) is written
Unicode non-characters are preserved

Encoding Details

Length is encoded first using the configured IntEncoding
Raw UTF-8 bytes follow the length
Supports the full range of valid UTF-8 sequences
U+0000 and other code points can appear freely within the string

Unicode Handling

During serialization, the string is encoded as a sequence of the given bytes.
- Rust strings are UTF-8 encoded by default, but this is not enforced by bincode
No normalization or transformation of text
If an invalid UTF-8 sequence is encountered during decoding, a DecodeError::Utf8 error is raised

let str = "Hello 🌍"; // Mixed ASCII and Unicode

let encoded = bincode::encode_to_vec(str, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    10, 0, 0, 0, 0, 0, 0, 0, // length of the string, 10 bytes
    b'H', b'e', b'l', b'l', b'o', b' ', 0xF0, 0x9F, 0x8C, 0x8D // UTF-8 encoded string
]);

Comparison with Other Types

Treated similarly to Vec<u8> in serialization
See Collections for more information about length and entry encoding
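
The decoding side of the string layout can be sketched as follows, assuming a fixed-width little-endian u64 length prefix. The function `decode_str` and its plain `String` error type are hypothetical; the UTF-8 validation failure stands in for DecodeError::Utf8.

```rust
/// Hypothetical sketch: read a u64 length, then validate that many bytes as UTF-8.
fn decode_str(input: &[u8]) -> Result<String, String> {
    if input.len() < 8 {
        return Err("missing length prefix".to_string());
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&input[..8]);
    let len = u64::from_le_bytes(len_bytes) as usize;

    // The body must contain at least `len` bytes after the prefix.
    let body = input[8..].get(..len).ok_or("truncated string body")?;

    // Invalid UTF-8 is rejected here.
    String::from_utf8(body.to_vec()).map_err(|e| e.to_string())
}
```

Feeding it the 18 bytes from the "Hello 🌍" example above should yield the original string, while a body such as the lone byte `0xFF` is rejected.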

Arrays

Array length is never encoded.

Note that &[T] is encoded as a Collection.

let arr: [u8; 5] = [10, 20, 30, 40, 50];
let encoded = bincode::encode_to_vec(arr, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    10, 20, 30, 40, 50, // the bytes
]);

This applies to any type T that implements Encode/Decode

#[derive(bincode::Encode)]
struct Foo {
    first: u8,
    second: u8
}

let arr: [Foo; 2] = [
    Foo {
        first: 10,
        second: 20,
    },
    Foo {
        first: 30,
        second: 40,
    },
];

let encoded = bincode::encode_to_vec(&arr, bincode::config::legacy()).unwrap();
assert_eq!(encoded.as_slice(), &[
    10, 20, // First Foo
    30, 40, // Second Foo
]);

TupleEncoding

Tuple fields are serialized in first-to-last declaration order, with no additional metadata.

No length prefix is added
Fields are encoded sequentially
No padding or alignment adjustments are made
Order of serialization is deterministic and matches the tuple's declaration order

StructEncoding

Struct fields are serialized in first-to-last declaration order, with no metadata representing field names.

No length prefix is added
Fields are encoded sequentially
No padding or alignment adjustments are made
Order of serialization is deterministic and matches the struct's field declaration order
Both named and unnamed fields are serialized identically
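
To illustrate the absence of padding, the sketch below hand-encodes a hypothetical struct `Point` with fixed-width little-endian fields. The in-memory layout of such a struct may include alignment padding, but the encoded form is just the fields back to back.

```rust
/// Hypothetical struct used only for illustration.
struct Point {
    x: u16,
    y: u32,
}

/// Hypothetical sketch: fields are written in declaration order, with no
/// length prefix, field names, or padding between them.
fn encode_point(p: &Point) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&p.x.to_le_bytes()); // 2 bytes
    out.extend_from_slice(&p.y.to_le_bytes()); // 4 bytes
    out
}
```

Under these assumptions, `Point { x: 1, y: 2 }` encodes to the six bytes `[1, 0, 2, 0, 0, 0]`.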

EnumEncoding

Enum variants are encoded with a discriminant followed by optional variant payload.

Discriminant Allocation

Discriminants are automatically assigned by the derive macro in declaration order
- First variant starts at 0
- Subsequent variants increment by 1
Explicit discriminant indices are currently not supported
Discriminant is always represented as a u32 during serialization. See Discriminant Representation for more details.
Maintains the original enum variant semantics during encoding

Variant Payload Encoding

Tuple variants: Fields serialized in declaration order
Struct variants: Fields serialized in declaration order
Unit variants: No additional data encoded

Discriminant Representation

Always encoded as a u32
Encoding method depends on the configured IntEncoding
- VarintEncoding: Variable-length encoding
- FixintEncoding: Fixed 4-byte representation

Handling of Variant Payloads

Payload is serialized immediately after the discriminant
No additional metadata about field names or types
Payload structure matches the variant's definition
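
The difference between the two discriminant representations can be sketched with a hand-written encoder. The enum `Shape` and the function `encode_shape` are hypothetical; the varint branch relies on the fact that discriminants below 251 fit in a single byte, and little-endian order is assumed for the fixed-width branch.

```rust
/// Hypothetical enum with one variant of each kind.
enum Shape {
    Unit,
    Tuple(u8),
    Named { value: u8 },
}

/// Hypothetical sketch: discriminant first (as a u32), then the payload, if any.
fn encode_shape(shape: &Shape, varint: bool) -> Vec<u8> {
    // Discriminants follow declaration order, starting at 0.
    let (discriminant, payload): (u32, Option<u8>) = match shape {
        Shape::Unit => (0, None),
        Shape::Tuple(v) => (1, Some(*v)),
        Shape::Named { value } => (2, Some(*value)),
    };
    let mut out = Vec::new();
    if varint {
        out.push(discriminant as u8); // values below 251 take one byte
    } else {
        out.extend_from_slice(&discriminant.to_le_bytes()); // fixed 4 bytes
    }
    out.extend(payload); // tuple and struct variants encode identically
    out
}
```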
