Version 9.0
A Grain Sequence Format (GSF) file contains a sequence of grains from one or more flows. It has the mimetype application/x-ips-gsf, and filenames typically use the suffix .gsf.
The GSF file uses version 2.1 of the SSB format that defines the base file structure and data types.
Each file begins with a 12 octet SSB header:
Name | Data | Type | Size |
---|---|---|---|
signature | "SSBB" | Tag | 4 octets |
file_type | "grsg" | Tag | 4 octets |
major_version | 0x0009 | Unsigned | 2 octets |
minor_version | 0x0000 | Unsigned | 2 octets |
The current GSF version is 9.0. See the SSB Versioning section for a description of how versioning works from a reader's perspective.
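As a minimal sketch, the 12 octet header above could be validated as follows. The little-endian byte order is an assumption (byte order is defined by the SSB format, not by this section), and `parse_ssb_header` is a hypothetical helper name, not part of any library.

```python
import struct

# Assumption: multi-octet integers are little-endian. The SSB format
# defines the byte order; this section does not state it.
SSB_HEADER = struct.Struct("<4s4sHH")  # signature, file_type, major, minor


def parse_ssb_header(data):
    """Return (major_version, minor_version) after checking both tags."""
    if len(data) < SSB_HEADER.size:
        raise ValueError("need at least 12 octets for the SSB header")
    signature, file_type, major, minor = SSB_HEADER.unpack_from(data)
    if signature != b"SSBB":
        raise ValueError(f"bad signature: {signature!r}")
    if file_type != b"grsg":
        raise ValueError(f"not a GSF file: {file_type!r}")
    return major, minor


header = b"SSBB" + b"grsg" + struct.pack("<HH", 9, 0)
print(parse_ssb_header(header))  # (9, 0)
```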
Every GSF file starts with a single head block, which itself contains other types of blocks, followed by a (possibly empty) sequence of grai blocks and finally a grai terminator block.
The grai terminator block has the block size set to 0 (and no content) which signals to readers that the GSF stream has ended. It is typically used by readers when receiving a GSF stream where the sender does not know the duration beforehand and has set count in segm to -1.
As such the overall structure of the file is (count shown in brackets):
- File header
- head (1): file identity and creation time
- grai (0..*): grain info and data
- grai (0..1): terminator block
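The structure above can be walked with a simple loop. The sketch below makes two assumptions that this section implies but does not state outright: integers are little-endian, and a block's size counts its own 8 octet header (consistent with an empty grdt block having size 8 and the terminator having size 0). `iter_top_level_blocks` is a hypothetical name.

```python
import io
import struct

BLOCK_HEADER = struct.Struct("<4sI")  # tag, size (assumed little-endian)


def iter_top_level_blocks(stream):
    """Yield (tag, payload) for each top-level block after the file header.

    Stops at the grai terminator (size 0) or when the stream ends.
    """
    stream.read(12)  # skip the SSB file header without validating it
    while True:
        raw = stream.read(BLOCK_HEADER.size)
        if len(raw) < BLOCK_HEADER.size:
            return  # stream ended without a terminator block
        tag, size = BLOCK_HEADER.unpack(raw)
        if tag == b"grai" and size == 0:
            return  # terminator block
        if size < BLOCK_HEADER.size:
            raise ValueError(f"block {tag!r} has invalid size {size}")
        yield tag, stream.read(size - BLOCK_HEADER.size)


def make_block(tag, payload=b""):
    """Build a block whose size includes the 8 octet header."""
    return BLOCK_HEADER.pack(tag, BLOCK_HEADER.size + len(payload)) + payload


demo = (
    b"SSBB" + b"grsg" + struct.pack("<HH", 9, 0)
    + make_block(b"head", bytes(23))      # id (16) + created (7), zeroed
    + make_block(b"grai", b"\x01\x00")    # local_id only, no child blocks
    + BLOCK_HEADER.pack(b"grai", 0)       # terminator
)
print([tag for tag, _ in iter_top_level_blocks(io.BytesIO(demo))])
# [b'head', b'grai']
```

Reading rather than seeking keeps the loop usable on non-seekable streams, which matters for the streaming case where the terminator is the only end marker.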
A reader may support concatenated GSF files by handling the occurrence of the SSB header when a grai block is expected or after a terminator grai block.
A basic reader implementation could detect the SSBB file signature and skip the file header and the following head block. A reader could also read the head block, replace existing metadata and do some checks to ensure the data is consistent given knowledge of what is acceptable, e.g. whether grains are from the same flow. The mediagrains implementation reads the head block, replacing what is stored in the decoder object.
The unique "head" block consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "head" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by some special header fields:
Name | Data | Type | Size |
---|---|---|---|
id | | UUID | 16 octets |
created | | DateTime | 7 octets |
Where id is a UUID identifying the file itself, and created is a timestamp identifying when the file was laid down.
The "head" block then contains any number of segm and tag blocks (with any other blocks in-between).
Each segm block describes a segment within the file. Each segment contains a number of grains, but the actual grain data is not included in this block, which is more of an index of segments.
It begins with a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "segm" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by some special header fields:
Name | Data | Type | Size |
---|---|---|---|
local_id | | Unsigned | 2 octets |
id | | UUID | 16 octets |
count | | Signed | 8 octets |
where local_id is a numerical identifier for the segment, which is unique within the file, id is a UUID for the segment, and count is the number of grains considered part of this segment, or -1 to indicate that the number of grains is unknown. The id could be used to transfer and persist a globally unique identifier for the GSF segment instance, but it is generally not used for that purpose as the GSF segment is a transient representation for the grains.
A segment, which is defined locally by the local_id, always contains grains from a single flow.
The segm block may contain a flow block and any number of tag blocks.
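A sketch of decoding the three fixed segm fields is shown below. It assumes little-endian integers and that the 16 UUID octets are stored in the usual big-endian (RFC 4122) byte order; neither is stated in this section. `parse_segm_fields` is a hypothetical helper that maps the -1 sentinel to `None`.

```python
import struct
import uuid

# Assumptions: little-endian integers, UUID stored as 16 raw octets in
# RFC 4122 order. Both are defined by the SSB format, not this section.
SEGM_FIELDS = struct.Struct("<H16sq")  # local_id, id, count (signed)


def parse_segm_fields(payload):
    """Decode the fixed fields at the start of a segm block payload."""
    local_id, raw_id, count = SEGM_FIELDS.unpack_from(payload)
    return {
        "local_id": local_id,
        "id": uuid.UUID(bytes=raw_id),
        # -1 means the writer did not know the number of grains
        "count": None if count == -1 else count,
    }


payload = SEGM_FIELDS.pack(1, uuid.UUID(int=0).bytes, -1)
print(parse_segm_fields(payload)["count"])  # None
```

Any flow or tag child blocks would start at offset 26 of the payload, directly after these fields.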
Each tag block contains a 'tag' used to provide user extensible metadata for the immediate parent block - the segm and grai block. Each such tag is a pair of strings, referred to as the key and val.
It begins with a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "tag " | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by two variable length string fields, one for key and one for val:
Name | Data | Type | Size |
---|---|---|---|
key | | VarString | variable |
val | | VarString | variable |
where the maximum string length for either key or val is 65535 octets. Note that the VarString size includes 2 octets to encode the string length.
A tag block will not have any child blocks.
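The tag block layout can be illustrated with a small round-trip sketch. It assumes UTF-8 string encoding, little-endian integers, and a block size that includes the 8 octet block header; the function names are hypothetical.

```python
import struct


def encode_varstring(text):
    """Encode a string as a 2 octet length followed by the string octets."""
    raw = text.encode("utf-8")  # assumption: strings are UTF-8
    if len(raw) > 65535:
        raise ValueError("VarString is limited to 65535 octets")
    return struct.pack("<H", len(raw)) + raw


def decode_varstring(buf, offset=0):
    """Return (string, offset of the first octet after the VarString)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("VarString runs past the end of the buffer")
    return buf[start:end].decode("utf-8"), end


def encode_tag_block(key, val):
    body = encode_varstring(key) + encode_varstring(val)
    # assumption: size counts the 8 octet block header as well as the body
    return struct.pack("<4sI", b"tag ", 8 + len(body)) + body


def decode_tag_block(block):
    tag, size = struct.unpack_from("<4sI", block)
    if tag != b"tag " or size != len(block):
        raise ValueError("not a complete tag block")
    key, pos = decode_varstring(block, 8)
    val, _ = decode_varstring(block, pos)
    return key, val


print(decode_tag_block(encode_tag_block("title", "demo")))  # ('title', 'demo')
```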
A flow block contains the Flow metadata for the grains in the segment. It begins with a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "flow" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the Flow metadata:
Name | Data | Type | Size |
---|---|---|---|
src_id | | UUID | 16 octets |
flow_id | | UUID | 16 octets |
format | | FixString | 64 octets |
data | | VarByteArray | variable |
The src_id is the source identifier, flow_id is the flow identifier, format is a Flow format URN and data contains the Flow metadata as a UTF-8 encoded JSON string. The src_id, flow_id and format are extractions of the Flow properties contained in data.
The known format values are defined in the FlowFormat enum type and are as follows:
- "urn:x-nmos:format:video" for video
- "urn:x-nmos:format:audio" for audio
- "urn:x-nmos:format:data" for data
Each grai block contains the actual data for a grain. Every grain in every segment in the file is represented by such a block.
It begins with a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "grai" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by a single field containing the local_id of the segment to which the grain belongs (and a segment contains grains of a single flow, all using the same local_id):
Name | Data | Type | Size |
---|---|---|---|
local_id | | Unsigned | 2 octets |
It is then followed by a gbhd block and then a grdt block (with any other blocks in-between). Note that an empty grain type still requires an (empty) grdt block.
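A reader that tolerates unknown child blocks might locate the gbhd and grdt blocks as sketched below. The sketch assumes little-endian integers and that each block size includes its 8 octet header; `parse_grai_payload` is a hypothetical name.

```python
import struct

BLOCK_HEADER = struct.Struct("<4sI")  # tag, size (assumed little-endian)


def parse_grai_payload(payload):
    """Return (local_id, gbhd payload, grdt payload) from a grai payload.

    Child blocks with other tags are skipped rather than rejected.
    """
    (local_id,) = struct.unpack_from("<H", payload, 0)
    children = {}
    pos = 2
    while pos + BLOCK_HEADER.size <= len(payload):
        tag, size = BLOCK_HEADER.unpack_from(payload, pos)
        if size < BLOCK_HEADER.size or pos + size > len(payload):
            raise ValueError(f"child block {tag!r} has invalid size {size}")
        # keep the first occurrence of each tag
        children.setdefault(tag, payload[pos + BLOCK_HEADER.size:pos + size])
        pos += size
    if b"gbhd" not in children or b"grdt" not in children:
        raise ValueError("grai block needs both gbhd and grdt children")
    return local_id, children[b"gbhd"], children[b"grdt"]


def make_block(tag, body=b""):
    return BLOCK_HEADER.pack(tag, BLOCK_HEADER.size + len(body)) + body


payload = (
    struct.pack("<H", 1)
    + make_block(b"gbhd", bytes(70))  # zeroed common grain header fields
    + make_block(b"xtra", b"?")       # made-up tag standing in for an unknown block
    + make_block(b"grdt")             # empty grain: header only
)
local_id, gbhd, grdt = parse_grai_payload(payload)
print(local_id, len(gbhd), len(grdt))  # 1 70 0
```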
Each gbhd block contains the metadata for a grain header. It begins with a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "gbhd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the fields of the common grain header:
Name | Data | Type | Size |
---|---|---|---|
src_id | | UUID | 16 octets |
flow_id | | UUID | 16 octets |
primary_ts | | Timestamp | 11 octets |
secondary_ts | | Timestamp | 11 octets |
rate | | Rational | 8 octets |
duration | | Rational | 8 octets |
The src_id is the source identifier for the grains, flow_id is the flow identifier, primary_ts is the primary timestamp (it contained an "origination" timestamp in version <= 7.0.0), secondary_ts is the secondary timestamp (contained a "synchronization" timestamp in version <= 7.0.0), rate is the grain rate and duration is the grain duration.
The source and use of the primary_ts and secondary_ts should be defined as part of the flow.
The gbhd block then contains (in any order and with any other blocks in-between) an optional tils block, and a mandatory block for the non-empty grain types:
- Video Grain: a vghd block.
- Coded Video Grain: a cghd block.
- Audio Grain: an aghd block.
- Coded Audio Grain: a cahd block.
- Event Grain: an eghd block.
- Empty Grain: no block.
Each tils block contains tagged time labels for the grain it exists in. If the grain has none then this block should be omitted. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "tils" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the number of time labels:
Name | Data | Type | Size |
---|---|---|---|
num_labels | | Unsigned | 2 octets |
Then, for each label the following data:
Name | Data | Type | Size |
---|---|---|---|
label | | Timelabel | 29 octets |
Each vghd block contains the header data for a video grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "vghd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the grain data:
Name | Data | Type | Size |
---|---|---|---|
format | | Unsigned | 4 octets |
layout | | Unsigned | 4 octets |
width | | Unsigned | 4 octets |
height | | Unsigned | 4 octets |
extension | | Unsigned | 4 octets |
aspect_ratio | | Rational | 8 octets |
pixel_aspect_ratio | | Rational | 8 octets |
followed by an optional comp block (with any other blocks in-between).
The format and layout parameters are enumerated values as defined in cogenums.py. The values originated from the COG library. The current set of known formats (from the CogFrameFormat enum class) are:
Name | Enumeration |
---|---|
ALPHA_U8_1BIT | 0x1080 |
U8_444 | 0x2000 |
U8_422 | 0x2001 |
U8_420 | 0x2003 |
U8_444_RGB | 0x2010 |
ALPHA_U8 | 0x2080 |
YUYV | 0x2100 |
UYVY | 0x2101 |
AYUV | 0x2102 |
RGB | 0x2104 |
RGBx | 0x2110 |
xRGB | 0x2111 |
BGRx | 0x2112 |
xBGR | 0x2113 |
RGBA | 0x2114 |
ARGB | 0x2115 |
BGRA | 0x2116 |
ABGR | 0x2117 |
S16_444_10BIT | 0x2804 |
S16_444_10BIT_RGB | 0x2814 |
S16_422_10BIT | 0x2805 |
S16_420_10BIT | 0x2807 |
ALPHA_S16_10BIT | 0x2884 |
v210 | 0x2906 |
S16_444_12BIT | 0x3004 |
S16_444_12BIT_RGB | 0x3014 |
S16_422_12BIT | 0x3005 |
S16_420_12BIT | 0x3007 |
ALPHA_S16_12BIT | 0x3084 |
S16_444 | 0x4004 |
S16_444_RGB | 0x4014 |
S16_422 | 0x4005 |
S16_420 | 0x4007 |
ALPHA_S16 | 0x4084 |
v216 | 0x4105 |
S32_444 | 0x8008 |
S32_444_RGB | 0x8018 |
S32_422 | 0x8009 |
S32_420 | 0x800b |
ALPHA_S32 | 0x8088 |
UNKNOWN | 0xfffffffe |
INVALID | 0xffffffff |
The current set of known layouts (from the CogFrameLayout enum class) are:
Name | Enumeration |
---|---|
FULL_FRAME | 0x00 |
SEPARATE_FIELDS | 0x01 |
SINGLE_FIELD | 0x02 |
MIXED_FIELDS | 0x03 |
SEGMENTED_FRAME | 0x04 |
UNKNOWN | 0xfffffffe |
The width and height are the video dimensions, extension is the number of pixels to edge extend the frame by, aspect_ratio is the display aspect ratio (e.g. 4:3 or 16:9) and pixel_aspect_ratio is the pixel aspect ratio (e.g. 1:1, 12:11).
Each comp block contains the component sizes for a video grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "comp" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the number of components (usually either 1 or 3):
Name | Data | Type | Size |
---|---|---|---|
num_comps | | Unsigned | 2 octets |
Then, for each component the following data:
Name | Data | Type | Size |
---|---|---|---|
width | | Unsigned | 4 octets |
height | | Unsigned | 4 octets |
stride | | Unsigned | 4 octets |
length | | Unsigned | 4 octets |
where width is the number of samples per line, height is the number of lines, stride is the number of octets between the start of each line and the start of the next, and length is the number of octets from the start of the data for this component to the start of the data for the next component or the end of the grain data.
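Because each length runs to the start of the next component, the offset of every component within the grain data is the running sum of the preceding lengths. The sketch below decodes a comp payload on that basis, assuming little-endian integers. The 8-bit 4:2:2 figures in the example (no line padding, so stride equals width) are illustrative values, not taken from a real file.

```python
import struct

COMP_ENTRY = struct.Struct("<IIII")  # width, height, stride, length


def parse_comp_payload(payload):
    """Decode a comp block payload and add each component's data offset."""
    (num_comps,) = struct.unpack_from("<H", payload, 0)
    comps = []
    data_offset = 0
    for i in range(num_comps):
        width, height, stride, length = COMP_ENTRY.unpack_from(
            payload, 2 + i * COMP_ENTRY.size
        )
        comps.append({
            "width": width,
            "height": height,
            "stride": stride,
            "length": length,
            "offset": data_offset,  # start of this component in the grain data
        })
        data_offset += length
    return comps


# Illustrative planar 8-bit 4:2:2 frame at 1920x1080, no line padding.
payload = struct.pack("<H", 3)
payload += COMP_ENTRY.pack(1920, 1080, 1920, 1920 * 1080)  # Y
payload += COMP_ENTRY.pack(960, 1080, 960, 960 * 1080)     # U
payload += COMP_ENTRY.pack(960, 1080, 960, 960 * 1080)     # V
print([c["offset"] for c in parse_comp_payload(payload)])
# [0, 2073600, 3110400]
```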
Each cghd block contains the header data for a coded video grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "cghd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the grain data:
Name | Data | Type | Size |
---|---|---|---|
format | | Unsigned | 4 octets |
layout | | Unsigned | 4 octets |
origin_width | | Unsigned | 4 octets |
origin_height | | Unsigned | 4 octets |
coded_width | | Unsigned | 4 octets |
coded_height | | Unsigned | 4 octets |
key_frame | | Boolean | 1 octet |
temporal_offset | | Signed | 4 octets |
The format and layout parameters are enumerated values as defined in cogenums.py. The values originated from the COG library. The current set of known formats (from the CogFrameFormat enum class) are:
Name | Enumeration |
---|---|
MJPEG | 0x0200 |
DNxHD | 0x0201 |
MPEG2 | 0x0202 |
AVCI | 0x0203 |
H264 | 0x0204 |
DV | 0x0205 |
D10 | 0x0206 |
VC2 | 0x0207 |
VP8 | 0x0208 |
H265 | 0x0209 |
UNKNOWN | 0xfffffffe |
INVALID | 0xffffffff |
The layouts are the same as those described in the vghd block. The origin_width and origin_height are the original frame dimensions that were input to the encoder and are output by the decoder after applying any clipping. The coded_width and coded_height are the frame dimensions used to encode from, e.g. including padding to meet the fixed macroblock size requirement.
The key_frame is set to 1 if the video frame is a key frame, e.g. an I-frame, or 0 if it is not a key frame. A value >= 2 indicates that the key_frame value is unknown.
The temporal_offset is the offset between display and stored order for inter-frame coding schemes (offset = display - stored). A value 2147483647 (0x7fffffff) indicates the temporal_offset value is unknown.
The cghd block is followed by an optional unof block (with any other blocks in-between).
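The two "unknown" sentinels are easy to overlook when decoding, so a reader may want to map them to an explicit missing value. The sketch below does that for the fixed cghd fields, assuming little-endian integers and no padding between fields; `parse_cghd_fields` is a hypothetical helper.

```python
import struct

# 6 unsigned 32-bit fields, a 1 octet boolean, then a signed 32-bit offset.
CGHD_FIELDS = struct.Struct("<6IBi")  # 29 octets, assumed little-endian
UNKNOWN_TEMPORAL_OFFSET = 0x7FFFFFFF


def parse_cghd_fields(payload):
    """Decode the fixed cghd fields, mapping 'unknown' sentinels to None."""
    (fmt, layout, origin_w, origin_h, coded_w, coded_h,
     key_frame, temporal_offset) = CGHD_FIELDS.unpack_from(payload)
    return {
        "format": fmt,
        "layout": layout,
        "origin_size": (origin_w, origin_h),
        "coded_size": (coded_w, coded_h),
        # 0 = not a key frame, 1 = key frame, >= 2 = unknown
        "key_frame": bool(key_frame) if key_frame < 2 else None,
        "temporal_offset": (
            None if temporal_offset == UNKNOWN_TEMPORAL_OFFSET
            else temporal_offset
        ),
    }


# H264 (0x0204), FULL_FRAME (0x00), 1080 lines padded to 1088 for coding.
payload = CGHD_FIELDS.pack(0x0204, 0x00, 1920, 1080, 1920, 1088, 1, 0x7FFFFFFF)
info = parse_cghd_fields(payload)
print(info["key_frame"], info["temporal_offset"])  # True None
```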
Each unof block contains the offsets from the start of the data section for coded units within a coded grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "unof" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the number of unit offsets:
Name | Data | Type | Size |
---|---|---|---|
num_units | | Unsigned | 2 octets |
Then, for each unit the following data:
Name | Data | Type | Size |
---|---|---|---|
unit_offset | | Unsigned | 4 octets |
This information is optional, and not meaningful for all coded formats.
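As an illustration of how the offsets might be used, the sketch below decodes a unof payload and slices the grain data into units. It assumes little-endian integers, ascending offsets, and that each unit extends to the next offset (the last one to the end of the data); the section itself only defines the offsets, so the slicing rule is an assumption. Both function names are hypothetical.

```python
import struct


def parse_unof_payload(payload):
    """Return the list of unit offsets stored in a unof block payload."""
    (num_units,) = struct.unpack_from("<H", payload, 0)
    return list(struct.unpack_from(f"<{num_units}I", payload, 2))


def split_coded_units(data, unit_offsets):
    """Slice grain data into units.

    Assumption: offsets are ascending and each unit ends where the next
    one starts; the final unit runs to the end of the data.
    """
    bounds = list(unit_offsets) + [len(data)]
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]


offsets = parse_unof_payload(struct.pack("<H3I", 3, 0, 3, 5))
print(split_coded_units(b"AAABBCCCC", offsets))
# [b'AAA', b'BB', b'CCCC']
```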
Each aghd block contains the header data for an audio grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "aghd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the grain data:
Name | Data | Type | Size |
---|---|---|---|
format | | Unsigned | 4 octets |
channels | | Unsigned | 2 octets |
samples | | Unsigned | 4 octets |
sample_rate | | Unsigned | 4 octets |
The format parameter enumerated values are defined in cogenums.py. The values originated from the COG library. The current set of known formats (from the CogAudioFormat enum class) are:
Name | Enumeration |
---|---|
S16_PLANES | 0x00 |
S16_PAIRS | 0x01 |
S16_INTERLEAVED | 0x02 |
S24_PLANES | 0x04 |
S24_PAIRS | 0x05 |
S24_INTERLEAVED | 0x06 |
S32_PLANES | 0x08 |
S32_PAIRS | 0x09 |
S32_INTERLEAVED | 0x0a |
S64_INVALID | 0x0c |
FLOAT_PLANES | 0x18 |
FLOAT_PAIRS | 0x19 |
FLOAT_INTERLEAVED | 0x1a |
DOUBLE_PLANES | 0x2c |
DOUBLE_PAIRS | 0x2d |
DOUBLE_INTERLEAVED | 0x2e |
INVALID | 0xffffffff |
The channels is the number of audio channels, samples is the number of (multi-channel) audio samples and sample_rate is the audio sample rate.
Each cahd block contains the header data for a coded audio grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "cahd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the grain data:
Name | Data | Type | Size |
---|---|---|---|
format | | Unsigned | 4 octets |
channels | | Unsigned | 2 octets |
samples | | Unsigned | 4 octets |
priming | | Unsigned | 4 octets |
remainder | | Unsigned | 4 octets |
sample_rate | | Unsigned | 4 octets |
The format parameter enumerated values are defined in cogenums.py. The values originated from the COG library. The current set of known formats (from the CogAudioFormat enum class) are:
Name | Enumeration |
---|---|
MP1 | 0x200 |
AAC | 0x201 |
OPUS | 0x202 |
INVALID | 0xffffffff |
The channels is the number of audio channels, samples is the number of (multi-channel) audio samples, priming is the number of samples at the start of the grain that were used for encoder priming, remainder is the number of samples at the end of the grain that were required to complete the encoding frame and sample_rate is the audio sample rate. The priming and remainder are additional audio samples used in the encoding and may be discarded after decoding to allow seamless stitching of contiguous audio fragments for example.
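Discarding the priming and remainder samples after decoding could look like the sketch below. It assumes the decoder output for the grain includes both the priming and the remainder samples, which this section suggests but does not state; `trim_decoded_samples` is a hypothetical helper operating on any sequence of multi-channel samples.

```python
def trim_decoded_samples(samples, priming, remainder):
    """Drop priming samples from the start and remainder samples from the end.

    Assumption: `samples` is the full decoder output for one grain,
    including the priming and remainder samples.
    """
    end = len(samples) - remainder
    if priming < 0 or remainder < 0 or priming > end:
        raise ValueError("priming/remainder exceed the decoded sample count")
    return samples[priming:end]


decoded = list(range(10))  # stand-in for 10 decoded multi-channel samples
print(trim_decoded_samples(decoded, priming=2, remainder=3))
# [2, 3, 4, 5, 6]
```

Computing the end index explicitly avoids the `samples[priming:-0]` pitfall, which would return an empty sequence when remainder is 0.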
Each eghd block contains the header data for an event grain. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "eghd" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by a single octet identifying the event payload type:
Name | Data | Type | Size |
---|---|---|---|
type | | Unsigned | 1 octet |
The event payload type identifies the encoding and/or content for the event data. The type 0x00 is currently recognised and defines a JSON encoding following the schema defined in the Content Container library.
Each grdt block contains the raw data of a grain of any type. It consists of a standard block header:
Name | Data | Type | Size |
---|---|---|---|
tag | "grdt" | Tag | 4 octets |
size | | Unsigned | 4 octets |
followed by the data components of the grain, copied byte-for-byte from the grain, in order. An empty grain type has size set to 8, i.e. the block consists of the block header only and there is no data.