Attribute description

Mikhail Yakshin edited this page Oct 21, 2016 · 10 revisions

An attribute description specifies how to read one particular attribute: typically a single number, a string, an array of bytes, etc. An attribute can also reference other complex structures by specifying a user type given in a type description. Each attribute is typically compiled into equivalent parsing instruction(s) in the target language.

Common attributes

id

  • Contents: a string that matches /^[a-z][a-z0-9_]*$/, i.e. it starts with a lowercase letter and may then contain lowercase letters, digits and underscores
  • Purpose: identify the attribute among others
  • Influences: used as the variable / field name in the target programming language
  • Mandatory: yes

contents

  • Contents: one of:
    • a string in UTF-8 encoding
    • an array of:
      • integers in decimal representation
      • integers in hexadecimal representation, starting with 0x
      • strings in UTF-8 encoding
  • Purpose: specify fixed contents that should be encountered by parser at this point
  • Influences: the parser checks that the specified content is present at this point in the stream. If everything matches, parsing continues; if the bytes in the stream do not match those given in contents, a parsing exception is raised, signalling that the input is not in the expected format and that continuing to parse would be meaningless.
  • Mandatory: no

Examples:

  • foo — expect bytes 66 6f 6f
  • [foo, 0, A, 0xa, 42] — expect bytes 66 6f 6f 00 41 0a 2a
  • [1, 0x55, '▒,3', 3] — expect bytes 01 55 e2 96 92 2c 33 03

Note that you can use either JSON or YAML array syntax, and quotes are optional in YAML syntax.
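As a minimal sketch of id and contents used together, the fragment below checks for the 8-byte PNG signature. The enclosing seq key and the attribute name magic are assumptions made for illustration; this page does not describe where attribute descriptions are placed.

```yaml
# Hypothetical fragment: `seq` and the name `magic` are illustrative assumptions.
seq:
  - id: magic
    # Mixed array of integers and a string, following the rules above;
    # expects bytes 89 50 4e 47 0d 0a 1a 0a
    contents: [0x89, PNG, 0x0d, 0x0a, 0x1a, 0x0a]
```

If the stream starts with any other bytes, the parser would raise a parsing exception at this attribute instead of continuing.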

type

repeat

  • Contents: expr, eos or until
  • Purpose: designate a repeated attribute in a structure:
    • if repeat: expr is used, the attribute is repeated the number of times specified in the repeat-expr key
    • if repeat: eos is used, the attribute is repeated until the end of the current stream
    • if repeat: until is used, the attribute is repeated until the expression given in the repeat-until key becomes true (the expression may reference the last parsed element)
  • Influences: the attribute is read as an array / list / sequence, executing the parsing code multiple times
  • Mandatory: no

repeat-expr

  • Contents: expression, expected to be of integer type
  • Purpose: specify number of repetitions for repeated attribute
  • Influences: number of times attribute is parsed
  • Mandatory: yes, if repeat: expr
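The two count-based and stream-based forms of repeat might look as follows. This is a sketch only: the enclosing seq key, the attribute names, and the choice of u1 as element type are assumptions picked for illustration.

```yaml
# Hypothetical fragment: names and the `seq` wrapper are illustrative.
seq:
  - id: num_entries
    type: u1
  # Read exactly `num_entries` one-byte integers into an array
  - id: entries
    type: u1
    repeat: expr
    repeat-expr: num_entries
  # Read one-byte integers until the current stream is exhausted
  - id: trailer
    type: u1
    repeat: eos
```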

repeat-until

  • Contents: expression, expected to be of boolean type
  • Purpose: specify an expression that is checked each time after an element of the requested type is parsed; while the expression is false (i.e. until it becomes true), more elements are parsed and added to the resulting array. The special variable _ can be used in the expression to reference the last read element.
  • Influences: number of times attribute is parsed
  • Mandatory: yes, if repeat: until
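A sketch of repeat: until, assuming a hypothetical list of one-byte values that ends with a zero byte. The seq wrapper, the attribute name, and the == comparison in the expression are assumptions; only the _ variable is taken from the description above.

```yaml
# Hypothetical fragment: keeps reading one-byte integers
# until the element just read equals 0.
seq:
  - id: values
    type: u1
    repeat: until
    repeat-until: _ == 0
```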

if

  • Contents: expression, expected to be of boolean type
  • Purpose: mark the attribute as optional
  • Influences: the attribute is parsed only if the condition specified in the if key evaluates to true at runtime
  • Mandatory: no
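For instance, an optional field guarded by a flag could be sketched like this. The attribute names, the seq wrapper, and the != operator in the expression are illustrative assumptions rather than something this page specifies.

```yaml
# Hypothetical fragment: `comment_len` exists only when the flag is non-zero.
seq:
  - id: has_comment
    type: u1
  - id: comment_len
    type: u1
    if: has_comment != 0
```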

Attributes that depend on type

No type specified

If no type is specified, the attribute is read as a plain sequence of bytes from the stream. One therefore has to decide how many bytes to read. There are two ways:

  • Specify the number of bytes to read in the size key. This can be an integer constant or an expression (for example, if the number of bytes to read depends on some other attribute).
  • Set size-eos: true to read all bytes up to the end of the current stream.
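Both ways of sizing an untyped attribute can be sketched in one fragment. The seq wrapper and all attribute names here are hypothetical.

```yaml
# Hypothetical fragment: two raw byte arrays with no `type` key.
seq:
  - id: len_header
    type: u1
  # Size taken from a previously parsed attribute
  - id: header
    size: len_header
  # Everything that remains in the current stream
  - id: rest
    size-eos: true
```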

size

size-eos

process

It is possible to apply some algorithmic processing to a byte buffer before accessing it. This can be done using the process key.
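A rough sketch of how this might look is given below. Note that this page does not list the accepted values: zlib is assumed here from the wider format documentation, and the seq wrapper and attribute names are likewise illustrative.

```yaml
# Hypothetical fragment: `zlib` as a process value is an assumption
# not documented on this page.
seq:
  - id: len_body
    type: u1
  - id: body
    size: len_body
    # Decompress the raw bytes before they are exposed as `body`
    process: zlib
```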

u*, s*

These specify primitive integer types. One can map an integer to some enum value with an enum attribute.

enum

  • Contents: name of an existing enum
  • Purpose: map the parsed integer to a named constant using the given enum dictionary
  • Influences: the field's data type becomes the given enum
  • Mandatory: no
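A minimal sketch of an integer mapped through an enum follows. The enums block, its layout, and the names used are assumptions, since this page only says that the enum must already exist and does not show how one is declared.

```yaml
# Hypothetical fragment: the `enums` declaration layout is assumed.
seq:
  - id: protocol
    type: u1
    enum: ip_protocol
enums:
  ip_protocol:
    1: icmp
    6: tcp
    17: udp
```

With this description, a parsed byte of 6 would be exposed as the named constant tcp rather than as a bare integer.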

str

Specifies a fixed-length string: first a designated number of bytes is read, then the bytes are converted to characters using the specified encoding. There are two ways to specify the amount of data to read:

  • Specify the number of bytes to read directly in the size key. This can be an integer constant or an expression (for example, if the number of bytes to read depends on some other attribute).
  • Set size-eos: true to read all bytes up to the end of the current stream.
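A length-prefixed string could be sketched as follows. The attribute names, the seq wrapper, and the encoding value UTF-8 are illustrative assumptions.

```yaml
# Hypothetical fragment: a one-byte length followed by that many bytes of text.
seq:
  - id: len_name
    type: u1
  - id: name
    type: str
    size: len_name
    encoding: UTF-8
```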

size

size-eos

encoding

strz

Specifies a string that is read until a terminator byte is encountered (e.g. C-style strings terminated with 0).

terminator

  • Contents: an integer that represents the terminating byte
  • Purpose: string reading stops when this byte is encountered
  • Influences: the point at which string reading stops, and thus the length of the resulting string
  • Mandatory: no, default is 0
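The default and an explicit terminator might be written like this. The names and the seq wrapper are hypothetical, and the encoding key is borrowed from the str section on the assumption that it applies to strz as well.

```yaml
# Hypothetical fragment: two terminated strings.
seq:
  # No `terminator` key: reading stops at the default terminator byte 0
  - id: name
    type: strz
    encoding: ASCII
  # Explicit terminator: reading stops at a line feed (byte 0x0a)
  - id: first_line
    type: strz
    encoding: ASCII
    terminator: 0x0a
```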

consume

  • Contents: boolean
  • Purpose: specify if terminator byte should be "consumed" when reading - that is:
    • if consume is true, stream pointer will point to the byte after the terminator byte
    • if consume is false, stream pointer will point to the terminator byte itself
  • Influences: stream position after reading of string
  • Mandatory: no, default is true

include

  • Contents: boolean
  • Purpose: specify whether the terminator byte should be considered part of the string read and thus appended to it
  • Influences: the parsed string: if true, the resulting string is 1 byte longer and its last byte is the terminator byte
  • Mandatory: no, default is false

eos-error

  • Contents: boolean
  • Purpose: allow a missing terminator to be ignored (disabling error reporting)
  • Influences:
    • normally (if eos-error is true), reaching the end of the stream without encountering the terminator byte raises an end-of-stream exception;
    • if eos-error is false, string reading stops successfully when either:
      • the terminator is encountered, or
      • the end of the stream is reached
  • Mandatory: no, default is true
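Putting the strz keys together, a line of text whose final line feed may be missing could be sketched as below. The attribute name, the seq wrapper, and the use of encoding with strz are assumptions; consume and include are spelled out with their default values only to make the behaviour explicit.

```yaml
# Hypothetical fragment: a line that may lack its trailing line feed.
seq:
  - id: last_line
    type: strz
    encoding: ASCII
    terminator: 0x0a
    consume: true     # stream pointer ends up after the terminator byte
    include: false    # the terminator byte is not part of the string
    eos-error: false  # reaching end of stream without a terminator is not an error
```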

User-specified types
