Network Working Group S. McQuistin
Internet-Draft V. Band
Intended status: Experimental D. Jacob
Expires: 26 October 2020 C. S. Perkins
University of Glasgow
24 April 2020
Describing Protocol Data Units with Augmented Packet Header Diagrams
draft-mcquistin-augmented-ascii-diagrams-04
Abstract
This document describes a machine-readable format for specifying the
syntax of protocol data units within a protocol specification. This
format is comprised of a consistently formatted packet header
diagram, followed by structured explanatory text. It is designed to
maintain human readability while enabling support for automated
parser generation from the specification document. This document is
itself an example of how the format can be used.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 26 October 2020.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
McQuistin, et al. Expires 26 October 2020 [Page 1]
Internet-Draft Augmented Packet Diagrams April 2020
extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Background
2.1. Limitations of Current Packet Format Diagrams
2.2. Formal languages in standards documents
3. Design Principles
4. Augmented Packet Header Diagrams
4.1. PDUs with Fixed and Variable-Width Fields
4.2. PDUs That Cross-Reference Previously Defined Fields
4.3. PDUs with Non-Contiguous Fields
4.4. PDUs with Constraints on Field Values
4.5. PDUs That Extend Sub-Structures
4.6. Storing Data for Parsing
4.7. Connecting Structures with Functions
4.8. Specifying Enumerated Types
4.9. Specifying Protocol Data Units
4.10. Importing PDU Definitions from Other Documents
5. Open Issues
6. IANA Considerations
7. Security Considerations
8. Acknowledgements
9. Informative References
Appendix A. ABNF specification
A.1. Constraint Expressions
A.2. Augmented packet diagrams
Appendix B. Source code repository
Authors' Addresses
1. Introduction
Packet header diagrams have become a widely used format for
describing the syntax of binary protocols. In otherwise largely
textual documents, they allow for the visualisation of packet
formats, reducing human error, and aiding in the implementation of
parsers for the protocols that they specify.
Figure 1 gives an example of how packet header diagrams are used to
define binary protocol formats. The format has an obvious structure:
the diagram clearly delineates each field, showing its width and its
position within the header. This type of diagram is designed for
human readers, but is consistent enough that it should be possible to
develop a tool that generates a parser for the packet format from the
diagram.
:  0                   1                   2                   3
:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |          Source Port          |       Destination Port        |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                        Sequence Number                        |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                    Acknowledgment Number                      |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |  Data |           |U|A|P|R|S|F|                               |
: | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
: |       |           |G|K|H|T|N|N|                               |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |           Checksum            |         Urgent Pointer        |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                    Options                    |    Padding    |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                             data                              |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: TCP's header format (from [RFC793])
Unfortunately, the format of such packet diagrams varies both within
and between documents. This variation makes it difficult to build
tools to generate parsers from the specifications. Better tooling
could be developed if protocol specifications adopted a consistent
format for their packet descriptions. Indeed, this underpins the
format described by this draft: we want to retain the benefits that
packet header diagrams provide, while identifying the benefits of
adopting a consistent format.
This document describes a consistent packet header diagram format and
accompanying structured text constructs that allow for the parsing
process of protocol headers to be fully specified. This provides
support for the automatic generation of parser code. Broad design
principles, that seek to maintain the primacy of human readability
and flexibility in writing, are described, before the format itself
is given.
This document is itself an example of the approach that it describes,
with the packet header diagrams and structured text format described
by example. Examples that do not form part of the protocol
description language are marked by a colon at the beginning of each
line; this prevents them from being parsed by the accompanying
tooling.
This draft describes early work. As consensus builds around the
particular syntax of the format described, both a formal ABNF
specification (Appendix A) and code (Appendix B) that parses it (and,
as described above, this document) will be provided.
2. Background
This section begins by considering how packet header diagrams are
used in existing documents. This exposes the limitations that the
current usage has in terms of machine-readability, guiding the design
of the format that this document proposes.
While this document focuses on the machine-readability of packet
format diagrams, this section also discusses the use of other
structured or formal languages within IETF documents. Considering
how and why these languages are used provides an instructive contrast
to the relatively incremental approach proposed here.
2.1. Limitations of Current Packet Format Diagrams
: The RESET_STREAM frame is as follows:
:
:  0                   1                   2                   3
:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                        Stream ID (i)                        ...
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |  Application Error Code (16)  |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |                        Final Size (i)                       ...
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:
: RESET_STREAM frames contain the following fields:
:
: Stream ID: A variable-length integer encoding of the Stream ID
: of the stream being terminated.
:
: Application Protocol Error Code: A 16-bit application protocol
: error code (see Section 20.1) which indicates why the stream
: is being closed.
:
: Final Size: A variable-length integer indicating the final size
: of the stream by the RESET_STREAM sender, in unit of bytes.
Figure 2: QUIC's RESET_STREAM frame format (from [QUIC-TRANSPORT])
Packet header diagrams are frequently used in IETF standards to
describe the format of binary protocols. While there is no standard
for how these diagrams should be formatted, they have a broadly
similar structure, where the layout of a protocol data unit (PDU) or
structure is shown in diagrammatic form, followed by a description
list of the fields that it contains. An example of this format,
taken from the QUIC specification, is given in Figure 2.
These packet header diagrams, and the accompanying descriptions, are
formatted for human readers rather than for automated processing. As
a result, while there is rough consistency in how packet header
diagrams are formatted, there are a number of limitations that make
them difficult to work with programmatically:
Inconsistent syntax: There are two classes of consistency that are
needed to support automated processing of specifications: internal
consistency within a diagram or document, and external consistency
across all documents.
Figure 2 gives an example of internal inconsistency. Here, the
packet diagram shows a field labelled "Application Error Code",
while the accompanying description lists the field as "Application
Protocol Error Code". The use of an abbreviated name is suitable
for human readers, but makes parsing the structure difficult for
machines. Figure 3 gives a further example, where the description
includes an "Option-Code" field that does not appear in the packet
diagram; and where the description states that each field is 16
bits in length, but the diagram shows the OPTION_RELAY_PORT as 13
bits, and Option-Len as 19 bits. Another example is [RFC6958],
where the packet format diagram showing the structure of the
Burst/Gap Loss Metrics Report Block shows the Number of Bursts
field as being 12 bits wide but the corresponding text describes
it as 16 bits.
Comparing Figure 2 with Figure 3 exposes external inconsistency
across documents. While the packet format diagrams are broadly
similar, the surrounding text is formatted differently. If
machine parsing is to be made possible, then this text must be
structured consistently.
Ambiguous constraints: The constraints that are enforced on a
particular field are often described ambiguously, or in a way that
cannot be parsed easily. In Figure 3, each of the three fields in
the structure is constrained. The first two fields ("Option-Code"
and "Option-Len") are to be set to constant values (note the
inconsistency in how these constraints are expressed in the
description). However, the third field ("Downstream Source Port")
can take a value from a constrained set. This constraint is
expressed in prose that cannot readily be understood by machine.
Poor linking between sub-structures: Protocol data units and other
structures are often comprised of sub-structures that are defined
elsewhere, either in the same document, or within another
document. Chaining these structures together is essential for
machine parsing: the parsing process for a protocol data unit is
only fully expressed if all elements can be parsed.
Figure 2 highlights the difficulty that machine parsers have in
chaining structures together. Two fields ("Stream ID" and "Final
Size") are described as being encoded as variable-length integers;
this is a structure described elsewhere in the same document.
Structured text is required both alongside the definition of the
containing structure and with the definition of the sub-structure,
to allow a parser to link the two together.
Lack of extension and evolution syntax: Protocols are often
specified across multiple documents, either because the protocol
explicitly includes extension points (e.g., profiles and payload
format specifications in RTP [RFC3550]) or because definition of a
protocol data unit has changed and evolved over time. As a
result, it is essential that syntax be provided to allow for a
complete definition of a protocol's parsing process to be
constructed across multiple documents.
: The format of the "Relay Source Port Option" is shown below:
:
:  0                   1                   2                   3
:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |    OPTION_RELAY_PORT    |         Option-Len                  |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: |    Downstream Source Port     |
: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
:
: Where:
:
: Option-Code: OPTION_RELAY_PORT. 16-bit value, 135.
:
: Option-Len: 16-bit value to be set to 2.
:
: Downstream Source Port: 16-bit value. To be set by the IPv6
: relay either to the downstream relay agent's UDP source port
: used for the UDP packet, or to zero if only the local relay
: agent uses the non-DHCP UDP port (not 547).
Figure 3: DHCPv6's Relay Source Port Option (from [RFC8357])
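The width mismatch in Figure 3 is the kind of internal inconsistency that a simple tool could detect. The following Python sketch is illustrative only and is not part of the format described in this document. It assumes the common layout in which each bit occupies two characters of a diagram row. The helper name field_widths, and the row string rebuilt from the widths stated in Section 2.1, are hypothetical.

```python
def field_widths(row: str) -> list[tuple[str, int]]:
    """Return (label, width in bits) for each field in one diagram row.

    Assumes each bit occupies two characters, so a field bounded by '|'
    at columns a and b is (b - a) // 2 bits wide.
    """
    pipes = [i for i, ch in enumerate(row) if ch == "|"]
    return [(row[a + 1:b].strip(), (b - a) // 2)
            for a, b in zip(pipes, pipes[1:])]


# First row of the Figure 3 diagram, rebuilt from the widths that the
# text attributes to it (13 and 19 bits); an assumption, not a quotation.
row = "|" + "OPTION_RELAY_PORT".center(25) + "|" + "Option-Len".center(37) + "|"

# Widths stated in the accompanying description list.
described = {"OPTION_RELAY_PORT": 16, "Option-Len": 16}

for label, bits in field_widths(row):
    if bits != described[label]:
        print(f"{label}: diagram shows {bits} bits, text says {described[label]}")
```

Run against this row, the check reports both fields, since the diagram gives 13 and 19 bits where the description gives 16 for each.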
2.2. Formal languages in standards documents
A small proportion of IETF standards documents contain structured and
formal languages, including ABNF [RFC5234], ASN.1 [ASN1], C, CBOR
[RFC7049], JSON, the TLS presentation language [RFC8446], YANG models
[RFC7950], and XML. While this broad range of languages may be
problematic for the development of tooling to parse specifications,
these, and other, languages serve a range of different use cases.
ABNF, for example, is typically used to specify text protocols, while
ASN.1 is used to specify data structure serialisation. This document
specifies a structured language for specifying the parsing of binary
protocol data units.
3. Design Principles
The use of structures that are designed to support machine
readability might potentially interfere with the existing ways in
which protocol specifications are used and authored. To the extent
that these existing uses are more important than machine readability,
such interference must be minimised.
In this section, the broad design principles that underpin the format
described by this document are given. However, these principles
apply more generally to any approach that introduces structured and
formal languages into standards documents.
It should be noted that these are design principles: they expose the
trade-offs that are inherent within any given approach. Violating
these principles is sometimes necessary and beneficial, and this
document sets out the potential consequences of doing so.
The central tenet that underpins these design principles is a
recognition that the standardisation process is not broken, and so
does not need to be fixed. Failure to recognise this will likely
lead to approaches that are incompatible with the standards process,
or that will see limited adoption. However, the standards process
can be improved with appropriate approaches, as guided by the
following broad design principles:
Most readers are human: Primarily, standards documents should be
written for people, who require text and diagrams that they can
understand. Structures that cannot be easily parsed by people
should be avoided, and if included, should be clearly delineated
from human-readable content.
Any approach that shifts this balance -- that is, that primarily
targets machine readers -- is likely to be disruptive to the
standardisation process, which relies upon discussion centered
around documents written in prose.
Writing tools are diverse: Standards document writing is a
distributed process that involves a diverse set of tools and
workflows. The introduction of machine-readable structures into
specifications should not require that specific tools are used to
produce standards documents, to ensure that disruption to existing
workflows is minimised. This does not preclude the development of
optional, supplementary tools that aid in authoring
machine-readable structures.
The immediate impact of requiring specific tooling is that
adoption is likely to be limited. A long-term impact might be
that authors whose workflows are incompatible might be alienated
from the process.
Canonical specifications: As far as possible, machine-readable
structures should not replicate the human readable specification
of the protocol within the same document. Machine-readable
structures should form part of a canonical specification of the
protocol. Adding supplementary machine-readable structures, in
parallel to the existing human readable text, is undesirable
because it creates the potential for inconsistency.
As an example, program code that describes how a protocol data
unit can be parsed might be provided as an appendix within a
standards document. This code would provide a specification of
the protocol that is separate to the prose description in the main
body of the document. This has the undesirable effect of
introducing the potential for the program code to specify
behaviour that the prose-based specification does not, and vice
versa.
Expressiveness: Any approach should be expressive enough to capture
the syntax and parsing process for the majority of binary
protocols. If a given language is not sufficiently expressive,
then adoption is likely to be limited. At the limits of what can
be expressed by the language, authors are likely to revert to
defining the protocol in prose: this undermines the broad goal of
using structured and formal languages. Equally, though,
understandable specifications and ease of use are critical for
adoption. A tool that is simple to use and addresses the most
common use cases might be preferred to a complex tool that
addresses all use cases.
It may be desirable to restrict expressiveness, however, to
guarantee intrinsic safety, security, and computability properties
of both the generated parser code for the protocol, and the parser
of the description language itself. In much the same way as the
language-theoretic security ([LANGSEC]) community advocates for
programming language design to be informed by the desired
properties of the parsers for those languages, protocol designers
should be aware of the implications of their design choices. The
expressiveness of the protocol description languages that they use
to define their protocols can force such awareness.
Broadly, those languages that have grammars which are more
expressive tend to have parsers that are more complex and less
safe. As a result, while considering the other goals described in
this document, protocol description languages should attempt to be
minimally expressive, and either restrict protocol designs to
those for which safe and secure parsers can be generated, or as a
minimum, ensure that protocol designers are aware of the
boundaries their designs cross, in terms of computability and
decidability [SASSAMAN].
Minimise required change: Any approach should require as few changes
as possible to the way that documents are formatted, authored, and
published. Forcing adoption of a particular structured or formal
language is incompatible with the IETF's standardisation process:
there are very few components of standards documents that are non-
optional.
4. Augmented Packet Header Diagrams
The design principles described in Section 3 can largely be met by
the existing uses of packet header diagrams. These diagrams aid
human readability, do not require new or specialised tools to write,
do not split the specification into multiple parts, can express most
binary protocol features, and require no changes to existing
publication processes.
However, as discussed in Section 2.1 there are limitations to how
packet header diagrams are used that must be addressed if they are to
be parsed by machine. In this section, an augmented packet header
diagram format is described.
The concept is first illustrated by example. This is appropriate,
given the visual nature of the language. In future drafts, these
examples will be parsable using provided tools, and a formal
specification of the augmented packet diagrams will be given in
Appendix A.
4.1. PDUs with Fixed and Variable-Width Fields
The simplest PDU is one that contains only a set of fixed-width
fields in a known order, with no optional fields or variation in the
packet format.
Some packet formats include variable-width fields, where the size of
a field is either derived from the value of some previous field, or
is unspecified and inferred from the total size of the packet and the
size of the other fields.
To ensure that there is no ambiguity, a PDU description can contain
only one field whose length is unspecified. The length of a single
field, where all other fields are of known (but perhaps variable)
length, can be inferred from the total size of the containing PDU.
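The inference rule above can be sketched in a few lines of Python. This is a hedged illustration rather than the behaviour of any existing tool: the function name infer_unspecified_length and the use of None to mark an unspecified length are assumptions made for the example.

```python
from typing import Optional


def infer_unspecified_length(
    total_bits: int, field_bits: dict[str, Optional[int]]
) -> dict[str, int]:
    """Resolve the one field whose length is unspecified (None).

    All other lengths must already be known for this particular PDU,
    even if they vary from one PDU to the next.
    """
    unknown = [name for name, bits in field_bits.items() if bits is None]
    if len(unknown) > 1:
        raise ValueError(f"ambiguous: {unknown} all have unspecified length")

    known = sum(bits for bits in field_bits.values() if bits is not None)
    if known > total_bits:
        raise ValueError("known fields exceed the size of the PDU")

    resolved = {n: b for n, b in field_bits.items() if b is not None}
    if unknown:
        resolved[unknown[0]] = total_bits - known
    return resolved
```

For a 64-bit PDU with two known 8-bit fields, the remaining field resolves to 48 bits. A second unspecified field is rejected, because the remaining space could then be divided in more than one way.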
A PDU description is introduced by the exact phrase "A/An _______ is
formatted as follows:" at the end of a paragraph. This is followed
by the PDU description itself, as a packet diagram within an
<artwork> element in the XML representation, starting with a header
line to show the bit width of the diagram. The description of the
fields follows the diagram, as an XML <dl> list, after a paragraph
containing the text "where:".
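As a rough illustration of how a tool might recognise the introductory phrase, the sketch below matches it at the end of a paragraph. The regular expression and the function name pdu_name are assumptions made for this example and are not taken from the ABNF in Appendix A.

```python
import re

# Hypothetical pattern: an article, the PDU name, then the exact phrase
# at the very end of the paragraph. The name may not contain '.' or ':',
# so a match cannot start in an earlier sentence.
PDU_INTRO = re.compile(
    r"\b(?:A|An) (?P<name>[^.:]+?) is formatted as follows:\s*$"
)


def pdu_name(paragraph: str):
    """Return the PDU name that a paragraph introduces, or None."""
    normalised = " ".join(paragraph.split())  # undo line wrapping
    match = PDU_INTRO.search(normalised)
    return match.group("name") if match else None
```

Collapsing whitespace before matching means the phrase is still found when it is wrapped across lines.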
PDU names must be unique, both within a document, and across all
documents that are linked together (i.e., using the structured
language defined in Section 4.10).
Each field of the description starts with a <dt> tag comprising the
field name and an optional short name in parentheses. These are
followed by a colon, the field length, an optional presence
expression (described in Section 4.2), and a terminating period. The
following <dd> tag contains a prose description of the field. Field
names cannot be the same as a previously defined PDU name, and must
be unique within a given structure definition.
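The field syntax just described can be approximated with a short Python sketch. It is a simplification under stated assumptions: the pattern FIELD_HEADER and the names FieldHeader, parse_field_header and duplicate_labels are hypothetical, the length is kept as an unevaluated string, and any presence expression is left inside that string rather than parsed separately.

```python
import re
from typing import NamedTuple, Optional


class FieldHeader(NamedTuple):
    name: str
    short_name: Optional[str]
    length: str  # unevaluated, e.g. "(IHL-5)*32 bits"


FIELD_HEADER = re.compile(
    r"^(?P<name>[^:(]+?)"            # full field name
    r"(?:\s*\((?P<short>[^)]+)\))?"  # optional short name in parentheses
    r":\s*(?P<length>[^.]+)\."       # colon, length, terminating period
)


def parse_field_header(text: str) -> FieldHeader:
    """Split the start of a field description into name, short name, length."""
    match = FIELD_HEADER.match(text.strip())
    if match is None:
        raise ValueError(f"not a field header: {text!r}")
    return FieldHeader(match["name"].strip(), match["short"],
                       match["length"].strip())


def duplicate_labels(headers: list[FieldHeader]) -> list[str]:
    """Return full or short labels used more than once in one structure."""
    seen, duplicates = set(), []
    for header in headers:
        for label in filter(None, (header.name, header.short_name)):
            if label in seen:
                duplicates.append(label)
            seen.add(label)
    return duplicates
```

A checker built along these lines could reject a structure in which two fields share a label, which is the uniqueness requirement stated above.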
For example, this can be illustrated using the IPv4 Header Format
[RFC791]. An IPv4 Header is formatted as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |    DSCP   |ECN|         Total Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Options                           ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               :
:                            Payload                            :
:                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
Version (V): 4 bits. This is a fixed-width field, whose full label
is shown in the diagram. The field's width -- 4 bits -- is given
in the label of the description list, separated from the field's
label by a colon.
Internet Header Length (IHL): 4 bits. This is a shorter field, whose
full label is too large to be shown in the diagram. A short label
(IHL) is used in the diagram, and this short label is provided, in
brackets, after the full label in the description list.
Differentiated Services Code Point (DSCP): 6 bits. This is a fixed-
width field, as previously discussed.
Explicit Congestion Notification (ECN): 2 bits. This is a fixed-
width field, as previously discussed.
Total Length (TL): 2 bytes. This is a fixed-width field, as
previously discussed. Where fields are an integral number of
bytes in size, the field length can be given in bytes rather than
in bits.
Identification: 2 bytes. This is a fixed-width field, as previously
discussed.
Flags: 3 bits. This is a fixed-width field, as previously discussed.
Fragment Offset: 13 bits. This is a fixed-width field, as previously
discussed.
Time to Live (TTL): 1 byte. This is a fixed-width field, as
previously discussed.
Protocol: 1 byte. This is a fixed-width field, as previously
discussed.
Header Checksum: 2 bytes. This is a fixed-width field, as previously
discussed.
Source Address: 32 bits. This is a fixed-width field, as previously
discussed.
Destination Address: 32 bits. This is a fixed-width field, as
previously discussed.
Options: (IHL-5)*32 bits. This is a variable-length field, whose
length is defined by the value of the field with short label IHL
(Internet Header Length). Constraint expressions can be used in
place of constant values: the grammar for the expression language
is defined in Appendix A.1. Constraints can include a previously
defined field's short or full label, where one has been defined.
Short variable-length fields are indicated by "..." instead of a
pipe at the end of the row.
Payload: TL - ((IHL*32)/8) bytes. This is a multi-row variable-
length field, constrained by the values of fields TL and IHL.
Instead of the "..." notation, ":" is used to indicate that the
field is variable-length. The use of ":" instead of "..."
indicates the field is likely to be a longer, multi-row field.
However, semantically, there is no difference: these different
notations are for the benefit of human readers.
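To make the two length expressions concrete, the following Python sketch decodes an IPv4 Header by hand, following the field list above. It shows what a generated parser might do, not the output of the accompanying tooling. The function name parse_ipv4 and the FIXED_FIELDS table are assumptions, and the two constraint expressions are transcribed directly rather than evaluated from the grammar in Appendix A.1.

```python
# (label, width in bits) for the fixed-width fields, in diagram order.
# Short labels are used where the description list defines one.
FIXED_FIELDS = [
    ("V", 4), ("IHL", 4), ("DSCP", 6), ("ECN", 2), ("TL", 16),
    ("Identification", 16), ("Flags", 3), ("Fragment Offset", 13),
    ("TTL", 8), ("Protocol", 8), ("Header Checksum", 16),
    ("Source Address", 32), ("Destination Address", 32),
]


def parse_ipv4(packet: bytes) -> dict:
    """Decode an IPv4 Header, resolving the variable-length fields."""
    bits = int.from_bytes(packet, "big")
    total = len(packet) * 8
    pos = 0

    def take(width: int) -> int:
        """Consume the next `width` bits, most significant bit first."""
        nonlocal pos
        shift = total - pos - width
        if shift < 0:
            raise ValueError("packet too short")
        pos += width
        return (bits >> shift) & ((1 << width) - 1)

    fields = {name: take(width) for name, width in FIXED_FIELDS}

    # Options: (IHL-5)*32 bits.
    options_bits = (fields["IHL"] - 5) * 32
    # Payload: TL - ((IHL*32)/8) bytes.
    payload_bytes = fields["TL"] - (fields["IHL"] * 32) // 8
    if options_bits < 0 or payload_bytes < 0:
        raise ValueError("IHL or TL is inconsistent with the format")

    fields["Options"] = take(options_bits).to_bytes(options_bits // 8, "big")
    fields["Payload"] = take(payload_bytes * 8).to_bytes(payload_bytes, "big")
    return fields
```

With IHL set to 5 the Options field is empty, and the Payload length is the Total Length minus the 20-byte fixed header. Each extra unit of IHL adds 32 bits of Options.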
4.2. PDUs That Cross-Reference Previously Defined Fields
Binary formats often reference sub-structures that have been defined
earlier in the specification. For example, in RTP [RFC3550], the
Contributing Source Identifiers in an RTP Data Packet are defined as
comprising a list of Source Identifier elements. A Source Identifier
is formatted as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
SSRC: 32 bits. This is a fixed-width field, as described previously.
The following example shows how a Source Identifier can be referenced
in the description of an RTP Data Packet. It also shows how the
presence of some fields in a format may be dependent on the values of
an earlier field.
An RTP Data Packet is formatted as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |P|X|  CC   |M|     PT      |       Sequence Number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Synchronization Source identifier               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               [Contributing Source identifiers]               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Header Extension                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Payload                            :
:                                                               :
:                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Padding                    | Padding Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
Version (V): 2 bits. This is a fixed-width field, as described
previously.
Padding (P): 1 bit. This is a fixed-width field, as described
previously.
Extension (X): 1 bit. This is a fixed-width field, as described
previously.
CSRC count (CC): 4 bits. This is a fixed-width field, as described
previously.
Marker (M): 1 bit. This is a fixed-width field, as described
previously.
Payload Type (PT): 7 bits. This is a fixed-width field, as described
previously.
Sequence Number: 16 bits. This is a fixed-width field, as described
previously.
Timestamp: 32 bits. This is a fixed-width field, as described
previously.
Synchronization Source identifier: 1 * Source Identifier. This is a
field whose structure is a previously defined PDU format (Source
Identifier). To indicate this, the width of the field is
expressed in terms of the cross-referenced structure. When used in
constraint expressions, PDU names refer to the length of that PDU
structure.
Contributing Source identifiers: CC * Source Identifier. Where a
field is comprised of a sequence of previously defined structures,
square brackets can be used to indicate this in the diagram. The
length of the sequence can be defined using the constraint
expression grammar as described earlier.
In this example, both a PDU name (Source Identifier) and a field
name (CC) are used in the constraint expression. The PDU name
refers to the length of the PDU, while the field name refers to
the value of the field. This is possible because field names
cannot be the same as previously defined PDU names.
Header Extension: 32 bits; present only when X == 1. This is a field
whose presence is predicated on an expression given using the
constraint expression grammar described earlier. Optional fields
can be of any previously defined format (e.g., fixed- or variable-
width). Optional fields are indicated by the presence of ";
present only when [expr]." at the end of the definition term
(i.e., the text contained within the <dt> tag).
[Note that this example deviates from the format as described in
[RFC3550]. As specified in that document, the Header Extension
would be a cross-referenced structure. This is not shown here for
brevity.]
Payload. The length of the Payload is not specified, and hence needs
to be inferred from the total length of the packet and the lengths
of the known fields. There can only be one field of unspecified
size in a PDU.
Padding: Padding Count bytes; present only when (P == 1) and
(Padding Count > 0).
This is a variable size field, with size dependent on a later
field in the packet. Fields can only depend on the value of a
later field if they follow a field with unspecified size.
Padding Count: 1 byte; present only when P == 1. This is a fixed-
width field, as previously discussed.
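The field descriptions above can be read as a parsing procedure. The
following Python sketch is one possible reading, not a normative
parser: it follows the simplified layout given in this section rather
than the full rules of [RFC3550], and the function name parse_rtp and
the dictionary keys are hypothetical. It shows a repeated
cross-referenced structure (CC * Source Identifier), a field whose
presence depends on an earlier field (Header Extension), a field
sized by a later field (Padding), and a Payload whose length is
inferred.

```python
import struct

SOURCE_IDENTIFIER_LEN = 4  # a Source Identifier is a single 32-bit SSRC


def parse_rtp(packet: bytes) -> dict:
    """Parse the simplified RTP Data Packet layout described above."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed-width fields")
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet, 0)
    f = {
        "V": b0 >> 6, "P": (b0 >> 5) & 1, "X": (b0 >> 4) & 1, "CC": b0 & 0x0F,
        "M": b1 >> 7, "PT": b1 & 0x7F,
        "Sequence Number": seq, "Timestamp": ts,
        "Synchronization Source identifier": ssrc,
    }
    # Contributing Source identifiers: CC * Source Identifier.
    csrc_end = 12 + f["CC"] * SOURCE_IDENTIFIER_LEN
    # Header Extension: 32 bits; present only when X == 1.
    ext_end = csrc_end + (4 if f["X"] == 1 else 0)
    # Padding Count: 1 byte; present only when P == 1.
    tail = 1 if f["P"] == 1 else 0
    if ext_end + tail > len(packet):
        raise ValueError("packet too short for the fields it announces")
    f["Contributing Source identifiers"] = [
        int.from_bytes(packet[i:i + 4], "big") for i in range(12, csrc_end, 4)
    ]
    if f["X"] == 1:
        f["Header Extension"] = packet[csrc_end:ext_end]
    payload_end = len(packet)
    if f["P"] == 1:
        # Read the later field first, then size the earlier one from it.
        count = packet[-1]
        payload_end = len(packet) - 1 - count
        if payload_end < ext_end:
            raise ValueError("Padding Count exceeds the available bytes")
        f["Padding Count"] = count
        # Padding: present only when (P == 1) and (Padding Count > 0).
        if count > 0:
            f["Padding"] = packet[payload_end:-1]
    # Payload: whatever remains once every other field is accounted for.
    f["Payload"] = packet[ext_end:payload_end]
    return f
```

Because the Payload is the only field of unspecified size, the sketch
can locate Padding Count at the end of the packet and work backwards
to find where the Payload stops.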
4.3. PDUs with Non-Contiguous Fields
In some binary formats, fields are striped across multiple non-
contiguous bits. This is often to allow for backwards compatibility
with previous definitions of the same fields in earlier documents:
striping in this way allows for careful use of the possible range of
values.
This format is illustrated using the STUN Message Type
[draft-ietf-tram-stunbis-21]. A STUN Message Type is formatted as
follows:
 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|M|M|M|M|M|C|M|M|M|C|M|M|M|M|
|B|A|9|8|7|1|6|5|4|0|3|2|1|0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
Method (M): 12 bits. This field is composed of multiple sub-fields
(M0 through MB) as shown in the diagram. After parsing, these
sub-fields are concatenated into a single field. This is indicated
by labelling each sub-field with the 'M' short field name followed
by a single hexadecimal digit, with the least significant bit
labelled 0 and subsequent bits labelled in sequence.
Class (C): 2 bits. This field follows the same format as M described
above.
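The reassembly of striped fields can be sketched directly from the
two label rows of the diagram. In the Python example below, the
function name split_striped and the two string constants are
hypothetical; the strings are the label rows of the STUN Message Type
diagram above with the separators removed, most significant bit
first.

```python
def split_striped(value: int, names: str, digits: str) -> dict:
    """Reassemble striped fields from a diagram's two label rows.

    names and digits hold one character per bit, most significant bit
    first: the short field name and the hexadecimal bit index within it.
    """
    if len(names) != len(digits):
        raise ValueError("label rows must be the same width")
    width = len(names)
    fields = {}
    for pos, (name, digit) in enumerate(zip(names, digits)):
        bit = (value >> (width - 1 - pos)) & 1
        fields[name] = fields.get(name, 0) | (bit << int(digit, 16))
    return fields


# Label rows copied from the STUN Message Type diagram.
STUN_NAMES = "MMMMMCMMMCMMMM"
STUN_DIGITS = "BA987165403210"

print(split_striped(0x0111, STUN_NAMES, STUN_DIGITS))  # {'M': 1, 'C': 3}
```

Each bit is placed at the position given by its hexadecimal digit, so
the same routine handles both M and C without any field-specific
masks.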
4.4. PDUs with Constraints on Field Values
A PDU may be defined not only by the layout and type of its fields,
but also by the value of those fields. For example, field values may
be constrained to be of a known exact value or to be within a range.
More generally, our format enables a boolean expression to be
attached to a field, which must be true for the PDU to be parsed
successfully.
This format is illustrated using the QUIC Long Header Packet format
[QUIC-TRANSPORT]. A Long Header is formatted as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|1|1| T | R | P |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Version                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   DCID Len    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Destination Connection ID (DCID)              ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   SCID Len    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Source Connection ID (SCID)                 ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
Header Form (HF): 1 bit; HF == 1. This is a fixed-width field,
constrained to be of a known, exact value. At most one field
value constraint may be given, and if provided, it must be given
as a boolean expression, separated by a semi-colon in the field
definition name (i.e., the text contained within the <dt> tag).
If present, a value constraint must follow the name, short name,
and length of the field, but appear before any presence
constraint, if applicable.
The order of the fields must be the same in both the diagram and
description list.
Fixed Bit (FB): 1 bit; FB == 1. This is a fixed-width field, with a
value constraint, as previously described.
Long Packet Type (T): 2 bits. This is a fixed-width field as
previously described.
Reserved Bits (R): 2 bits. This is a fixed-width field as previously
described.
Packet Number Length (P): 2 bits. This is a fixed-width field as
previously described.
Version: 32 bits. This is a fixed-width field as previously
described.
DCID Len (DLen): 1 byte; DLen <= 20. This is a fixed-width field,
with a value constraint, as previously described. Note that the
constraint language is not limited to equality; it is defined
fully in Appendix A.1.
Destination Connection ID: DLen bytes. This is a variable-width
field as previously described.
SCID Len (SLen): 1 byte; SLen <= 20. This is a fixed-width field,
with a value constraint, as previously described.
Source Connection ID: SLen bytes. This is a variable-width field as
previously described.
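Value constraints of this kind translate into checks made while
parsing. The following Python sketch illustrates this for the Long
Header as described above; it is a sketch under stated assumptions,
not a QUIC implementation. The names ConstraintError, require and
parse_long_header are hypothetical, and parsing is taken to fail as
soon as a constraint evaluates to false.

```python
class ConstraintError(ValueError):
    """Raised when a field value constraint evaluates to false."""


def require(condition: bool, text: str) -> None:
    if not condition:
        raise ConstraintError(f"constraint not satisfied: {text}")


def parse_long_header(data: bytes) -> dict:
    """Parse a Long Header, checking each value constraint as it is met."""
    if len(data) < 5:
        raise ValueError("truncated Long Header")
    first = data[0]
    f = {"HF": first >> 7, "FB": (first >> 6) & 1}
    require(f["HF"] == 1, "HF == 1")
    require(f["FB"] == 1, "FB == 1")
    f["T"] = (first >> 4) & 3
    f["R"] = (first >> 2) & 3
    f["P"] = first & 3
    f["Version"] = int.from_bytes(data[1:5], "big")
    offset = 5
    for length_name, id_name in (("DLen", "DCID"), ("SLen", "SCID")):
        if offset >= len(data):
            raise ValueError("truncated Long Header")
        length = data[offset]
        require(length <= 20, f"{length_name} <= 20")
        offset += 1
        if offset + length > len(data):
            raise ValueError("truncated Long Header")
        f[length_name] = length
        # Variable-width field: its size is the value of the length field.
        f[id_name] = data[offset:offset + length]
        offset += length
    return f
```

A first byte of 0x40, for example, has HF equal to 0, so the sketch
rejects the input before any later field is examined.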
4.5. PDUs That Extend Sub-Structures
A PDU may not only use or reference existing sub-structures; it may
also extend them by adding new fields or by enforcing different or
additional constraints.
Where a sub-structure is extended, the diagram may show the sub-
structure as a block, labelled with the sub-structure name. It may
also be desirable to show the sub-structure diagram in full; in this
case, the fields must be given in the same order and be of the same
length. New field constraints can be shown. Similarly, in the
description list, those fields inherited without change (i.e., with
no change to their constraints) do not need to be repeated. Those
with different or additional constraints must be described, and the
order of the fields in the description list must match that of the
sub-structure and the containing structure.
This format is illustrated using the QUIC Retry Packet format
[QUIC-TRANSPORT]. A Retry Packet is formatted as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               :
:                          Long Header                          :
:                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Retry Token                         ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                      Retry Integrity Tag                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
where:
Long Header (LH): 1 Long Header; LH.T == 3. This field is a
previously defined sub-structure. Its constraints can access
fields in that sub-structure. In this example, the T field of the
Long Header must be equal to 3.
Retry Token. This is a variable-length field as previously defined.
Retry Integrity Tag: 128 bits. This is a fixed-width field as
previously defined.
As shown, the Long Header packet sub-structure is included. The
Retry Packet enforces a new value constraint on the Long Packet Type
(T) field.
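The effect of extending a sub-structure can be sketched as a parser
that takes the result of parsing the sub-structure and applies the
additional constraint before reading the new fields. In the Python
example below, parse_retry and its arguments are hypothetical: the
Long Header is assumed to have been parsed already into a dictionary
keyed by short field name, and the Retry Token, whose length is not
specified, is assumed to occupy everything before the 128-bit Retry
Integrity Tag.

```python
RETRY_INTEGRITY_TAG_LEN = 16  # 128 bits


def parse_retry(long_header: dict, remainder: bytes) -> dict:
    """Apply the Retry Packet's extra constraint and parse its new fields.

    long_header: fields of an already parsed Long Header (assumed input).
    remainder:   the bytes of the packet that follow the Long Header.
    """
    # Long Header (LH): 1 Long Header; LH.T == 3.
    if long_header["T"] != 3:
        raise ValueError("constraint not satisfied: LH.T == 3")
    if len(remainder) < RETRY_INTEGRITY_TAG_LEN:
        raise ValueError("too short to hold a Retry Integrity Tag")
    split = len(remainder) - RETRY_INTEGRITY_TAG_LEN
    return {
        "LH": long_header,
        # Length inferred from the packet length and the fixed-width tag.
        "Retry Token": remainder[:split],
        "Retry Integrity Tag": remainder[split:],
    }
```

The inherited fields are carried through unchanged; only the new
constraint on T and the two added fields are handled here.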
4.6. Storing Data for Parsing
The parsing process may require data from previously parsed
structures. This means that data needs to be stored persistently
throughout the process. This data needs to be identified.
That the value of a particular field be stored upon parsing is