Explore built-in support for ASN.1 #137
Replies: 7 comments
-
I just realized one other non-obvious consideration: a binary copy of the transmitted value into an int32& is not correct. For example, -1 is transmitted as a single byte (FF) and needs to be sign-extended before being stored into an int16 or int32. In the Unsigned and INTEGER types, every value is required to be transmitted in the minimum number of bytes. A leading FF byte is allowed in INTEGER only if the high bit H of the byte which follows is not set. Three-byte INTEGER values -8388608 thru -32769 do not transmit a leading fourth byte, which would be FF. Similarly, to be transmitted in the minimum number of bytes, the three-byte Unsigned values 65536 thru 16777215 and the three-byte INTEGER values 65536 thru 8388607 do not transmit a leading fourth byte, which would be 0. But INTEGER values 8388608 thru 16777215, which have the high bit H of their third byte (counting from the least significant byte) set, do transmit a leading fourth byte of 0, to distinguish them from negative numbers.
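To illustrate the sign-extension point, here is a minimal Python sketch. The helper names `sign_extend` and `to_int32` are made up for this example and are not part of Spicy or any ASN.1 library; the sketch only assumes big-endian two's-complement bytes as described above.

```python
import struct

def sign_extend(wire: bytes, width: int) -> bytes:
    """Pad a big-endian two's-complement value out to `width` bytes."""
    if not wire or len(wire) > width:
        raise ValueError("value is empty or does not fit in the target width")
    # Fill with FF when the sign bit of the first byte is set, otherwise with 00.
    fill = b"\xff" if wire[0] & 0x80 else b"\x00"
    return fill * (width - len(wire)) + wire

def to_int32(wire: bytes) -> int:
    """Decode a 1..4 byte INTEGER into a signed 32-bit value."""
    return struct.unpack(">i", sign_extend(wire, 4))[0]

# A plain zero-padded copy of the single byte FF reads back as 255, not -1.
naive = struct.unpack(">i", b"\xff".rjust(4, b"\x00"))[0]
```

With these definitions, `to_int32(b"\xff")` gives -1 while `naive` is 255, which is the mismatch described above.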
-
Thanks for context. Yeah, I can see providing this built-in. Need to think a bit about it, but a quick idea would be an &encoding attribute for integers, like:
That would take care of coercion and sign-extension as well. Essentially, using ASN1::INTEGER32 would parse it according to ASN.1 rules, and then do a standard assignment of the parsed value to the field. We could actually provide similar built-in support for other ASN.1 types as well (like strings). Would this work for you?
-
Yes, that would capture the essence, if |
-
Oh, but note: whatever syntax is chosen needs to convey a &size. ASN.1 INTEGER and Unsigned do not indicate where they end. FF can be -1; or it can be the first byte of FF 00, etc.
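A small Python sketch of why the length has to come from outside the value itself. The function `read_integer` is hypothetical; in a real grammar the `size` argument would come from the enclosing length field.

```python
from typing import Tuple

def read_integer(buf: bytes, offset: int, size: int) -> Tuple[int, int]:
    """Decode `size` bytes at `offset` as a signed big-endian INTEGER.

    Returns the decoded value and the offset of the next unread byte.
    """
    if size < 1 or offset + size > len(buf):
        raise ValueError("size is out of range for this buffer")
    raw = buf[offset:offset + size]
    return int.from_bytes(raw, "big", signed=True), offset + size

data = b"\xff\x00"
one_byte = read_integer(data, 0, 1)   # only FF is consumed
two_bytes = read_integer(data, 0, 2)  # FF 00 is consumed
```

The same leading byte yields -1 when the size is 1 and -256 when the size is 2, so the parser cannot guess where the value ends.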
-
@0xxon Any thoughts on built-in ASN.1 support for Spicy?
-
Sorry for my late reply. I actually had to spend a bit of time reading up on ASN.1 again (it has been a few years). Just to make sure that we are on the same page here: the problem that this wants to solve is that, under DER encoding rules, ASN.1 integers can take one more byte on the wire than the size they will have once decoded? If yes, I see the appeal of having a function that does this decoding. However, I am not really sure there is much of a reason to make it length-specific, that is, to have a separate decoder function for 8 bits, 16 bits, 24 bits, etc. I know there are some cases in which you do know the exact length, and that a field will be exactly 8 bits. But it seems to me that it is easier to just always decode these fields into a 64-bit integer. That way you can have a single decode function, which accepts everything up to 64 bits. For the encoder this should not matter, since with DER you will just emit everything in the lowest possible number of bytes. I am sure that you saw the way that this was done in the old bacnet analyzer, which kind of does go this way:
As a side note: for ASN.1 it is pretty typical to see integers larger than 64 bits on the wire. In theory it would be super neat to be able to support arbitrary-length integers ;)
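As a rough Python sketch of the "one decoder for everything up to 64 bits" idea: the name `decode_integer` and its `max_bytes` parameter are invented for this illustration and are not taken from the bacnet analyzer or from Spicy. Python integers are unbounded, so the 8-byte cap is enforced explicitly, and passing `max_bytes=None` shows the arbitrary-length case.

```python
from typing import Optional

def decode_integer(wire: bytes, max_bytes: Optional[int] = 8) -> int:
    """Decode a signed big-endian INTEGER of any length up to `max_bytes`.

    With max_bytes=None, no upper limit is applied. This sketch does not
    check that the sender used the minimal encoding.
    """
    if not wire:
        raise ValueError("an INTEGER needs at least one byte")
    if max_bytes is not None and len(wire) > max_bytes:
        raise ValueError("value does not fit in the target integer")
    return int.from_bytes(wire, "big", signed=True)

small = decode_integer(b"\x00\x80")                          # 2 bytes on the wire
big = decode_integer(b"\x01" + b"\x00" * 8, max_bytes=None)  # 9 bytes on the wire
```

Here `small` is 128 and `big` is 2**64; the second call would raise `ValueError` with the default 8-byte limit.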
-
Closing this since we'll now move zeek/spicy-ldap into Zeek proper with zeek/zeek#3234, and grammars of builtin analyzers like ASN.1 are available to users.
-
In BACnet and ASN.1, there is a set of numeric datatypes which are encoded on the
wire with variable length. All of them behave as ordinary numbers, just like the
existing Spicy types uint8, uint16, uint32, uint64, int8, int16, int32, and int64.
If we regard these BACnet and ASN.1 situations as worth the effort, we could make
them full-participation numeric datatypes in Spicy code, equal to those other
numerics, so that they can be used with the same intuitive syntax and with a
smaller cognitive load.
If we use the ASN.1 naming, they would be INTEGER, INTEGER8, INTEGER16,
INTEGER32, INTEGER64, Unsigned, Unsigned8, Unsigned16, Unsigned32, and Unsigned64.
If we wished, the grammar could even accept, wherever a datatype is expected, any
name that starts with INTEGER or Unsigned and continues immediately with a
decimal natural number indicating the maximum bit length.
The trailing number does not say how many bits the value occupies on the wire.
It only gives the maximum bit length of a value which might appear in context,
so that storage can be pre-allocated for an arriving value, or an arriving value
can be validated against a maximum and/or range, per specification.
The most significant byte is transmitted first. On the wire, no leading 0 byte
is allowed in Unsigned unless that is the only byte and the number is zero.
One leading 0 byte is allowed in INTEGER in two situations: if that is the only
byte and the number is zero, or if the number is positive and the high bit H of
the byte which follows is set. This leads to a frequently occurring case: a
signed 3-octet number with a leading 0, which is an arriving INTEGER between
32768 and 65535. The same case needs to be handled properly for any INTEGER in
the range 128 through 255, which similarly has a leading 0 and arrives as two
octets. Without that leading 0, the arriving bytes would express the negative
number which the same bit pattern has when stored as int8/int16/int32/int64.
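The encoding rules above can be sketched in Python as follows. The function names `encode_unsigned` and `encode_integer` are illustrative only; the sketch produces just the content bytes, with no tag or length.

```python
def encode_unsigned(value: int) -> bytes:
    """Minimal big-endian encoding of an Unsigned value."""
    if value < 0:
        raise ValueError("Unsigned cannot hold a negative number")
    # Zero still needs one byte; otherwise use just enough bytes for the bits.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")

def encode_integer(value: int) -> bytes:
    """Minimal big-endian two's-complement encoding of an INTEGER value."""
    # Count the magnitude bits plus one sign bit. For negative numbers the
    # magnitude bits are those of the bitwise complement.
    magnitude = value if value >= 0 else ~value
    bits = magnitude.bit_length() + 1
    length = (bits + 7) // 8
    return value.to_bytes(length, "big", signed=True)

# The leading-0 cases described above.
print(encode_integer(128).hex())    # 0080
print(encode_integer(32768).hex())  # 008000
print(encode_unsigned(32768).hex()) # 8000
```

The extra sign bit in `encode_integer` is what forces the leading 0 byte whenever the high bit of the first magnitude byte is set.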
One additional "gotcha" to beware of: INTEGER numbers above 2147483647 are
potentially a pathology. If sent, these will be encoded on the wire as five or
more arriving bytes. The number which that represents can be an INTEGER, but it
cannot be an INTEGER32, since the agreed specification of INTEGER32 is the range
-2147483648 thru 2147483647 and there is no way to store a larger number in the
storage pre-allocated to hold one. A similar "gotcha" exists at all the other
sizes as well.
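A short Python sketch of the range check this implies. The helper `fits` is hypothetical; the bit width corresponds to the trailing number in names such as INTEGER32 or Unsigned32.

```python
def fits(value: int, bits: int, signed: bool = True) -> bool:
    """Return True if `value` is within the range of an INTEGER<bits>
    (signed=True) or an Unsigned<bits> (signed=False)."""
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    return low <= value <= high

# Five arriving bytes: a valid INTEGER, but too large for INTEGER32.
arrived = int.from_bytes(b"\x00\x80\x00\x00\x00", "big", signed=True)
ok_as_integer32 = fits(arrived, 32)
ok_as_integer64 = fits(arrived, 64)
```

Here `arrived` is 2147483648, so `ok_as_integer32` is False and `ok_as_integer64` is True. A parser could reject the value or report a violation at that point, depending on what the specification asks for.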