You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There are several issues with the "p" format case of the struct.pack() function. The documentation is unclear about the meaning of the $count value. Furthermore, the function may even end up writing more than 255 characters, which is in conflict with the documentation, and even with the purpose of the "p" case (short pascal strings) - i.e. an actual ERROR/BUG.
The documentation on section (8): the "p" format case is quite unclear in regards to the $count value. The contents of that section could be rephrased as (to reflect my current understanding):
the 1st byte stored is the length of the string, or 255, whichever is smaller. the bytes of the string follow.
if the string is too long (longer than $count-1), only the leading ($count-1) bytes are stored.
if the string is shorter than ($count-1), it is padded with null bytes so that exactly $count bytes in all are used.
note that for unpack(), the "p" format character consumes $count bytes, but the string returned can never contain more than 255 bytes.
NOTE that the documentation does not clearly state that, with pack(), $count needs to also include the length byte. Only (4) provides some hint. In short: If you intend to write $num character bytes, then $count needs to have a value of ($num+1).
#- the output buffer is empty ...buf=struct.pack("<0p", b'a')
print(buf) #- b''
Even though this is not a strict error, the response to a format string "0p" should not be an empty buffer. Python should either raise an error, or write a length value of 0. That is because writing no output at all will most likely break the programmer's intended output.
NOTE that (1) states that the $length value (not the $count) value is defined to be within the range (0,254). Apart from the $length value, the pack() function will therefore only write up to 254 character bytes. The returned buffer is therefore restricted to a maximum of 255 bytes. That aspect seems to be related to (4).
There should be a clear warning that the "p" case will not write more than 254 character bytes.
The mention of the numeric value 255 (\xff) is misleading. If the pack() function only writes strings of 0 to 254 characters, then the length value should only ever hold a numeric value within the range (0,254).
Either that, or it needs to be explained what the meaning of that value is.
#- an input string in excess of 254 character bytesinput=b'12345678901234567890123456789012345678901234567890' \
b'12345678901234567890123456789012345678901234567890' \
b'12345678901234567890123456789012345678901234567890' \
b'12345678901234567890123456789012345678901234567890' \
b'12345678901234567890123456789012345678901234567890' \
b'1234567890'num=len(input) #- 260#- print(buf) => b'\xff# 12345678901234567890123456789012345678901234567890# 12345678901234567890123456789012345678901234567890# 12345678901234567890123456789012345678901234567890# 12345678901234567890123456789012345678901234567890# 12345678901234567890123456789012345678901234567890# 1234567890\x00'buf=struct.pack("<262p", input)
print(buf)
#- ERROR - this assert() will trigger (!)num=len(buf) #- 262 (!)assert (num<=255)
If pack() is instructed to write 261 characters, then 261 character bytes plus one length byte with the numeric value 255 (\xff) will be written. Obviously, a reader process has no means to determine just how many character bytes follow the length value. Because of that, this output can only be considered broken.
Since section (9) states that the $count value in the "s" case has a different meaning, one could assume that the $count value for the "p" case could be "redefined" such that it refers to the number of character bytes one intends to write. However, that is a matter for discussion.
Since freepascal.org does not mention that a length value of 255 needs to be understood to introduce an "extended" encoding (e.g. introduces a subsequent short-encoded length value), I can only assume that the restriction to 254 character bytes is due to the underlying implementation. The question is, if that is a hard limit that cannot be changed?
Either way, the pack() function in the "p" case should only write its output such that the $length value accurately reflects the amount of character bytes that follow.
CPython versions tested on:
3.10
Operating systems tested on:
Linux
The text was updated successfully, but these errors were encountered:
Bug report
Bug description:
There are several issues with the "p" format case of the struct.pack() function. The documentation is unclear about the meaning of the $count value. Furthermore, the function may even end up writing more than 255 characters, which is in conflict with the documentation, and even with the purpose of the "p" case (short pascal strings) - i.e. an actual ERROR/BUG.
The documentation on section (8): the "p" format case is quite unclear in regards to the $count value. The contents of that section could be rephrased as (to reflect my current understanding):
NOTE that the documentation does not clearly state that, with pack(), $count needs to also include the length byte. Only (4) provides some hint. In short: If you intend to write $num character bytes, then $count needs to have a value of ($num+1).
Even though this is not a strict error, the response to a format string "0p" should not be an empty buffer. Python should either raise an error, or write a length value of 0. That is because writing no output at all will most likely break the programmer's intended output.
NOTE that (1) states that the $length value (not the $count) value is defined to be within the range (0,254). Apart from the $length value, the pack() function will therefore only write up to 254 character bytes. The returned buffer is therefore restricted to a maximum of 255 bytes. That aspect seems to be related to (4).
If pack() is instructed to write 261 characters, then 261 character bytes plus one length byte with the numeric value 255 (\xff) will be written. Obviously, a reader process has no means to determine just how many character bytes follow the length value. Because of that,
this output can only be considered broken
.Since section (9) states that the $count value in the "s" case has a different meaning, one could assume that the $count value for the "p" case could be "redefined" such that it refers to the number of character bytes one intends to write. However, that is a matter for discussion.
Since freepascal.org does not mention that a length value of 255 needs to be understood to introduce an "extended" encoding (e.g. introduces a subsequent short-encoded length value), I can only assume that the restriction to 254 character bytes is due to the underlying implementation. The question is, if that is a hard limit that cannot be changed?
Either way, the pack() function in the "p" case should only write its output such that the $length value accurately reflects the amount of character bytes that follow.
CPython versions tested on:
3.10
Operating systems tested on:
Linux
The text was updated successfully, but these errors were encountered: