client data values should be a list in the filter pipeline #32

Closed
bnlawrence opened this issue Jan 21, 2025 · 1 comment

bnlawrence commented Jan 21, 2025

The filter pipeline is something that in principle is modifiable outside of the hdf library (and hence outside of pyfive). It's certainly something we need to get at for PyActiveStorage.

The spec says within the filter description:

| Field Name | Description |
| --- | --- |
| Number of Client Data Values | Each filter can store integer values to control how the filter operates. The number of entries in the Client Data array is stored in this field. |
| Name | If the Name Length field is non-zero then it will contain the size of this field, not padded to a multiple of eight. This field contains a non-null-terminated, ASCII character string to serve as a comment/name for the filter. Filters that are defined in this format documentation (deflate, shuffle, etc.) do not store the Name Length or Name fields. |
| Client Data | This is an array of four-byte integers which will be passed to the filter function. The Client Data Number of Values determines the number of elements in the array. |

Which clearly suggests we should have an array of these things.

Our current code has this:

    ...
    num_client_values = struct.unpack_from('<H', self.msg_data, offset)[0]
    offset += 2
    name = ""
    if name_length > 0:
        name = struct.unpack_from("{:d}s".format(name_length), self.msg_data, offset)[0]
        offset += name_length
    filter_info['name'] = name
    client_values = struct.unpack_from("<{:d}i".format(num_client_values), self.msg_data, offset)
    offset += (4 * num_client_values)
    filter_info['client_data'] = client_values
    filter_info['client_data_values'] = num_client_values
    ...

which produces a tuple in filter_info['client_data'] even if there is only one value, but somewhere this is turned into an integer, so we can expect things to break (and they do).
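
To make the tuple behaviour concrete, here is a minimal standalone sketch. The `msg_data` bytes are made up for illustration and contain only a two-byte count followed by one four-byte value. This is a simplification, not the full filter description layout. It shows that `struct.unpack_from` returns a one-element tuple, and what goes wrong if that tuple is later collapsed to a bare integer.

```python
import struct

# Hypothetical, simplified message bytes: a two-byte count of client data
# values followed by that many four-byte little-endian integers.
msg_data = struct.pack('<H', 1) + struct.pack('<i', 4)

offset = 0
num_client_values = struct.unpack_from('<H', msg_data, offset)[0]
offset += 2
client_values = struct.unpack_from(
    "<{:d}i".format(num_client_values), msg_data, offset)

# unpack_from always returns a tuple, even for a single value.
print(type(client_values).__name__, client_values)

# If something later collapses the one-element tuple to a bare integer,
# code that treats the client data as a sequence fails.
collapsed = client_values[0]
try:
    len(collapsed)
except TypeError as err:
    print("bare int is not a sequence:", err)
```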

@bnlawrence

Oh, for Pete's sake, this is a consequence of the perfectly correct change we made in association with jjhelmus#66. So the problem is not here at all; it's in the things that made use of the wrong thing from pyfive because we previously had the wrong thing. Fixed downstream. No problems here.
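
For anyone hitting the same thing downstream, one possible guard is to normalise the client data before using it. This is only a sketch: `as_client_data_list` is a hypothetical helper, not part of pyfive and not necessarily how the downstream fix was done.

```python
from collections.abc import Iterable


def as_client_data_list(client_data):
    """Return client data as a list of ints.

    Accepts either a sequence of values (the tuple pyfive produces)
    or a single bare integer (the shape older code may have assumed).
    """
    if isinstance(client_data, Iterable):
        return [int(v) for v in client_data]
    return [int(client_data)]


print(as_client_data_list((4,)))
print(as_client_data_list(4))
```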
