-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue64 #66
base: master
Are you sure you want to change the base?
Issue64 #66
Conversation
tests/test_filter_pipeline_v2.py
Outdated
assert 'data' in hfile | ||
d = hfile['data'] | ||
# the point of this test is to ensure we can actually retrieve the compression opts | ||
x = d.compression_opts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an enhancement to the unit test, could you also check the value of x
to be correct?
if d['filter_id'] == GZIP_DEFLATE_FILTER][0] | ||
return gzip_entry['client_data'][0] | ||
key = {0:'client_data_values',1:'client_data'}['client_data' in gzip_entry] | ||
return gzip_entry[key][0] | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the documentation pointed to in your issue, [link], it seems there is only a single list of client data values. Instead of your suggested fix, I think we instead should search-and-replace in the code and never produce any dictionary with "client_data"
as the key. (disclaimer: I haven't actually studied the source code yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see arguments for doing it both ways, the code I've committed is as faithful to what is in the file at point of use, to do different would require hacking the message structure before using it ... I'd defer to the package owners on what is the best option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine, thanks for elaborating!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(At some point you'd have to do what I've got above, insofar as we see files with both compression option structures, we can't control what we will get to have to read.)
I think there is a bug in For version 1 pipeline descriptions, For version 2 pipelines, the values in the client data array are stored in This is incorrect - the values should be stored in If this is fixed then the changes to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend fixing the section of dataobjects.py
that reads the client data from the file into a dict rather than trying to deal with the broken dict after the fact.
Ha... looking back through the history it looks like it might be me that committed the code that I'm suggesting be fixed here. |
diff --git a/pyfive/dataobjects.py b/pyfive/dataobjects.py
index adb19eb..5cdc169 100644
--- a/pyfive/dataobjects.py
+++ b/pyfive/dataobjects.py
@@ -409,7 +409,8 @@ class DataObjects(object):
filter_info['name'] = name
client_values = struct.unpack_from("<{:d}i".format(num_client_values), self.msg_data, offset)
offset += (4 * num_client_values)
- filter_info['client_data_values'] = client_values
+ filter_info['client_data'] = client_values
+ filter_info['client_data_values'] = num_client_values
filters.append(filter_info)
else: |
Your new test passes after the diff above (without the other change in ca1b6d6) |
Storing the number of values in a separate key is redundant, as it could be extracted with |
Oh, I agree it's redundant - I was just trying to keep the dict consistent between v1 and v2 pipelines (in v1, you have to have that key because it's part of the struct message parsed directly from disk, while in v2 we're manually creating the dict) |
Yes, it works for me too! What's the best way forward? Do you want me to fix this in my pull request and go from there? |
This is the simple fix described for #64, includes a test that exposes it, and the one liner to fix it. Works for a number of other data files. Now includes the correct issue number in the commits.