NTNDArray Blosc compression byte order #6
Comments
No, that is not correct. BLOSC_SHUFFLE is an additional operation to improve compression. It is exposed in the NDPluginCodec, and by selecting BLOSC_SHUFFLE or BLOSC_BITSHUFFLE the compression can be greatly improved. NDPluginCodec passes the shuffle argument on compression:
But it is transparent on decompression, because the shuffle is encoded in the byte stream:
I suspect the Blosc compressor assumes the input is native byte order of the host, compresses into a well-defined stream of bytes which is identical whether the host is big-endian or little-endian. The decompressor knows the datatype of what it is decompressing and converts to the native endianness of the machine doing the decompressing. One reason I think this is that the Blosc compressor is widely used for compressing in files like HDF5 which are commonly written and read on machines with different endianness. They need to make it transparent. |
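To make the distinction between a byte shuffle and a byte-order swap concrete, here is a minimal sketch of what a byte shuffle does conceptually: it gathers the first byte of every element, then the second byte of every element, and so on. This is an illustration only, not Blosc's actual implementation (which works block by block and uses SIMD); the class and method names are hypothetical.

```java
// Illustration only: a plain byte shuffle over a whole buffer.
// ByteShuffleSketch, shuffle and unshuffle are hypothetical names,
// not part of Blosc or of any Java Blosc binding.
final class ByteShuffleSketch {

    /** Gathers byte k of every element into one contiguous run. */
    static byte[] shuffle(byte[] src, int typeSize) {
        int count = src.length / typeSize;
        byte[] dst = src.clone(); // a trailing partial element, if any, stays as-is
        for (int i = 0; i < count; i++) {
            for (int k = 0; k < typeSize; k++) {
                dst[k * count + i] = src[i * typeSize + k];
            }
        }
        return dst;
    }

    /** Inverse of shuffle: restores the original element layout. */
    static byte[] unshuffle(byte[] src, int typeSize) {
        int count = src.length / typeSize;
        byte[] dst = src.clone();
        for (int i = 0; i < count; i++) {
            for (int k = 0; k < typeSize; k++) {
                dst[i * typeSize + k] = src[k * count + i];
            }
        }
        return dst;
    }
}
```

Shuffling followed by unshuffling returns exactly the original bytes, so under this model the shuffle by itself does not change the byte order of the elements.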
OK thanks for this information. |
I'm not saying I am certain of this, so it is definitely worth testing. I don't have a big-endian machine to test with, because mine all run vxWorks which does not support Blosc compression. |
We could encode byte order in the |
Does anyone have a big-endian machine we can test with? I am not sure there is really a problem. Here are 2 notes from the Blosc release notes:
https://github.com/Blosc/c-blosc/blob/master/RELEASE_NOTES.rst
Changes from 1.11.2 to 1.11.3
Changes from 0.9.3 to 0.9.4 |
But what does the following mean?
I think that it only means that it handles byte order in its private fields. |
Note that if we have to switch byte order then java.nio.ByteBuffer has methods:
and
|
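As a sketch of how `java.nio.ByteBuffer` could be used here, the example below wraps an already decompressed byte array and reads it as `int` values in the byte order of the machine that produced it, via `ByteBuffer.order(ByteOrder)`. The class name `ByteOrderView` and the method `toInts` are hypothetical, and it assumes the producer's byte order is known by some other means.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: interpret raw bytes as 32-bit ints in a given byte order.
final class ByteOrderView {

    static int[] toInts(byte[] raw, ByteOrder producerOrder) {
        // order(...) only changes how multi-byte values are read; no bytes are copied.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(producerOrder);
        int[] out = new int[raw.length / Integer.BYTES];
        buf.asIntBuffer().get(out);
        return out;
    }
}
```

With this approach the client never rewrites the decompressed bytes; it only tells the buffer how to interpret them.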
The NDPluginCodec supports scalar arrays of all numeric types: int8,uint8,...,int64,uint64.
In all cases the compressed array will have type byte (which is the same as int8).
For all except int8 and uint8, if client and server have different byte order, then byte order must be switched by either client or server.
Let's assume the server compresses and the client decompresses.
Then it should be the client that switches byte order after decompression.
In order to do this the NTNDArray.codec.attribute structure must have fields like:
If these differ, then the client must switch byte order.
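The client-side step described above can be sketched as follows: compare the server's byte order with the one the client wants and, if they differ, reverse the bytes of each element in place after decompression. The names `ClientByteSwap` and `swapIfNeeded` are hypothetical, and how the server's byte order reaches the client (for example through a codec attribute) is an assumption, not an existing NTNDArray field.

```java
import java.nio.ByteOrder;

// Hypothetical client-side helper, applied to the decompressed bytes.
final class ClientByteSwap {

    /** Reverses the bytes of every element when the two byte orders differ. */
    static void swapIfNeeded(byte[] data, int elementSize,
                             ByteOrder serverOrder, ByteOrder clientOrder) {
        if (elementSize <= 1 || serverOrder == clientOrder) {
            return; // int8/uint8 data, or matching byte order: nothing to do
        }
        if (data.length % elementSize != 0) {
            throw new IllegalArgumentException("length is not a multiple of elementSize");
        }
        for (int base = 0; base < data.length; base += elementSize) {
            // Swap bytes pairwise from both ends of the element.
            for (int lo = base, hi = base + elementSize - 1; lo < hi; lo++, hi--) {
                byte tmp = data[lo];
                data[lo] = data[hi];
                data[hi] = tmp;
            }
        }
    }
}
```

Here `elementSize` would be 2 for int16/uint16, 4 for int32/uint32 and 8 for int64/uint64.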
In the C code for blosc there are methods:
and
BLOSC_EXPORT
int blosc_decompress(const void *src, void *dest, size_t destsize);
The doshuffle argument can be one of
I think that BLOSC_SHUFFLE just means switch byte order.
Only compress has an argument to switch byte order.
But we want client to switch the byte order.
There is also a method:
The client can call this if byte order needs to be changed.
But the Java Blosc code does not provide this method.
Thus fir