Incorrect compression (or decompression) for some lengths not divisible by 8 bytes #357

Open
byphilipp opened this issue Jan 11, 2025 · 1 comment

@byphilipp

Describe the bug
Compressing and then decompressing with
filters=[blosc2.Filter.SHUFFLE, blosc2.Filter.BYTEDELTA]
does not round-trip for some input lengths that are not divisible by 8 bytes:
the result is not equivalent to the input, and the last byte differs.

To Reproduce

import blosc2
import numpy as np

dtype = np.float32
for N in range(100, 270):
    x = np.random.rand(N).astype(dtype)
    kw = dict(
        clevel=5,
        # filters=[blosc2.Filter.SHUFFLE],  # correct
        # filters=[blosc2.Filter.BYTEDELTA],  # correct
        # filters=[blosc2.Filter.BYTEDELTA, blosc2.Filter.SHUFFLE],  # correct
        filters=[blosc2.Filter.SHUFFLE, blosc2.Filter.BYTEDELTA],  # incorrect!
    )
    # View the float32 data as raw bytes, round-trip it, then view the result as float32 again.
    raw = np.frombuffer(x, dtype=np.uint8).copy(order='C')
    compressed = blosc2.compress2(raw, codec=blosc2.Codec.ZSTD, **kw)
    y = np.frombuffer(blosc2.decompress2(compressed), dtype=dtype)
    # y = np.frombuffer(np.frombuffer(x, dtype=np.uint8), dtype=dtype)  # baseline without compression
    if y[-1] - x[-1] != 0:
        print(f"{N=} {y[-1]-x[-1]}")

Output:

N=195 -0.1865813434123993
N=223 -0.20511126518249512
N=225 2.6814947806386197e+37
N=227 -0.2120416760444641
N=229 1.0758920518724525e+25
N=235 -0.8137646317481995
N=237 -0.5785623788833618
N=239 -445465753550848.0
N=241 344564.53125
N=247 -6.356997702928093e+27
N=249 6891780.0
N=251 -0.4392470419406891
N=253 -1.2648026543392063e+29
N=257 -101628403712.0
N=259 56037704.0
N=261 -0.7367768287658691
N=263 -0.5903393626213074
N=265 -0.5874314308166504
N=267 -0.1973332017660141
N=269 -0.1740696132183075
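
A quick arithmetic check on the lengths listed above may help narrow this down. The sketch below copies the failing N values from that output and computes each buffer length modulo 8. The 8 is an assumption: it is taken from the "not divisible by 8 bytes" observation in the title and from the guess that blosc2 falls back to a typesize of 8 when none is passed. It only shows what the failing lengths have in common. It does not explain why other odd N values in the same range passed.

```python
# Failing N values copied from the output above; each element is one np.float32.
failing_n = [195, 223, 225, 227, 229, 235, 237, 239, 241, 247,
             249, 251, 253, 257, 259, 261, 263, 265, 267, 269]

ITEMSIZE = 4  # bytes per np.float32 element
TYPESIZE = 8  # assumption: the typesize in effect when none is passed explicitly

# Buffer length in bytes, modulo the assumed typesize, for every failing N.
remainders = {n: (n * ITEMSIZE) % TYPESIZE for n in failing_n}
all_odd = all(n % 2 == 1 for n in failing_n)

print(all_odd)                   # True
print(set(remainders.values()))  # {4}
```

Every failing length is an odd number of float32 values, so each buffer ends with 4 bytes that do not fill a whole 8-byte item. If that reading is right, the trailing partial item is the place to look first.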

Expected behavior
Empty output

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Python 3.11
  • blosc2: tried 3.0.0, 2.7.1, and 2.6.2; the problem appears in all of these versions
@FrancescAlted
Member

I suspect the issue is on the C-Blosc2 side. Would you like to investigate further there?
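
For anyone investigating further, comparing raw bytes rather than the last float makes it easier to see exactly where the round trip diverges. Below is a minimal sketch of such a check. The helpers first_mismatch and check_roundtrip are hypothetical names introduced here, not part of blosc2. The codec is passed in as a pair of callables, and zlib stands in for it so the snippet runs without blosc2 installed.

```python
import zlib


def first_mismatch(original: bytes, restored: bytes):
    """Return the index of the first differing byte, or None if both are equal."""
    for i, (a, b) in enumerate(zip(original, restored)):
        if a != b:
            return i
    if len(original) != len(restored):
        # Common prefix matches, but one buffer is longer than the other.
        return min(len(original), len(restored))
    return None


def check_roundtrip(payload: bytes, compress, decompress):
    """Round-trip payload through the given callables and locate any divergence."""
    return first_mismatch(payload, decompress(compress(payload)))


# 772 bytes: a multiple of 4 but not of 8, like the failing float32 buffers.
payload = bytes(range(256)) * 3 + b"\x01\x02\x03\x04"
print(check_roundtrip(payload, zlib.compress, zlib.decompress))  # None
```

To run it against the reported case, compress could be a lambda that calls blosc2.compress2 with the codec and filters from the reproduction script, and decompress could be blosc2.decompress2. A non-None result would then give the byte offset at which the data first differs.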
