Large packages created by rattler-build fail to be extracted by conda-package-handling or unzip #1147

Open
najose opened this issue Oct 30, 2024 · 4 comments

najose commented Oct 30, 2024

Extracting large packages created by rattler-build seems to be buggy. We tested with a package of around 8 GB, but I suspect the cause is related to zip64 extensions, so anything larger than 4 GB may be affected. We upload these packages to our Quetz server and noticed that it couldn't process them. Looking further, we found that we can't extract the archive with either conda-package-handling (cph) or standard unzip on Linux, which fails with error: invalid zip file with overlapped components (possible zip bomb). Note that unzip does not fail when the package contains a single file of random data, although cph still does. With packages that contain multiple larger files, unzip fails too.
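For context, the classic zip format stores sizes and offsets in 32-bit fields, and the value 0xFFFFFFFF is reserved to mean "look in the zip64 extra field". A minimal Python sketch for listing which members of a package fall into zip64 territory, assuming the .conda file is a zip archive readable by the standard `zipfile` module. The helper name `zip64_members` is hypothetical and not part of rattler-build or cph:

```python
import zipfile

# 0xFFFFFFFF is the sentinel in classic 32-bit zip fields; values at or
# above it have to be stored in the zip64 extra field instead.
ZIP32_SENTINEL = 0xFFFFFFFF


def zip64_members(path):
    """Return names of members whose size or offset needs zip64 (hypothetical helper)."""
    flagged = []
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if (
                info.file_size >= ZIP32_SENTINEL
                or info.compress_size >= ZIP32_SENTINEL
                or info.header_offset >= ZIP32_SENTINEL
            ):
                flagged.append(info.filename)
    return flagged
```

This only reads the central directory, so it reports what the archive records about itself. If the writer mishandled zip64, those recorded values may themselves be wrong.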

Here's a sample recipe with which we were able to reproduce the errors with cph extract/list:

package:
  name: large_package
  version: 1

requirements:
    build:

build:
  number: 1
  noarch: generic
  script:
    - if: unix
      then: |
        # Write around 8GB of random data to a file in PREFIX
        dd if=/dev/urandom of="$PREFIX/random_data" bs=1M count=8192

Here are the different errors we run into depending on what is in the package:

For random data, as in the example recipe above:
cph fails with the error below, but unzip works fine.

$ cph extract <package>.conda
...
<long stack trace >
...
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'info-large_package-1-hb0f4dca_1.tar.zst'
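
The exception type in this traceback is `zipfile.BadZipFile`, so the same check can be reproduced with the Python standard library alone. A minimal sketch, assuming the .conda file opens as a zip archive. `first_bad_member` is a hypothetical helper name:

```python
import zipfile


def first_bad_member(path):
    """Return the name of the first member that fails verification, or None."""
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member in full and checks CRCs and headers,
        # so on a multi-gigabyte package this takes a while.
        return zf.testzip()
```

A return value of `None` means every member passed. Otherwise the returned name is the first member that raised `BadZipFile` while being read.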

For many files, ranging from 300 MB jar files down to files as small as 1 MB:
cph fails with the error below, and unzip also fails with error: invalid zip file with overlapped components (possible zip bomb).

$ cph extract <package>.conda
...
<long stack trace >
...
  File "../lib/python3.12/site-packages/conda_package_handling/streaming.py", line 38, in _stream_components
    raise exceptions.InvalidArchiveError(filename, f"failed with error: {str(e)}") from e
conda_package_handling.exceptions.InvalidArchiveError: Error with archive .<path>/<package>.conda.  You probably need to delete and re-download or re-create this file.  Message was:

failed with error: File name in directory 'info-<package-name>-<package-version>-<build>.tar.zst' and header b'metadata.json' differ.
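
Python's `zipfile` raises this message when the name in a member's local header disagrees with its central directory entry. Here the reader found the header for metadata.json where it expected the info tarball, which is consistent with a wrong recorded offset (an inference from the message, not something confirmed in this thread). A minimal sketch that performs the same comparison for every member, using only the standard library. `check_local_headers` is a hypothetical helper name:

```python
import struct
import zipfile

LOCAL_HEADER_SIG = b"PK\x03\x04"
LOCAL_HEADER_SIZE = 30  # fixed-size part of a zip local file header


def check_local_headers(path):
    """Return (central_name, problem) pairs for members whose local header looks wrong."""
    problems = []
    with zipfile.ZipFile(path) as zf, open(path, "rb") as raw:
        for info in zf.infolist():
            # Jump to where the central directory says this member starts.
            raw.seek(info.header_offset)
            header = raw.read(LOCAL_HEADER_SIZE)
            if len(header) < LOCAL_HEADER_SIZE or header[:4] != LOCAL_HEADER_SIG:
                problems.append((info.filename, "no local header at recorded offset"))
                continue
            # Flags are a little-endian uint16 at offset 6, name length at offset 26.
            (flags,) = struct.unpack_from("<H", header, 6)
            (name_len,) = struct.unpack_from("<H", header, 26)
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw.read(name_len).decode(encoding, errors="replace")
            if local_name != info.orig_filename:
                problems.append((info.filename, f"local header names {local_name!r}"))
    return problems
```

An empty list means every recorded offset points at a local header with the expected name. Unlike a full extraction, this does not read or decompress the member data, so it stays fast on large packages.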

wolfv (Member) commented Oct 30, 2024

Thank you for the great issue + reproducer! Super helpful. Unfortunately I don't know yet what's going on.

wolfv (Member) commented Oct 30, 2024

I think this issue might be related: zip-rs/zip2#248

wolfv (Member) commented Oct 31, 2024

@najose this PR has the patch for zip: #1146

There are builds that you could try. Do you think you could try one of them? You can find the builds at the bottom here: https://github.com/prefix-dev/rattler-build/actions/runs/11607783736

If that fixes it, we can think about shipping with this patched zip version.

najose (Author) commented Nov 1, 2024

@wolfv, thanks for the patched version. We tried it out and can confirm that the fix works for us.
