[zip] Bugfix on zipfiles not having data descriptor after filedata. #11686

Open
wants to merge 1 commit into
base: development

sebbernery

Hello!
I had an issue with zip files generated with 7-Zip on Windows: some files were corrupted once extracted with haxe.zip.Reader.

In the zip file format (here or here), some metadata (CRC32 and file sizes) can be stored either in the local file header, before the file data, or in a data descriptor placed after the compressed data. If bit 3 of the local file header's general purpose flags (mask 0x0008, the "third bit") is 1, a data descriptor follows the data; if it is 0, there is none.
Haxe's zip Reader doesn't decompress files if that flag bit is 0, so the resulting data (supposed to be decompressed) is still the compressed data. This patch fixes this behaviour.
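To make the flag check concrete, here is a minimal sketch of how bit 3 can be read from a local file header. It assumes the standard local file header layout (signature 0x04034b50 at offset 0, general purpose flags at offset 6); the class and function names are hypothetical and are not taken from the patch or from haxe.zip.Reader.

```haxe
import haxe.io.Bytes;

// Hypothetical helper, for illustration only; not the code of the patch.
class FlagCheck {
    // Bit 3 of the general purpose bit flag: a data descriptor follows the data.
    static inline var DATA_DESCRIPTOR_FLAG = 0x0008;
    // Local file header signature "PK\x03\x04", read as a little-endian Int32.
    static inline var LOCAL_HEADER_SIGNATURE = 0x04034b50;

    public static function hasDataDescriptor(flags:Int):Bool {
        return (flags & DATA_DESCRIPTOR_FLAG) != 0;
    }

    // Expects the first bytes of a local file header (at least 8 bytes).
    public static function readLocalHeaderFlags(header:Bytes):Int {
        if (header.getInt32(0) != LOCAL_HEADER_SIGNATURE)
            throw "Not a local file header";
        // Layout: signature (4 bytes), version needed (2 bytes), flags (2 bytes).
        return header.getUInt16(6);
    }

    static function main() {
        // Build a fake 30-byte header with only the signature and flags set.
        var header = Bytes.alloc(30);
        header.setInt32(0, LOCAL_HEADER_SIGNATURE);
        header.setUInt16(6, DATA_DESCRIPTOR_FLAG);
        trace(hasDataDescriptor(readLocalHeaderFlags(header)));
    }
}
```

When the flag is clear, the CRC32 and sizes are already in the local header, which seems to be the case for the entries 7-Zip wrote in the broken archive.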

I've attached two files in case you want to test this issue. The first one works and the second one is broken with the current version of Haxe.
test_zip_roller.zip
test_zip_7zip.zip

Here is a snippet to test the issue:

class Main {
    static public function main() {
        trace("Working archive");
        var zipread = sys.io.File.read("./test_zip_roller.zip", true);
        var zipfile_entries = haxe.zip.Reader.readZip(zipread);
        for (entry in zipfile_entries) {
            trace(entry.fileName, entry.fileSize, entry.crc32);
            trace(entry.data.toString());
        }
        trace("Broken archive");
        var zipread = sys.io.File.read("./test_zip_7zip.zip", true);
        var zipfile_entries = haxe.zip.Reader.readZip(zipread);
        for (entry in zipfile_entries) {
            trace(entry.fileName, entry.fileSize, entry.crc32);
            trace(entry.data.toString());
        }
    }
}

Output without patch:

$ haxe -main Main.hx -python export/out.py && python3 export/out.py
Main.hx:7: Working archive
Main.hx:11: test/, 0
Main.hx:12:
Main.hx:11: test/moretext2.txt, 68
Main.hx:12: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:11: test/salut.txt, 4
Main.hx:12: test
Main.hx:11: moretext.txt, 30
Main.hx:12: SECONDTEXTSECONDTEXTSECONDTEXT
Main.hx:11: moretext2.txt, 68
Main.hx:12: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:11: salut.txt, 4
Main.hx:12: test
Main.hx:14: Broken archive
Main.hx:18: moretext.txt, 30
Main.hx:19: m�!
               �p?� 蟄L�����
Main.hx:18: moretext2.txt, 68
Main.hx:19: �ʱ �]�"H����NY2{ �ή<
Main.hx:18: salut.txt, 4
Main.hx:19: test
Main.hx:18: test/, 0
Main.hx:19: 
Main.hx:18: test/moretext2.txt, 68
Main.hx:19: �ʱ �]�"H����NY2{ �ή<
Main.hx:18: test/salut.txt, 4
Main.hx:19: test

Output with the patch:

Main.hx:7: Working archive
Main.hx:11: test/, 0, 0
Main.hx:12: 
Main.hx:11: test/moretext2.txt, 68, -2888415
Main.hx:12: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:11: test/salut.txt, 4, -662733300
Main.hx:12: test
Main.hx:11: moretext.txt, 30, 55748892
Main.hx:12: SECONDTEXTSECONDTEXTSECONDTEXT
Main.hx:11: moretext2.txt, 68, -2888415
Main.hx:12: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:11: salut.txt, 4, -662733300
Main.hx:12: test
Main.hx:14: Broken archive
Main.hx:18: moretext.txt, 30, 55748892
Main.hx:19: SECONDTEXTSECONDTEXTSECONDTEXT
Main.hx:18: moretext2.txt, 68, -2888415
Main.hx:19: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:18: salut.txt, 4, -662733300
Main.hx:19: test
Main.hx:18: test/, 0, 0
Main.hx:19: 
Main.hx:18: test/moretext2.txt, 68, -2888415
Main.hx:19: MORETEXTMORETEXTMORETEXTMORETEXT
MORETEXTMORETEXTMORETEXTMORETEXT

Main.hx:18: test/salut.txt, 4, -662733300
Main.hx:19: test

The archive contains one file that is stored uncompressed (test/salut.txt).

Have a nice day.

@sebbernery
Author

Sorry, I'm not sure why the CI fails, but while trying to solve the issue I realized I may have misunderstood the API of the zip Reader class: I guess I have to call Reader.unzip() when a file entry is compressed. However, decompression is transparent if there is a data descriptor, and requires a call to unzip() if there isn't one.
I'll look into fixing the CI failure (I guess I should set compressed to false so that haxelib doesn't try to unzip already-decompressed data), but is this a behavior you want to keep? I may be missing some context.
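For reference, the caller-side workaround described above could look like the sketch below. It assumes the unpatched Reader leaves entry.compressed set to true when there is no data descriptor, and that Reader.unzip() is usable on the target (it relies on haxe.zip.Uncompress, whose availability on a given target should be checked). The ReadAll class and entryBytes helper are hypothetical names.

```haxe
import haxe.io.Bytes;
import haxe.zip.Entry;
import haxe.zip.Reader;

// Hypothetical caller-side workaround, not part of the patch.
class ReadAll {
    // Returns the entry's content, inflating it first if it is still compressed.
    public static function entryBytes(entry:Entry):Bytes {
        return entry.compressed ? Reader.unzip(entry) : entry.data;
    }

    static function main() {
        var input = sys.io.File.read("./test_zip_7zip.zip", true);
        var entries = Reader.readZip(input);
        input.close();
        for (entry in entries) {
            trace(entry.fileName, entry.fileSize);
            trace(entryBytes(entry).toString());
        }
    }
}
```

With this approach the same loop handles both archives, whether the Reader has already inflated an entry or not.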

@sebbernery sebbernery force-pushed the development branch 2 times, most recently from 9845b64 to 152238e Compare June 7, 2024 14:56
@sebbernery sebbernery marked this pull request as draft June 7, 2024 15:27
    (Zip files generated with 7zip tool on windows was corrupted when
    read with Haxe)
@sebbernery sebbernery marked this pull request as ready for review June 7, 2024 21:53