Skip to content

Zip async implementation #114421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 40 commits into from
Apr 17, 2025
Merged

Conversation

carlossanlop
Copy link
Contributor

@carlossanlop carlossanlop commented Apr 9, 2025

Fixes #1541

Approved APIs: #1541 (comment)

Each commit is small to help with reviewing.

@edwardneal - our changes will conflict. Ideally your PR should get merged first and I would update mine afterwards.

@ghost

This comment was marked as duplicate.

1 similar comment
@ghost

This comment was marked as resolved.

Copy link
Contributor

@edwardneal edwardneal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @carlossanlop. If time's limited, could we look at #113306 before #114256 though please?

@carlossanlop
Copy link
Contributor Author

If time's limited, could we look at #113306 before #114256 though please?

Looking at them today. I think we can get them all merged before the next snap.

I wanted to ask you @edwardneal: Are you on discord by any chance? Maybe we could communicate faster via chat. Feel free to add me. My username is chayote_jarocho .

Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a brief look - try and find opportunities for sharing non-trivial logic between sync / async.

@edwardneal's suggestion is a good one but also consider factoring synchronous chunks of logic in to shared methods.

If you cannot share the code then consider co-locating it if it must be maintained in parallel.

@carlossanlop carlossanlop marked this pull request as ready for review April 10, 2025 23:02
@carlossanlop
Copy link
Contributor Author

Marked as ready for review because I pushed a ton of tests. I will keep adding some more.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good, I had few quesitons and found only some nits.

I have not compared the sync vs async implementations and since this code contains a lot of if blocks it would make me feel better about approving it if you could add some fuzz tests for it.

A simple fuzz test could look like this:

  • Take the bytes generated by the fuzzer, write them to file.
  • Compress this file with sync API.
  • Compress this file with async API.
  • Ensure both produced the same output.
  • Decompress the output with sync API.
  • Decompress the output with async API.
  • Ensure the result is the same as input.

BTW I am really excited to see those APIs being added in .NET 10. I know users have been waiting for this moment for a long time.

@carlossanlop
Copy link
Contributor Author

carlossanlop commented Apr 15, 2025

Need to fix the merge conflicts with f93aa8a , 1b090a9 and 44c08b8

@carlossanlop carlossanlop force-pushed the ZipAsyncImplementation branch from e7a3a28 to 5ffb3cb Compare April 16, 2025 01:34
Copy link
Contributor

@edwardneal edwardneal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got one final set of comments - mostly optimisations around memory allocation. All of them could be changed in subsequent PRs though, the logic itself looks sound.

@carlossanlop
Copy link
Contributor Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@carlossanlop
Copy link
Contributor Author

@adamsitnik I added fuzz tests.

@carlossanlop carlossanlop added this to the 10.0.0 milestone Apr 16, 2025
@carlossanlop
Copy link
Contributor Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

…s in ZipFile and ZipFileExtensions. Remove redundant switch case.
…o the right location, remove the unnecessary suppression text.
…r the system. Keep one sync and another async instead.
…ome cases due to mixing of static readonly and consts in some block types by setting Signature value to the fixed number 4, which will most likely never change.
@carlossanlop carlossanlop force-pushed the ZipAsyncImplementation branch from 93d9799 to 0b6e6fe Compare April 17, 2025 06:11
… running async they fail with 'Field marshaling is not supported by Invoke: async'.
@carlossanlop
Copy link
Contributor Author

I investigated the failures, they are happening in wasm due to async code not being able to run nicely in single threaded environments. The error is happening in all async tests and says "PlatformNotSupportedException: Cannot wait on monitors for this runtime". I opened a KnownBuildError issue to track this, will work on fixing it tomorrow: #114769

Aside from that, all other failures have been fixed, so this is good to merge.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for addressing my feedback @carlossanlop !


public void FuzzTarget(ReadOnlySpan<byte> bytes)
{

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change


Task.WaitAll(sync_test, async_test);
}
catch (Exception) { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to catch only the exceptions that we expect here? For example this is how we do it for NRBF:

catch (SerializationException) { /* Reading from the stream encountered invalid NRBF data.*/ }
catch (NotSupportedException) { /* Reading from the stream encountered unsupported records */ }
catch (DecoderFallbackException) { /* Reading from the stream encountered an invalid UTF8 sequence. */ }
catch (EndOfStreamException) { /* The end of the stream was reached before reading SerializationRecordType.MessageEnd record. */ }
catch (IOException) { /* An I/O error occurred. */ }

It has helped me to find that the library was throwing undocumented exceptions (part of threat model)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are more things I can improve here besides what you suggest. I'll include such changes in the follow up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add async ZipFile APIs
6 participants