artifact(download): skip non-zip files #1874
17ec324 to 4ec46bb
Added some tests. @robherley Sorry for the ping, but let me know if this needs anything else for review.
  core.info(`Artifact download completed successfully.`)
  return {downloadPath, skipped: false}
} else {
  core.info(`Artifact download skipped.`)
If a user requests a specific artifact to download and that artifact can't be downloaded, why should that silently fail?
Shouldn't that be an error?
I'm reluctant to change the default behavior here as it could be a breaking change for users.
IIUC, the API only expects a zip content type for extraction when used with actions/download-artifact@v4:
.pipe(unzip.Extract({path: directory}))
It currently breaks workflows when artifacts with other content types are downloaded.
I guess we could create an error type for when the content type does not match, so actions/download-artifact can catch it?
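To make the "error type" idea concrete, here is a minimal sketch, assuming a dedicated error class that a caller can recognize with `instanceof`. The names `UnexpectedContentTypeError`, `extractOrThrow`, and `tryDownload` are hypothetical and are not part of the toolkit; the check against `zip` is simplified from the header value reported for artifacts uploaded with actions/upload-artifact@v4.

```typescript
// Hypothetical error type; the name and field are illustrative only.
class UnexpectedContentTypeError extends Error {
  readonly contentType: string

  constructor(contentType: string) {
    super(`Unexpected artifact content type: ${contentType || '(none)'}`)
    this.name = 'UnexpectedContentTypeError'
    this.contentType = contentType
    // Keep instanceof working when compiling to targets older than ES2015.
    Object.setPrototypeOf(this, UnexpectedContentTypeError.prototype)
  }
}

// Stand-in for the extraction step: refuses anything that is not a zip.
// Simplified assumption: the header value is compared against 'zip' only.
function extractOrThrow(contentType: string): void {
  if (contentType.trim().toLowerCase() !== 'zip') {
    throw new UnexpectedContentTypeError(contentType)
  }
}

// Caller side: skip only this specific failure and rethrow everything else,
// so unrelated errors still surface.
function tryDownload(contentType: string): {skipped: boolean} {
  try {
    extractOrThrow(contentType)
    return {skipped: false}
  } catch (err) {
    if (err instanceof UnexpectedContentTypeError) {
      return {skipped: true}
    }
    throw err
  }
}
```

With this shape, the library keeps failing loudly by default, and the decision to skip lives in the caller.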
I thought silently skipping unrelated artifacts not uploaded with actions/upload-artifact would be best so it doesn't require any changes in actions/download-artifact.
Would it be better to use the pattern input to filter out any artifacts that are not created by the upload-artifact action in this case? (credit to @joshmgross for this suggestion as well!)
Maybe an exclude pattern such as !**/*.gzip if all docker uploaded artifacts are gzips?
It is uploaded as a .dockerbuild file: https://github.com/docker/build-push-action/actions/runs/12707375438/job/35422185205#step:6:22
But Azure Blob Storage or the GitHub upload backend somehow enforces a .zip extension when downloaded from the Summary page: https://github.com/docker/build-push-action/actions/runs/12707375438/artifacts/2412083382
$ file docker~build-push-action~QFMCC3.dockerbuild.zip
docker~build-push-action~QFMCC3.dockerbuild.zip: gzip compressed data, original size modulo 2^32 311808
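For reference, the distinction that `file` reports above can be made from the first few bytes of the payload. This is only an illustration: the change in this PR looks at the content-type header, not at magic bytes, and `sniffArchiveKind` and `ArchiveKind` are hypothetical names.

```typescript
type ArchiveKind = 'zip' | 'gzip' | 'unknown'

// Classify a payload by its leading magic bytes.
function sniffArchiveKind(bytes: Uint8Array): ArchiveKind {
  // gzip streams start with the two-byte ID 0x1f 0x8b.
  if (bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b) {
    return 'gzip'
  }
  // zip archives start with "PK" followed by 0x03 0x04 (local file header)
  // or 0x05 0x06 (end of central directory, i.e. an empty archive).
  if (bytes.length >= 4 && bytes[0] === 0x50 && bytes[1] === 0x4b) {
    const third = bytes[2]
    const fourth = bytes[3]
    if ((third === 0x03 && fourth === 0x04) || (third === 0x05 && fourth === 0x06)) {
      return 'zip'
    }
  }
  return 'unknown'
}
```

A file renamed to .zip by the backend would still be classified as gzip here, which matches the `file` output above.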
Or if the artifacts have a common name pattern, it can also be matched? Is there a pattern for docker uploaded artifacts?
I don't think that's reliable. Best is checking the content type imo.
Also, if there are customers who rely on downloading docker uploaded artifacts, we are breaking them either way. Would it be possible to actually upload zips? I understand this may be difficult too.
Zip compression is less efficient than gzip and may result in larger files for the same data. Also, zip does not handle Unix-specific metadata (e.g., permissions, ownership, symbolic links). We could encapsulate our tarball within a zip file, but that sounds hacky. Doing so would also break Docker Desktop users trying to import builds with an unknown format. I defer to @colinhemmings @thompson-shaun.
What's wrong with excluding artifacts that don't have the expected content-type during extraction? If someone else uses the API to upload another content-type, people using actions/download-artifact would have the same issue.
What's wrong with excluding artifacts that don't have the expected content-type during extraction? If someone else uses the API to upload another content-type, people using actions/download-artifact would have the same issue.
I think the issue is that it blurs the line between a real exception and an "expected" exception? Consider the same scenario where a user uses the API to upload, and uses download-artifact to verify the content. Everything could be working until someone accidentally modifies the upload portion to point to the wrong file. With this change, it will be hard for those users to catch the mistake?
The pattern input exists to include and exclude files, and to me it fits the current situation and could solve our issue without any code change? An exclude pattern of !**/*.dockerbuild seems like it could do the job?
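The intended effect of such an exclusion can be sketched as follows. This is not how actions/download-artifact evaluates its pattern input; it is a plain suffix check that only shows which artifacts would remain. `ArtifactRef`, `excludeBySuffix`, and the artifact name `build-output` are hypothetical.

```typescript
// Minimal shape for the illustration; real artifact objects carry more fields.
interface ArtifactRef {
  name: string
}

// Keep only artifacts whose name does not end with the given suffix.
function excludeBySuffix<T extends ArtifactRef>(artifacts: T[], suffix: string): T[] {
  return artifacts.filter(artifact => !artifact.name.endsWith(suffix))
}

const candidates: ArtifactRef[] = [
  {name: 'build-output'}, // hypothetical artifact from actions/upload-artifact
  {name: 'docker~build-push-action~QFMCC3.dockerbuild'}
]

// Only artifacts not ending in .dockerbuild are left for download.
const toDownload = excludeBySuffix(candidates, '.dockerbuild')
```

Note that this only works as long as the artifacts keep a recognizable name, which is the reliability concern raised earlier in the thread.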
I think the issue is it blurs the line between a real exception and an "expected" exception? Consider the same scenario where a user uses API to upload, and uses download-artifact to verify the content.
The API only expects a zip content type for extraction when used with actions/download-artifact@v4:
.pipe(unzip.Extract({path: directory}))
So I'm not sure why this would be expected. I think this is an oversight in the implementation to skip unsupported content-types.
pattern exists to include and exclude files, and to me it fits the current situation and it could solve our issue without any code change? A exclude pattern of !**/*.dockerbuild seems could do the job?
So what you mean is having this pattern set as default in actions/download-artifact?: https://github.com/actions/download-artifact/blob/533298bc57c27f112a2c04a74a04a4d43e2866fd/action.yml#L11-L13
If in the future we change the filename, it would break people using actions/download-artifact. The problem arises from actions/download-artifact downloading all artifacts if no name/pattern is defined.
Maybe I'm missing something, but I think it needs code changes in your toolkit or in actions/download-artifact.
So what you mean is having this pattern set as default in actions/download-artifact?
No, customers must manually update their workflow to specify the exclusion pattern. We need to provide guidance on how to properly configure the download step to skip over docker produced artifacts.
So I'm not sure why this would be expected. I think this is an oversight in the implementation to skip unsupported content-types.
I suspect the disagreement here is fundamentally - "should the 1st party artifact actions be compatible with artifacts created by other actions?"
The current implementation says no - thus users should avoid downloading artifacts that weren't uploaded by actions/upload-artifact.
The behavior change to skip non-zip files does not make them compatible, but it would at least avoid some friction due to incompatibility.
I'm still not convinced skipping these downloads is the right approach though - changing an explicit failure to a silent one could surprise users and increase friction.
relates to
At Docker we are using the API to upload a build export artifact, and we are using the gzip format rather than zip:
When using the actions/download-artifact action, the workflow would fail:
As the download API expects a valid zip content type:
toolkit/packages/artifact/src/internal/download/download-artifact.ts
Line 92 in bb2278e
I think we should just skip downloading artifacts that don't have the expected content-type before extracting them.
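A minimal sketch of that check, assuming the content-type header is available before extraction. The accepted values are an assumption: `zip` is what the logs in this description show for artifacts uploaded with actions/upload-artifact@v4, and `application/zip` is the registered media type. `isZipContentType` and `ZIP_MIME_TYPES` are hypothetical names, not the toolkit's actual code.

```typescript
// Assumed set of header values treated as zip content.
const ZIP_MIME_TYPES = new Set<string>(['zip', 'application/zip'])

// Header values can arrive as a single string, a list, or be absent.
function isZipContentType(headerValue: string | string[] | undefined): boolean {
  const raw = Array.isArray(headerValue) ? headerValue[0] : headerValue
  if (!raw) {
    return false
  }
  // Drop parameters such as "; charset=..." and normalize case.
  const mimeType = (raw.split(';')[0] ?? '').trim().toLowerCase()
  return ZIP_MIME_TYPES.has(mimeType)
}

// Decide whether to extract or skip, before piping into the unzip step.
function shouldSkipExtraction(headerValue: string | string[] | undefined): boolean {
  return !isZipContentType(headerValue)
}
```

Treating a missing header as "not zip" is a design choice in this sketch; a stricter or looser default may suit the real implementation better.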
Can be tested with:
In this workflow we have two files downloaded by the "Download artifacts" step. After adding some logging on response headers, we can see that the regular artifact uploaded with actions/upload-artifact@v4 has zip as its content-type header, but the one uploaded by docker/build-push-action has application/gzip:
Step logs:
I think this change should mitigate this issue by making sure we try to extract a valid zip file. And with the deprecation of v3 on December 5, 2024 and brownouts coming in, we are going to have more reports.
cc @thompson-shaun @colinhemmings @tonistiigi