artifact(download): skip non-zip files #1874

Open
wants to merge 1 commit into
base: main
from

crazy-max

@crazy-max crazy-max commented Nov 13, 2024

relates to

At Docker, we use the API to upload a build export artifact in gzip format rather than zip:

When using the actions/download-artifact action, the workflow fails:

Redirecting to blob download url: <redacted>.zip
Starting download of artifact to: <redacted>
Error: Unable to download artifact(s): Unable to download and extract artifact: Not a valid zip file

This happens because the download API expects zip content and pipes the response straight into the extractor:

.pipe(unzip.Extract({path: directory}))

I think we should just skip downloading artifacts that don't have the expected content-type before extracting them.

Can be tested with:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      -
        name: Checkout
        uses: actions/checkout@v4
      -
        name: Meta
        id: meta
        uses: docker/metadata-action@master
      -
        name: Upload meta bake definition
        uses: actions/upload-artifact@v4
        with:
          name: bake-meta
          path: ${{ steps.meta.outputs.bake-file }}
          if-no-files-found: error
          retention-days: 1
      -
        name: Build
        uses: docker/build-push-action@master
        with:
          context: .

  post:
    runs-on: ubuntu-latest
    needs: build
    steps:
      -
        name: Download artifacts
        uses: crazy-max/download-artifact@test-skip-non-zip

In this workflow, two files are downloaded by the "Download artifacts" step. After adding some logging of the response headers, we can see that the regular artifact uploaded with actions/upload-artifact@v4 has zip as its content-type header, while the one uploaded by docker/build-push-action has application/gzip:

{
  "content-length": "5572",
  "content-type": "application/gzip",
  "content-md5": "yPIHPOPuYDEHs/vabwyt6A==",
  "last-modified": "Mon, 01 Jul 2024 09:22:34 GMT",
  "accept-ranges": "bytes",
  "etag": "\"0x8DC99AF5777FC1C\"",
  "server": "Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0",
  "x-ms-request-id": "b45769a4-301e-00cf-5498-cb2dab000000",
  "x-ms-version": "2023-11-03",
  "x-ms-creation-time": "Mon, 01 Jul 2024 09:22:34 GMT",
  "x-ms-lease-status": "unlocked",
  "x-ms-lease-state": "available",
  "x-ms-blob-type": "BlockBlob",
  "content-disposition": "attachment; filename=\"docker~test-docker-action~44M6YV.dockerbuild\"",
  "x-ms-server-encrypted": "true",
  "access-control-expose-headers": "x-ms-request-id,Server,x-ms-version,Content-Type,Last-Modified,ETag,x-ms-creation-time,Content-MD5,x-ms-lease-status,x-ms-lease-state,x-ms-blob-type,Content-Disposition,x-ms-server-encrypted,Accept-Ranges,Content-Length,Date,Transfer-Encoding",
  "access-control-allow-origin": "*",
  "date": "Mon, 01 Jul 2024 09:22:47 GMT"
}
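
Given the two content-type values observed above, the proposed check could be sketched roughly as follows. This is an illustration, not the code in the diff: `isZipContentType` is a hypothetical helper, and the list of accepted values is an assumption (the `zip` value reported for the actions/upload-artifact@v4 blob, plus common zip media types).

```typescript
// Hypothetical helper, not the toolkit's actual implementation.
// The accepted values are an assumption: "zip" is what the upload-artifact
// blob reports in the headers here; the others are common zip media types.
const ZIP_CONTENT_TYPES: string[] = [
  'zip',
  'application/zip',
  'application/x-zip-compressed',
  'application/zip-compressed'
]

function isZipContentType(contentType: string | undefined): boolean {
  if (!contentType) {
    return false
  }
  // Drop parameters such as "; charset=..." and normalize case.
  const mediaType = contentType.split(';')[0].trim().toLowerCase()
  return ZIP_CONTENT_TYPES.indexOf(mediaType) !== -1
}
```

With this sketch, `isZipContentType('zip')` is true for the bake-meta artifact and `isZipContentType('application/gzip')` is false for the .dockerbuild one, so only the former would reach the extraction step.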

Step logs:

Preparing to download the following artifacts:
- docker~test-docker-action~BGHOTQ.dockerbuild (ID: 1654731091, Size: 5567)
- bake-meta (ID: 1654730707, Size: 497)
##[debug]Artifact destination folder does not exist, creating: /home/runner/work/test-docker-action/test-docker-action/docker~test-docker-action~BGHOTQ.dockerbuild
##[debug]Artifact destination folder does not exist, creating: /home/runner/work/test-docker-action/test-docker-action/bake-meta
##[debug]Workflow Run Backend ID: 0cc20625-3561-4440-9048-70024d8ad258
##[debug]Workflow Job Run Backend ID: 937ea504-f21e-52f6-d164-c808765d698a
##[debug][Request] ListArtifacts https://results-receiver.actions.githubusercontent.com/twirp/github.actions.results.api.v1.ArtifactService/ListArtifacts
##[debug]Workflow Run Backend ID: 0cc20625-3561-4440-9048-70024d8ad258
##[debug]Workflow Job Run Backend ID: 937ea504-f21e-52f6-d164-c808765d698a
##[debug][Request] ListArtifacts https://results-receiver.actions.githubusercontent.com/twirp/github.actions.results.api.v1.ArtifactService/ListArtifacts
##[debug][Response] - 200
##[debug]Headers: {
##[debug]  "content-length": "282",
##[debug]  "content-type": "application/json",
##[debug]  "date": "Mon, 01 Jul 2024 09:36:54 GMT",
##[debug]  "x-github-backend": "Kubernetes",
##[debug]  "x-github-request-id": "E00A:2A89D3:1D45CA:25322C:668278B6"
##[debug]}
##[debug]Body: {
##[debug]  "artifacts": [
##[debug]    {
##[debug]      "workflow_run_backend_id": "0cc20625-3561-4440-9048-70024d8ad258",
##[debug]      "workflow_job_run_backend_id": "ca395085-040a-526b-2ce8-bdc85f692774",
##[debug]      "database_id": "1654731091",
##[debug]      "name": "docker~test-docker-action~BGHOTQ.dockerbuild",
##[debug]      "size": "5567",
##[debug]      "created_at": "2024-07-01T09:36:43Z"
##[debug]    }
##[debug]  ]
##[debug]}
##[debug][Request] GetSignedArtifactURL https://results-receiver.actions.githubusercontent.com/twirp/github.actions.results.api.v1.ArtifactService/GetSignedArtifactURL
##[debug][Response] - 200
##[debug]Headers: {
##[debug]  "content-length": "560",
##[debug]  "content-type": "application/json",
##[debug]  "date": "Mon, 01 Jul 2024 09:36:54 GMT",
##[debug]  "x-github-backend": "Kubernetes",
##[debug]  "x-github-request-id": "E00A:2A89D3:1D45DB:253240:668278B6"
##[debug]}
##[debug]Body: {
##[debug]  "signed_url": "https://productionresultssa10.blob.core.windows.net/actions-results/0cc20625-3561-4440-9048-70024d8ad258/workflow-job-run-ca395085-040a-526b-2ce8-bdc85f692774/artifacts/771ba7777401e8a24ea0b5dc7d95a3da79ee6e928254cc46f0f13e40846c0ab1.zip?se=2024-07-01T09%3A46%3A54Z&sig=B9B9ehrtCs66uPrB9jHHRXfHsr6n0g8hnLS1TuwBTtc%3D&ske=2024-07-01T19%3A00%3A39Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-07-01T07%3A00%3A39Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2023-11-03&sp=r&spr=https&sr=b&st=2024-07-01T09%3A36%3A49Z&sv=2023-11-03"
##[debug]}
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/0cc20625-3561-4440-9048-70024d8ad258/workflow-job-run-ca395085-040a-526b-2ce8-bdc85f692774/artifacts/771ba7777401e8a24ea0b5dc7d95a3da79ee6e928254cc46f0f13e40846c0ab1.zip
Starting download of artifact to: /home/runner/work/test-docker-action/test-docker-action/docker~test-docker-action~BGHOTQ.dockerbuild
##[debug][Response] - 200
##[debug]Headers: {
##[debug]  "content-length": "246",
##[debug]  "content-type": "application/json",
##[debug]  "date": "Mon, 01 Jul 2024 09:36:54 GMT",
##[debug]  "x-github-backend": "Kubernetes",
##[debug]  "x-github-request-id": "E00B:1FBE19:1D1E16:24FD99:668278B6"
##[debug]}
##[debug]Body: {
##[debug]  "artifacts": [
##[debug]    {
##[debug]      "workflow_run_backend_id": "0cc20625-3561-4440-9048-70024d8ad258",
##[debug]      "workflow_job_run_backend_id": "ca395085-040a-526b-2ce8-bdc85f692774",
##[debug]      "database_id": "1654730707",
##[debug]      "name": "bake-meta",
##[debug]      "size": "497",
##[debug]      "created_at": "2024-07-01T09:36:37Z"
##[debug]    }
##[debug]  ]
##[debug]}
##[debug][Request] GetSignedArtifactURL https://results-receiver.actions.githubusercontent.com/twirp/github.actions.results.api.v1.ArtifactService/GetSignedArtifactURL
##[debug][Response] - 200
##[debug]Headers: {
##[debug]  "content-length": "562",
##[debug]  "content-type": "application/json",
##[debug]  "date": "Mon, 01 Jul 2024 09:36:54 GMT",
##[debug]  "x-github-backend": "Kubernetes",
##[debug]  "x-github-request-id": "E00B:1FBE19:1D1E23:24FDAA:668278B6"
##[debug]}
##[debug]Body: {
##[debug]  "signed_url": "https://productionresultssa10.blob.core.windows.net/actions-results/0cc20625-3561-4440-9048-70024d8ad258/workflow-job-run-ca395085-040a-526b-2ce8-bdc85f692774/artifacts/906bf0728887597ba91b16b1778f41cea66aa49961106539f0c04e0b11d3abd5.zip?se=2024-07-01T09%3A46%3A54Z&sig=ArW2G%2BWxAGsJgNZB2X2kZhBt2RmgSbTSwR6atQG4Hwo%3D&ske=2024-07-01T20%3A16%3A40Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-07-01T08%3A16%3A40Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2023-11-03&sp=r&spr=https&sr=b&st=2024-07-01T09%3A36%3A49Z&sv=2023-11-03"
##[debug]}
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/0cc20625-3561-4440-9048-70024d8ad258/workflow-job-run-ca395085-040a-526b-2ce8-bdc85f692774/artifacts/906bf0728887597ba91b16b1778f41cea66aa49961106539f0c04e0b11d3abd5.zip
Starting download of artifact to: /home/runner/work/test-docker-action/test-docker-action/bake-meta
##[debug]response.message.headers: {"content-length":"5567","content-type":"application/gzip","content-md5":"s35vhwNK24ehlOSl2bTFUA==","last-modified":"Mon, 01 Jul 2024 09:36:43 GMT","accept-ranges":"bytes","etag":"\"0x8DC99B15171B60C\"","server":"Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0","x-ms-request-id":"e9c547ca-a01e-0012-649a-cb4aa8000000","x-ms-version":"2023-11-03","x-ms-creation-time":"Mon, 01 Jul 2024 09:36:43 GMT","x-ms-lease-status":"unlocked","x-ms-lease-state":"available","x-ms-blob-type":"BlockBlob","content-disposition":"attachment; filename=\"docker~test-docker-action~BGHOTQ.dockerbuild\"","x-ms-server-encrypted":"true","access-control-expose-headers":"x-ms-request-id,Server,x-ms-version,Content-Type,Last-Modified,ETag,x-ms-creation-time,Content-MD5,x-ms-lease-status,x-ms-lease-state,x-ms-blob-type,Content-Disposition,x-ms-server-encrypted,Accept-Ranges,Content-Length,Date,Transfer-Encoding","access-control-allow-origin":"*","date":"Mon, 01 Jul 2024 09:36:53 GMT"}
##[debug]Invalid content-type: application/gzip, skipping download
Artifact download completed successfully.
##[debug]response.message.headers: {"content-length":"497","content-type":"zip","last-modified":"Mon, 01 Jul 2024 09:36:37 GMT","accept-ranges":"bytes","etag":"\"0x8DC99B14DA08F62\"","server":"Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0","x-ms-request-id":"63899c09-b01e-0021-749a-cb1503000000","x-ms-version":"2023-11-03","x-ms-creation-time":"Mon, 01 Jul 2024 09:36:37 GMT","x-ms-lease-status":"unlocked","x-ms-lease-state":"available","x-ms-blob-type":"BlockBlob","x-ms-server-encrypted":"true","access-control-expose-headers":"x-ms-request-id,Server,x-ms-version,Content-Type,Last-Modified,ETag,x-ms-creation-time,x-ms-lease-status,x-ms-lease-state,x-ms-blob-type,x-ms-server-encrypted,Accept-Ranges,Content-Length,Date,Transfer-Encoding","access-control-allow-origin":"*","date":"Mon, 01 Jul 2024 09:36:54 GMT"}
(node:1468) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Artifact download completed successfully.
Total of 2 artifact(s) downloaded
Download artifact has finished successfully
##[debug]Node Action run completed with exit code 0
##[debug]Set output download-path = /home/runner/work/test-docker-action/test-docker-action
##[debug]Finishing: Download artifacts

I think this change should mitigate the issue by making sure we only try to extract valid zip files. With the deprecation of v3 on December 5, 2024 and brownouts coming, we are going to get more reports.

cc @thompson-shaun @colinhemmings @tonistiigi

@crazy-max
Author

Added some tests

@robherley Sorry for the ping but let me know if this needs anything else for review.

  core.info(`Artifact download completed successfully.`)
  return {downloadPath, skipped: false}
} else {
  core.info(`Artifact download skipped.`)
Member

If a user requests a specific artifact to download and that artifact can't be downloaded, why should that silently fail?

Shouldn't that be an error?

Member

I'm reluctant to change the default behavior here as it could be a breaking change for users.

Author

@crazy-max crazy-max Nov 14, 2024

IIUC, the API only expects a zip content type for extraction when used with actions/download-artifact@v4:

.pipe(unzip.Extract({path: directory}))

And it currently breaks workflows when artifacts with other content types are downloaded.

I guess we could create an error type for when the content type does not match, so actions/download-artifact can catch it?
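
A dedicated error type along those lines might look like the sketch below. Everything here is hypothetical: `UnsupportedContentTypeError`, the guard, and the stand-in `assertExtractable` and `tryDownload` functions are illustrative names and not part of the toolkit or of actions/download-artifact.

```typescript
// Hypothetical error type; neither the name nor the shape comes from the toolkit.
class UnsupportedContentTypeError extends Error {
  readonly contentType: string

  constructor(contentType: string) {
    super(`Unsupported artifact content type: ${contentType}`)
    this.name = 'UnsupportedContentTypeError'
    this.contentType = contentType
  }
}

// Checking `name` rather than `instanceof` keeps the guard working even when
// the class is compiled down to ES5, where subclassing Error loses the prototype.
function isUnsupportedContentTypeError(
  err: unknown
): err is UnsupportedContentTypeError {
  return err instanceof Error && err.name === 'UnsupportedContentTypeError'
}

// Stand-in for the extraction step: rejects anything that is not zip.
function assertExtractable(contentType: string): void {
  if (contentType !== 'zip' && contentType !== 'application/zip') {
    throw new UnsupportedContentTypeError(contentType)
  }
}

// Caller side: decide what an unsupported type means instead of crashing
// on a generic "Not a valid zip file" error.
function tryDownload(contentType: string): 'extracted' | 'unsupported' {
  try {
    assertExtractable(contentType)
    return 'extracted'
  } catch (err) {
    if (isUnsupportedContentTypeError(err)) {
      return 'unsupported'
    }
    throw err
  }
}
```

The point of the separate type is that the caller, rather than the library, chooses whether an unsupported content type is fatal or just reported.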

Author

I thought silently skipping unrelated artifacts not uploaded with actions/upload-artifact would be best so it doesn't require any changes in actions/download-artifact.

Contributor

Would it be better to use the pattern input to filter out any artifacts that were not created by the upload-artifact action in this case? (credit to @joshmgross for this suggestion as well!)

Author

@crazy-max crazy-max Jan 10, 2025

@yacaovsnc

> Maybe an exclude pattern such as !**/*.gzip if all Docker-uploaded artifacts are gzips?

It is uploaded as a .dockerbuild file: https://github.com/docker/build-push-action/actions/runs/12707375438/job/35422185205#step:6:22

But Azure Blob Storage or the GitHub upload backend somehow enforces a .zip extension when it is downloaded from the Summary page: https://github.com/docker/build-push-action/actions/runs/12707375438/artifacts/2412083382

$ file docker~build-push-action~QFMCC3.dockerbuild.zip 
docker~build-push-action~QFMCC3.dockerbuild.zip: gzip compressed data, original size modulo 2^32 311808
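
To illustrate why the extension is misleading here, the format can be identified from the leading bytes of the file, which is the kind of check `file` relies on. A minimal sketch; `sniffArchiveFormat` is a hypothetical name and not something the toolkit provides:

```typescript
type ArchiveFormat = 'zip' | 'gzip' | 'unknown'

// Identify the format from the leading magic bytes, ignoring the file name.
function sniffArchiveFormat(bytes: Uint8Array): ArchiveFormat {
  // gzip streams start with 0x1f 0x8b (RFC 1952).
  if (bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b) {
    return 'gzip'
  }
  // zip local file headers start with "PK\x03\x04"; an empty archive
  // starts with the end-of-central-directory signature "PK\x05\x06".
  if (bytes.length >= 4 && bytes[0] === 0x50 && bytes[1] === 0x4b) {
    const isLocalHeader = bytes[2] === 0x03 && bytes[3] === 0x04
    const isEmptyArchive = bytes[2] === 0x05 && bytes[3] === 0x06
    if (isLocalHeader || isEmptyArchive) {
      return 'zip'
    }
  }
  return 'unknown'
}
```

A file named `.dockerbuild.zip` whose first two bytes are 0x1f 0x8b would be classified as gzip by this function, regardless of its name.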

> Or if the artifacts have a common name pattern, it can also be matched? Is there a pattern for Docker-uploaded artifacts?

I don't think that's reliable. The best option is checking the content type, IMO.

> Also, if there are customers who rely on downloading Docker-uploaded artifacts, we are breaking them either way. Would it be possible to actually upload zips? I understand this may be difficult too.

Zip tends to produce larger files than a gzipped tarball for the same data, because each entry is compressed separately. Zip also does not reliably preserve Unix-specific metadata (e.g., permissions, ownership, symbolic links). We could encapsulate our tarball within a zip file, but that sounds hacky, and doing so would break Docker Desktop users trying to import builds in an unknown format. I defer to @colinhemmings @thompson-shaun.

What's wrong with excluding artifacts that don't have the expected content type during extraction? If someone else uses the API to upload another content type, people using actions/download-artifact would have the same issue.

Contributor

> What's wrong with excluding artifacts that don't have the expected content type during extraction? If someone else uses the API to upload another content type, people using actions/download-artifact would have the same issue.

I think the issue is that it blurs the line between a real exception and an "expected" exception? Consider the same scenario where a user uses the API to upload, and uses download-artifact to verify the content. Everything could be working until someone accidentally modifies the upload portion to point to the wrong file; with this change, it will be hard for those users to catch the mistake.

The pattern input exists to include and exclude files; to me it fits the current situation and could solve our issue without any code change. An exclude pattern of !**/*.dockerbuild seems like it could do the job?
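
As a rough illustration of what such an exclusion achieves, the sketch below drops artifacts whose name ends in .dockerbuild before anything is downloaded. It uses a plain regular expression rather than the glob matching behind the pattern input, and `ArtifactInfo` and `excludeDockerBuilds` are hypothetical names:

```typescript
// Hypothetical shape; only the name matters for this sketch.
interface ArtifactInfo {
  name: string
  size: number
}

// Keep every artifact whose name does not end in ".dockerbuild".
function excludeDockerBuilds(artifacts: ArtifactInfo[]): ArtifactInfo[] {
  return artifacts.filter(a => !/\.dockerbuild$/.test(a.name))
}

// Names and sizes taken from the step logs in this PR.
const listed: ArtifactInfo[] = [
  {name: 'docker~test-docker-action~BGHOTQ.dockerbuild', size: 5567},
  {name: 'bake-meta', size: 497}
]
const toDownload = excludeDockerBuilds(listed)
```

Here `toDownload` contains only the bake-meta artifact. A filter like this is only as reliable as the naming convention it depends on.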

Author

@crazy-max crazy-max Jan 10, 2025

> I think the issue is that it blurs the line between a real exception and an "expected" exception? Consider the same scenario where a user uses the API to upload, and uses download-artifact to verify the content.

The API only expects a zip content type for extraction when used with actions/download-artifact@v4:

.pipe(unzip.Extract({path: directory}))

So I'm not sure why this would be expected. I think failing to skip unsupported content types is an oversight in the implementation.

> The pattern input exists to include and exclude files; to me it fits the current situation and could solve our issue without any code change. An exclude pattern of !**/*.dockerbuild seems like it could do the job?

So what you mean is having this pattern set as the default in actions/download-artifact? https://github.com/actions/download-artifact/blob/533298bc57c27f112a2c04a74a04a4d43e2866fd/action.yml#L11-L13

If we change the filename in the future, it would break people using actions/download-artifact. The problem arises from actions/download-artifact downloading all artifacts if no name/pattern is defined.

Maybe I'm missing something, but I think it needs code changes in your toolkit or in actions/download-artifact.

Contributor

> So what you mean is having this pattern set as the default in actions/download-artifact?

No, customers must manually update their workflow to specify the exclusion pattern. We need to provide guidance on how to properly configure the download step to skip over Docker-produced artifacts.

Member

> So I'm not sure why this would be expected. I think failing to skip unsupported content types is an oversight in the implementation.

I suspect the disagreement here is fundamentally - "should the 1st party artifact actions be compatible with artifacts created by other actions?"

The current implementation says no - thus users should avoid downloading artifacts that weren't uploaded by actions/upload-artifact.

The behavior change to skip non-zip files does not make them compatible, but it would at least avoid some friction due to incompatibility.

I'm still not convinced skipping these downloads is the right approach though - changing an explicit failure to a silent one could surprise users and increase friction.

3 participants