New image for ubuntu-latest docker issues #10646

Closed
2 of 13 tasks
BBTristanBenschop opened this issue Sep 19, 2024 · 21 comments

Comments

@BBTristanBenschop

Description

Since last night, starting a Docker container in our DevOps pipeline fails with this error:

Docker.DotNet.DockerApiException : Docker API responded with status code=Conflict, response={"message":"container f111f68d35dc63185d89a5d93600a4af7fdb01513d9e17f873aa400c3dc0c6da is not running"}

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • Ubuntu 24.04
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • Windows Server 2019
  • Windows Server 2022

Image version and build link

We are using ubuntu-latest

Agent name: 'Hosted Agent'
Agent machine name: 'fv-az366-67'
Current agent version: '3.243.1'
Operating System
Runner Image
Runner Image Provisioner
Current image version: '20240915.1.0'
Agent running as: 'vsts'

The last working version is:

Agent name: 'Azure Pipelines 2'
Agent machine name: 'fv-az634-412'
Current agent version: '3.243.1'
Operating System
Runner Image
Runner Image Provisioner
Current image version: '20240908.1.0'
Agent running as: 'vsts'

Is it regression?

Yes. The last working image version is '20240908.1.0'.

Expected behavior

The Docker container starts up properly so that our tests can be run inside it.

Actual behavior

The Docker container fails to start.

Repro steps

Start a docker container

@SpaceOgre

We are seeing this problem as well. We also tested on Ubuntu 24.04 and hit the same problem there.

@josefinbrandt

We also have this problem.

@kishorekumar-anchala
Contributor

Hi @BBTristanBenschop,

Thank you for reporting this. We are looking into it and will update you once we have investigated.

@BBTristanBenschop
Author

A temporary fix for us is switching to the ubuntu-20.04 image, where this issue doesn't exist.
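
For an Azure Pipelines definition, the switch could look like the sketch below. It assumes the pipeline selects its hosted agent through `pool.vmImage`, as in the YAML file posted in this thread. Only the image label changes.

```yaml
# Sketch: pin the hosted agent to the older image instead of ubuntu-latest.
pool:
  vmImage: ubuntu-20.04
```

This only sidesteps the newer image, so it is a stopgap rather than a fix.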

@kishorekumar-anchala
Contributor

Hi @SpaceOgre @BBTristanBenschop,
Please share the following:

  1. The specific DevOps pipeline configuration related to starting the Docker container.
  2. The exact command or script used to start the Docker container.

@SpaceOgre

@kishorekumar-anchala

YAML file

# ASP.NET Core
# Build and test ASP.NET Core projects targeting .NET Core.
# Add steps that run tests, create a NuGet package, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/dotnet-core

trigger:
  - main

pool:
  vmImage: ubuntu-latest

variables:
  buildConfiguration: "Release"

steps:
  - task: UseDotNet@2
    displayName: "Use .NET 8 sdk"
    inputs:
      packageType: "sdk"
      version: "8.0.x"

  # This is needed for the dotnet tool install commands to work, the dotnet restore command should work without it but I keep it at the top just in case.
  - task: NuGetAuthenticate@1
    displayName: "Authenticate with NuGet"

  - task: DotNetCoreCLI@2
    displayName: dotnet restore
    inputs:
      command: restore
      projects: '**/*.csproj'
      feedRestore: GR.Library

  - task: DotNetCoreCLI@2
    inputs:
      command: custom
      custom: format
      arguments: "--verify-no-changes --verbosity diagnostic"
    displayName: Check formatting

  - task: DotNetCoreCLI@2
    displayName: "dotnet build $(buildConfiguration)"
    inputs:
      command: build
      projects: "**/*.csproj"
      arguments: "--configuration $(buildConfiguration)"

  - task: DotNetCoreCLI@2
    displayName: Dotnet test
    inputs:
      command: "test"
      projects: "tests/**/*.csproj"
      publishTestResults: true
      arguments: '--configuration $(buildConfiguration) --collect:"Code Coverage" --settings:devops/CodeCoverage.runsettings'

  - task: DotNetCoreCLI@2
    displayName: "Install dotnet-coverage"
    inputs:
      command: custom
      custom: tool
      arguments: "install --global dotnet-coverage"

  - task: DotNetCoreCLI@2
    displayName: "Install ReportGenerator"
    inputs:
      command: custom
      custom: tool
      arguments: "install --global dotnet-reportgenerator-globaltool"

  # This step is needed for the reportgenerator to work, since we use the Code Coverage collect during tests and the reportgenerator does not support it.
  # It is done like this so we can get code coverage results in Pull Request and get a full report to download and look at if needed.
  - script: dotnet-coverage merge -r -f cobertura -o merged.cobertura.xml $(Agent.WorkFolder)/*.coverage
    displayName: Merge code coverage files

  - script: reportgenerator -reports:merged.cobertura.xml -targetdir:$(Build.SourcesDirectory)/CodeCoverage -reporttypes:'HtmlInline' -classfilters:+GR.PRIIS.*
    displayName: Create Html Report for Code Coverage

  - task: PublishBuildArtifacts@1
    displayName: "Publish code coverage html report as artifact"
    inputs:
      PathtoPublish: "$(Build.SourcesDirectory)/CodeCoverage"
      ArtifactName: "CodeCoverage"
      publishLocation: "Container"

Docker part

We start Docker from within the "Dotnet test" task, using TestContainers for .NET: https://github.com/testcontainers/testcontainers-dotnet

@robinbaxon

robinbaxon commented Sep 19, 2024

Can confirm that we experience a similar issue on our side as well, related to our TestContainers usage.
We are running our workflows on GitHub, for GitHub repositories, not as pipelines in Azure DevOps.

We reverted our workflows to use ubuntu-20.04 (which gave us the ubuntu-20.04.6 revision of the image) and that works for us for most of our workflows. Thanks for the tip @BBTristanBenschop.

@BBTristanBenschop
Author

@kishorekumar-anchala we use a setup very similar to SpaceOgre's, including the use of TestContainers. It fails when running the tests that make use of the TestContainers library.

@SpaceOgre

SpaceOgre commented Sep 19, 2024

@kishorekumar-anchala Adding some more context from stdout:

Failing in 24.04

[xUnit.net 00:00:00.45]   Starting:    GR.PRIIS.API.IntegrationTests
[testcontainers.org 00:00:00.13] Connected to Docker:
  Host: unix:///var/run/docker.sock
  Server Version: 26.1.3
  Kernel Version: 6.8.0-1014-azure
  API Version: 1.45
  Operating System: Ubuntu 24.04.1 LTS
  Total Memory: 6.77 GB
[testcontainers.org 00:00:00.23] Searching Docker registry credential in Auths
[testcontainers.org 00:00:00.23] Docker registry credential https://index.docker.io/v1/ found
[testcontainers.org 00:00:00.87] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:00.87] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:03.20] Docker image testcontainers/ryuk:0.6.0 created
[testcontainers.org 00:00:03.30] Docker container 0935fb8ac4c0 created
[testcontainers.org 00:00:03.37] Start Docker container 0935fb8ac4c0
[testcontainers.org 00:00:04.13] Wait for Docker container 0935fb8ac4c0 to complete readiness checks
[testcontainers.org 00:00:04.14] Docker container 0935fb8ac4c0 ready
[testcontainers.org 00:00:04.15] Searching Docker registry credential in Auths
[testcontainers.org 00:00:04.15] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:04.15] Searching Docker registry credential in Auths
[testcontainers.org 00:00:04.15] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:04.15] Docker registry credential mcr.microsoft.com not found
[testcontainers.org 00:00:24.72] Docker image mcr.microsoft.com/mssql/server:2019-CU18-ubuntu-20.04 created
[testcontainers.org 00:00:24.74] Docker container 0661ae8d376a created
[testcontainers.org 00:00:24.75] Start Docker container 0661ae8d376a
[testcontainers.org 00:00:25.01] Wait for Docker container 0661ae8d376a to complete readiness checks
[testcontainers.org 00:00:25.02] Execute "/bin/sh -c find /opt/mssql-tools*/bin/sqlcmd -type f -print -quit" at Docker container 0661ae8d376a
[testcontainers.org 00:00:25.17] Execute "/opt/mssql-tools/bin/sqlcmd -C -Q SELECT 1;" at Docker container 0661ae8d376a
[testcontainers.org 00:00:31.45] Execute "/opt/mssql-tools/bin/sqlcmd -C -Q SELECT 1;" at Docker container 0661ae8d376a
[xUnit.net 00:00:32.16]       Docker.DotNet.DockerApiException : Docker API responded with status code=Conflict, response={"message":"container 0661ae8d376a18e1b335b257c18323bc58a990b1507a2bac423dc778385c179e is not running"}

How it looks when it works in 20.04

[testcontainers.org 00:00:00.09] Connected to Docker:
  Host: unix:///var/run/docker.sock
  Server Version: 26.1.3
  Kernel Version: 5.15.0-1071-azure
  API Version: 1.45
  Operating System: Ubuntu 20.04.6 LTS
  Total Memory: 6.77 GB
[testcontainers.org 00:00:00.15] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:00.15] Searching Docker registry credential in Auths
[testcontainers.org 00:00:00.16] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:00.16] Docker registry credential https://index.docker.io/v1/ found
[testcontainers.org 00:00:02.50] Docker image testcontainers/ryuk:0.6.0 created
[testcontainers.org 00:00:02.60] Docker container 0c5851659281 created
[testcontainers.org 00:00:02.66] Start Docker container 0c5851659281
[testcontainers.org 00:00:03.05] Wait for Docker container 0c5851659281 to complete readiness checks
[testcontainers.org 00:00:03.05] Docker container 0c5851659281 ready
[testcontainers.org 00:00:03.07] Searching Docker registry credential in Auths
[testcontainers.org 00:00:03.07] Searching Docker registry credential in CredHelpers
[testcontainers.org 00:00:03.07] Searching Docker registry credential in CredsStore
[testcontainers.org 00:00:03.07] Searching Docker registry credential in Auths
[testcontainers.org 00:00:03.07] Docker registry credential mcr.microsoft.com not found
[testcontainers.org 00:00:21.09] Docker image mcr.microsoft.com/mssql/server:2019-CU18-ubuntu-20.04 created
[testcontainers.org 00:00:21.14] Docker container 551804a10cff created
[testcontainers.org 00:00:21.14] Start Docker container 551804a10cff
[testcontainers.org 00:00:21.40] Wait for Docker container 551804a10cff to complete readiness checks
[testcontainers.org 00:00:21.41] Execute "/bin/sh -c find /opt/mssql-tools*/bin/sqlcmd -type f -print -quit" at Docker container 551804a10cff
[testcontainers.org 00:00:21.51] Execute "/opt/mssql-tools/bin/sqlcmd -C -Q SELECT 1;" at Docker container 551804a10cff
[testcontainers.org 00:00:27.12] Execute "/opt/mssql-tools/bin/sqlcmd -C -Q SELECT 1;" at Docker container 551804a10cff
[testcontainers.org 00:00:27.21] Docker container 551804a10cff ready

@kwuite

kwuite commented Sep 19, 2024

A temporary fix for us is switching to the ubuntu-20.04 image, where this issue doesn't exist.

Our mssql container failed in CI/CD with really odd messages like:

This program has encountered a fatal error and cannot continue running at Thu Sep 19 08:41:27 2024
The following diagnostic information is available:

         Reason: 0x00000001
         Signal: SIGABRT - Aborted (6)
          Stack:
                 IP               Function
                 ---------------- --------------------------------------
                 000056207ec88a5a std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
                 000056207ec88559 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
                 000056207ec8745c std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
                 00007f0982d7b4b0 killpg+0x40
                 00007f0982d7b428 gsignal+0x38
                 00007f0982d7d02a abort+0x16a
                 000056207ec1add4 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
                 000056207ecdf7a8 void google::protobuf::internal::arena_delete_object<google::protobuf::Message>(void*)+0x2a38
                 000056207ecdf500 void google::protobuf::internal::arena_delete_object<google::protobuf::Message>(void*)+0x2790
                 000056207ec2ab36 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
                 000056207ec2a7d0 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > std::__1::operator+<char, std::__1::char_traits<char>, std::__1::allocator<char> >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c
        Process: 9 - sqlservr
         Thread: 73 (application thread 0xf4)
    Instance Id: a672206a-cf56-49cc-9c29-89a1990f3f36
       Crash Id: a1fba965-828c-4d91-bf1c-8cb2a0709c68
    Build stamp: 52ec5c991c015cbdd504002245460be8a9a9b6c41343aaab03cf768750a6c2df
   Distribution: Ubuntu 16.04.6 LTS
     Processors: 2
   Total Memory: 8324341760 bytes
      Timestamp: Thu Sep 19 08:41:27 2024
     Last errno: 2
Last errno text: No such file or directory

We noticed that the runner images have been updated by GitHub.

This was our GitHub Actions host machine image from a few days ago:

Runner Image
  Image: ubuntu-22.04
  Version: 20240908.1.0
  Included Software: https://github.com/actions/runner-images/blob/ubuntu22/20240908.1/images/ubuntu/Ubuntu2204-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu22%2F20240908.1

This is the version we see today, which fails:

Runner Image
  Image: ubuntu-22.04
  Version: 20240915.1.0
  Included Software: https://github.com/actions/runner-images/blob/ubuntu22/20240915.1/images/ubuntu/Ubuntu2204-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu22%2F20240915.1

I believe the Linux kernel change from 6.5 to 6.8 is the reason this is failing.

🎉 We fixed our issue by using the Ubuntu 20.04 runner image, because both 22.04 and 24.04 are affected, as you can read in this issue on GitHub: #10646

To fix Docker issues in your workflow.yml, make the following change:

runs-on: ubuntu-20.04

@kiview

kiview commented Sep 19, 2024

Related issue:
#10649

@HofmeisterAn
Contributor

It looks like another workaround is bumping the image to:

mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04

@jupjohn

jupjohn commented Sep 20, 2024

It looks like another workaround is bumping the image to:

mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04

If you don't want to upgrade to SQL Server 2022, I've confirmed that the tag 2019-CU26-ubuntu-20.04 is the lowest working version on both ubuntu-22.04 and ubuntu-24.04.
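
One way to check a candidate tag on a given runner image, independently of the test code, is a throwaway pipeline step that starts the container directly and verifies that it is still running after a short wait. The sketch below is not from this thread. The container name `mssql-smoke`, the placeholder password, the 30-second wait, and the display name are all assumptions made for illustration.

```yaml
steps:
  # Hypothetical smoke test: start SQL Server directly and check that it stays up.
  - script: |
      set -e
      docker run -d --name mssql-smoke \
        -e ACCEPT_EULA=Y \
        -e MSSQL_SA_PASSWORD='Placeholder_Passw0rd!' \
        mcr.microsoft.com/mssql/server:2019-CU26-ubuntu-20.04
      # Give sqlservr time to either finish starting or crash.
      sleep 30
      # Prints "true" while the container is running; otherwise dump the logs and fail.
      docker inspect -f '{{.State.Running}}' mssql-smoke | grep -q true || {
        docker logs mssql-smoke
        exit 1
      }
    displayName: "Smoke-test SQL Server image"
```

If the container has already exited, the step prints its logs, which is where a crash dump like the one quoted earlier in this thread would show up.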

@kishorekumar-anchala
Contributor

@SpaceOgre

It looks like another workaround is bumping the image to:

mcr.microsoft.com/mssql/server:2022-CU14-ubuntu-22.04

This works well for us and we will go with this solution. Just a heads-up to anyone else reading this:
To get a newer SQL Server image to work you will need TestContainers v3.10 or later, since sqlcmd has been moved in the newer images and the fix for that is available starting in 3.10.

@SpaceOgre

SpaceOgre commented Sep 25, 2024

@kishorekumar-anchala

Hi @SpaceOgre @BBTristanBenschop ,

Please try the workaround below.

https://github.com/RaviAkshintala/testcontainers-dotnet/actions/runs/10993413941/workflow#L38

Is the workaround to start the docker image manually and not through TestContainers? Not sure I follow completely.
The fix provided by @HofmeisterAn works well and we are going with that.

But it would be good if it were possible to specify exactly which runner image version we are using, and not just the OS version, in case something like this happens again. It was a bit of a problem this time.
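
While the exact image build apparently cannot be pinned, a step can at least record which build a run received, which makes a regression like this one easier to spot. The sketch below is not from this thread. It assumes the hosted image exposes `ImageOS` and `ImageVersion` environment variables (it prints `unknown` if they are not set), and the display name is made up.

```yaml
steps:
  # Hypothetical diagnostic step: log what this run is actually executing on.
  - script: |
      echo "Image: ${ImageOS:-unknown} ${ImageVersion:-unknown}"
      uname -r
      docker version --format '{{.Server.Version}}'
    displayName: "Log runner image, kernel and Docker versions"
```

Comparing this output between a passing and a failing run shows the same kind of difference as the kernel versions in the TestContainers logs earlier in this thread.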

@BBTristanBenschop
Author

We have also opted to bump the SQL Server image version. We don't call Docker directly; it's hidden somewhere in the TestContainers package.

@urnie

urnie commented Sep 27, 2024

I'll add my two cents here: our workflows, which start up a dockerized SQL Server 2019 (mcr.microsoft.com/mssql/server:2019-CU15-ubuntu-20.04), stopped working when ubuntu-22.04.4 was updated to ubuntu-22.04.5 (release 20240915.1 and release 20240922.1). Looking at the changelog, there are some Docker-related version updates, but I don't know whether or how they could cause this bug.

The patch number (.4 and .5) can't be specified by the developer (source), which makes sense, as they contain security updates.

This workaround by @BBTristanBenschop worked for me:

A temporary fix for us is switching to the ubuntu-20.04 image, where this issue doesn't exist.

However, I was left wondering: if the patch update from ubuntu-22.04.4 to 22.04.5 broke something, why hasn't the same (breaking) patch been applied to ubuntu-20.04, which still works?

kishorekumar-anchala added and then removed the awaiting-deployment label (Code complete; awaiting deployment and/or deployment in progress) on Oct 8, 2024

@kishorekumar-anchala
Contributor

Hi @SpaceOgre @BBTristanBenschop,

Could you try the new image version (20241006.1)? It might fix your issue.

@kishorekumar-anchala
Contributor

Hi @SpaceOgre @BBTristanBenschop,

Could you please confirm whether this issue is resolved?

@kishorekumar-anchala
Contributor

Hi @SpaceOgre @BBTristanBenschop,

We hope that this has been fixed, so we are closing this issue. Thank you!
