feat: delete acr and recreate if cache rule is wrong #5354
Merged
Azure Pipelines / Agentbaker E2E
succeeded
Dec 4, 2024 in 12m 25s
Build #20241204.33 had test failures
Details
- Failed: 4 (3.39%)
- Passed: 114 (96.61%)
- Other: 0 (0.00%)
- Total: 118
Annotations
Check failure on line 1 in Test_AzureLinuxV2_GPUAzureCNI
azure-pipelines / Agentbaker E2E
Test_AzureLinuxV2_GPUAzureCNI
Failed
Raw output
scenario_helpers_test.go:169: running scenario vhd: "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/AzureLinuxV2gen2/versions/1.1733166545.3214", tags {Name:Test_AzureLinuxV2_GPUAzureCNI ImageName:AzureLinuxV2gen2 OS:azurelinux Arch:amd64 Airgap:false GPU:true WASM:false ServerTLSBootstrapping:false Scriptless:false KubeletCustomConfig:false}
vmss.go:39: creating VMSS "2024-12-04-7sed-azurelinuxv2gpuazurecni" in resource group "MC_abe2e-westus3_abe2e-azure-network-1bef8_westus3"
types.go:160: using "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/AzureLinuxV2gen2/versions/1.1733166545.3214" for VHD
scenario_helpers_test.go:125: vmss 2024-12-04-7sed-azurelinuxv2gpuazurecni creation succeeded
pollers.go:29: waiting for node 2024-12-04-7sed-azurelinuxv2gpuazurecni to be ready
pollers.go:57: node 2024-12-04-7sed-azurelinuxv2gpuazurecni000000 is ready
scenario_helpers_test.go:128: node 2024-12-04-7sed-azurelinuxv2gpuazurecni is ready
pod.go:65: creating pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-test-pod"
pod.go:78: pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-test-pod" is ready
validation.go:21: node health validation: test pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-test-pod" is running on node "2024-12-04-7sed-azurelinuxv2gpuazurecni000000"
validators.go:271: validating pod using nvidia GPU
pod.go:63: truncated pod name to "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-enable-nvidia-dev"
pod.go:65: creating pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-enable-nvidia-dev"
pod.go:78: pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-enable-nvidia-dev" is ready
validators.go:296: resource "nvidia.com/gpu" is available
pod.go:63: truncated pod name to "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po"
pod.go:65: creating pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po"
pollers.go:120: -- pod metadata --
pollers.go:121: Name:
Namespace:
Node:
Status:
Start Time: <nil>
pollers.go:126: -- container(s) info --
pollers.go:130: -- pod events --
pollers.go:77: time before timeout: 8m43.35630677s
pollers.go:88: pod 2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po not found yet. Err pods "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po" not found
pollers.go:120: -- pod metadata --
pollers.go:121: Name:
Namespace:
Node:
Status:
Start Time: <nil>
pollers.go:126: -- container(s) info --
pollers.go:130: -- pod events --
pollers.go:77: time before timeout: 3m43.358003274s
pollers.go:88: pod 2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po not found yet. Err pods "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po" not found
pod.go:78: pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po" is ready
pod.go:79:
Error Trace: /mnt/vss/_work/1/s/e2e/pod.go:79
/mnt/vss/_work/1/s/e2e/validators.go:277
/mnt/vss/_work/1/s/e2e/scenario_test.go:220
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:177
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:96
/mnt/vss/_work/1/s/e2e/scenario_test.go:200
Error: Received unexpected error:
context deadline exceeded
Test: Test_AzureLinuxV2_GPUAzureCNI
Messages: failed to wait for pod "2024-12-04-7sed-azurelinuxv2gpuazurecni000000-gpu-validation-po" to
Check failure on line 1 in Test_Ubuntu2204_GPUNC
azure-pipelines / Agentbaker E2E
Test_Ubuntu2204_GPUNC
Failed
Raw output
scenario_helpers_test.go:169: running scenario vhd: "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698", tags {Name:Test_Ubuntu2204_GPUNC ImageName:2204gen2containerd OS:ubuntu Arch:amd64 Airgap:false GPU:true WASM:false ServerTLSBootstrapping:false Scriptless:false KubeletCustomConfig:false}
vmss.go:39: creating VMSS "2024-12-04-ogn7-ubuntu2204gpunc" in resource group "MC_abe2e-westus3_abe2e-kubenet-331fc_westus3"
types.go:160: using "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698" for VHD
scenario_helpers_test.go:125: vmss 2024-12-04-ogn7-ubuntu2204gpunc creation succeeded
pollers.go:29: waiting for node 2024-12-04-ogn7-ubuntu2204gpunc to be ready
pollers.go:57: node 2024-12-04-ogn7-ubuntu2204gpunc000000 is ready
scenario_helpers_test.go:128: node 2024-12-04-ogn7-ubuntu2204gpunc is ready
pod.go:65: creating pod "2024-12-04-ogn7-ubuntu2204gpunc000000-test-pod"
pod.go:78: pod "2024-12-04-ogn7-ubuntu2204gpunc000000-test-pod" is ready
validation.go:21: node health validation: test pod "2024-12-04-ogn7-ubuntu2204gpunc000000-test-pod" is running on node "2024-12-04-ogn7-ubuntu2204gpunc000000"
validators.go:271: validating pod using nvidia GPU
pod.go:63: truncated pod name to "2024-12-04-ogn7-ubuntu2204gpunc000000-enable-nvidia-device-plug"
pod.go:65: creating pod "2024-12-04-ogn7-ubuntu2204gpunc000000-enable-nvidia-device-plug"
pod.go:78: pod "2024-12-04-ogn7-ubuntu2204gpunc000000-enable-nvidia-device-plug" is ready
validators.go:296: resource "nvidia.com/gpu" is available
pod.go:65: creating pod "2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod"
pollers.go:120: -- pod metadata --
pollers.go:121: Name: 2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod
Namespace: default
Node: 2024-12-04-6e08-ubuntu2204gpua10000000
Status: Pending
Start Time: <nil>
pollers.go:123: Reason:
pollers.go:124: Message:
pollers.go:126: -- container(s) info --
pollers.go:128: Container: gpu-validation-container
Image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
Ports: []
pollers.go:130: -- pod events --
pollers.go:133: Reason: Scheduled, Message: Successfully assigned default/2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod to 2024-12-04-6e08-ubuntu2204gpua10000000, Count: 0, Last Timestamp: 0001-01-01 00:00:00 +0000 UTC
pollers.go:77: time before timeout: 5m24.078736926s
pollers.go:120: -- pod metadata --
pollers.go:121: Name:
Namespace:
Node:
Status:
Start Time: <nil>
pollers.go:126: -- container(s) info --
pollers.go:130: -- pod events --
pollers.go:77: time before timeout: 24.079296106s
pollers.go:88: pod 2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod not found yet. Err pods "2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod" not found
pod.go:78: pod "2024-12-04-ogn7-ubuntu2204gpunc000000-gpu-validation-pod" is ready
pod.go:79:
Error Trace: /mnt/vss/_work/1/s/e2e/pod.go:79
/mnt/vss/_work/1/s/e2e/validators.go:277
/mnt/vss/_work/1/s/e2e/scenario_test.go:803
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:177
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:96
/mnt/vss/_work/1/s/e2e/scenario_test.go:782
/mnt/vss/_work/1/s/e2e/scenario_test.go:769
Erro
Check failure on line 1 in Test_Ubuntu2204_GPUA100
azure-pipelines / Agentbaker E2E
Test_Ubuntu2204_GPUA100
Failed
Raw output
scenario_helpers_test.go:169: running scenario vhd: "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698", tags {Name:Test_Ubuntu2204_GPUA100 ImageName:2204gen2containerd OS:ubuntu Arch:amd64 Airgap:false GPU:true WASM:false ServerTLSBootstrapping:false Scriptless:false KubeletCustomConfig:false}
vmss.go:39: creating VMSS "2024-12-04-gwfi-ubuntu2204gpua100" in resource group "MC_abe2e-westus3_abe2e-kubenet-331fc_westus3"
types.go:160: using "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698" for VHD
vmss.go:75:
Error Trace: /mnt/vss/_work/1/s/e2e/vmss.go:75
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:122
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:95
/mnt/vss/_work/1/s/e2e/scenario_test.go:782
/mnt/vss/_work/1/s/e2e/scenario_test.go:773
Error: Received unexpected error:
PUT https://management.azure.com/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-331fc_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/2024-12-04-gwfi-ubuntu2204gpua100
--------------------------------------------------------------------------------
RESPONSE 409: 409 Conflict
ERROR CODE: OperationNotAllowed
--------------------------------------------------------------------------------
{
"error": {
"code": "OperationNotAllowed",
"message": "Operation could not be completed as it results in exceeding approved StandardNCADSA100v4Family Cores quota. Additional details - Deployment Model: Resource Manager, Location: WestUS3, Current Limit: 50, Current Usage: 48, Additional Required: 24, (Minimum) New Limit Required: 72. Setup Alerts when Quota reaches threshold. Learn more at https://aka.ms/quotamonitoringalerting . Submit a request for Quota increase at https://aka.ms/ProdportalCRP/#blade/Microsoft_Azure_Capacity/UsageAndQuota.ReactView/Parameters/%7B%22subscriptionId%22:%228ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8%22,%22command%22:%22openQuotaApprovalBlade%22,%22quotas%22:[%7B%22location%22:%22WestUS3%22,%22providerId%22:%22Microsoft.Compute%22,%22resourceName%22:%22StandardNCADSA100v4Family%22,%22quotaRequest%22:%7B%22properties%22:%7B%22limit%22:72,%22unit%22:%22Count%22,%22name%22:%7B%22value%22:%22StandardNCADSA100v4Family%22%7D%7D%7D%7D]%7D by specifying parameters listed in the ‘Details’ section for deployment to succeed. Please read more about quota limits at https://docs.microsoft.com/en-us/azure/azure-supportability/per-vm-quota-requests"
}
}
--------------------------------------------------------------------------------
Test: Test_Ubuntu2204_GPUA100
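Unlike the other failures, Test_Ubuntu2204_GPUA100 is a plain capacity problem: the StandardNCADSA100v4Family quota in WestUS3 is 50 cores, 48 are already in use, and the scale set needs 24 more, so the PUT is rejected with 409 OperationNotAllowed before any node exists. The arithmetic behind the "(Minimum) New Limit Required: 72" figure in the response can be sketched as follows; `requiredQuota` is a hypothetical helper, not part of the e2e suite:

```go
package main

import "fmt"

// requiredQuota mirrors the arithmetic in the 409 response: a request
// fits only if current usage plus the additional cores stays within the
// limit; otherwise the minimum new limit is usage + required.
func requiredQuota(limit, usage, required int) (fits bool, newLimit int) {
	if usage+required <= limit {
		return true, limit
	}
	return false, usage + required
}

func main() {
	// Values from the error above: Limit 50, Usage 48, Additional Required 24.
	fits, newLimit := requiredQuota(50, 48, 24)
	fmt.Println(fits, newLimit) // false 72
}
```

Checking regional vCPU usage against family quotas before provisioning GPU scale sets would let a run fail fast (or pick another region) instead of consuming a test slot on a doomed deployment.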
Check failure on line 1 in Test_Ubuntu2204_GPUGridDriver
azure-pipelines / Agentbaker E2E
Test_Ubuntu2204_GPUGridDriver
Failed
Raw output
scenario_helpers_test.go:169: running scenario vhd: "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698", tags {Name:Test_Ubuntu2204_GPUGridDriver ImageName:2204gen2containerd OS:ubuntu Arch:amd64 Airgap:false GPU:true WASM:false ServerTLSBootstrapping:false Scriptless:false KubeletCustomConfig:false}
vmss.go:39: creating VMSS "2024-12-04-6l41-ubuntu2204gpugriddriver" in resource group "MC_abe2e-westus3_abe2e-kubenet-331fc_westus3"
types.go:160: using "/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2204gen2containerd/versions/1.1733181276.698" for VHD
scenario_helpers_test.go:125: vmss 2024-12-04-6l41-ubuntu2204gpugriddriver creation succeeded
pollers.go:29: waiting for node 2024-12-04-6l41-ubuntu2204gpugriddriver to be ready
pollers.go:57: node 2024-12-04-6l41-ubuntu2204gpugriddriver000000 is ready
scenario_helpers_test.go:128: node 2024-12-04-6l41-ubuntu2204gpugriddriver is ready
pod.go:65: creating pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-test-pod"
pod.go:78: pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-test-pod" is ready
validation.go:21: node health validation: test pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-test-pod" is running on node "2024-12-04-6l41-ubuntu2204gpugriddriver000000"
validators.go:271: validating pod using nvidia GPU
pod.go:63: truncated pod name to "2024-12-04-6l41-ubuntu2204gpugriddriver000000-enable-nvidia-dev"
pod.go:65: creating pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-enable-nvidia-dev"
pod.go:78: pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-enable-nvidia-dev" is ready
validators.go:296: resource "nvidia.com/gpu" is available
pod.go:63: truncated pod name to "2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po"
pod.go:65: creating pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po"
pollers.go:120: -- pod metadata --
pollers.go:121: Name: 2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po
Namespace: default
Node: 2024-12-04-846p-azurelinuxv2gpu000000
Status: Pending
Start Time: <nil>
pollers.go:123: Reason:
pollers.go:124: Message:
pollers.go:126: -- container(s) info --
pollers.go:128: Container: gpu-validation-container
Image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
Ports: []
pollers.go:130: -- pod events --
pollers.go:133: Reason: Scheduled, Message: Successfully assigned default/2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po to 2024-12-04-846p-azurelinuxv2gpu000000, Count: 0, Last Timestamp: 0001-01-01 00:00:00 +0000 UTC
pollers.go:77: time before timeout: 6m23.457743142s
pollers.go:120: -- pod metadata --
pollers.go:121: Name:
Namespace:
Node:
Status:
Start Time: <nil>
pollers.go:126: -- container(s) info --
pollers.go:130: -- pod events --
pollers.go:77: time before timeout: 1m22.459936479s
pollers.go:88: pod 2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po not found yet. Err pods "2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po" not found
pod.go:78: pod "2024-12-04-6l41-ubuntu2204gpugriddriver000000-gpu-validation-po" is ready
pod.go:79:
Error Trace: /mnt/vss/_work/1/s/e2e/pod.go:79
/mnt/vss/_work/1/s/e2e/validators.go:277
/mnt/vss/_work/1/s/e2e/scenario_test.go:831
/mnt/vss/_work/1/s/e2e/scenario_helpers_test.go:177