NO-ISSUE: moving images to be multiplatform #19

Open: wants to merge 43 commits into master
Conversation

@tsorya (Owner) commented Jun 28, 2021

No description provided.

machacekondra and others added 30 commits July 4, 2021 08:55
)

This commit adds a retry mechanism while waiting for the operator to be
ready. If we apply the operator's CR, it may happen (bug 1968606)
that OLM reports the Failed state even though the operator is actually
progressing, so we decided to ignore the Failed state a few times.
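The retry idea described above can be sketched as follows. This is a minimal illustration, not the actual assisted-installer code; `waitForOperator`, the status constants, and the failure budget are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// operatorStatus is a stand-in for the status reported by OLM.
type operatorStatus int

const (
	statusProgressing operatorStatus = iota
	statusFailed
	statusAvailable
)

// waitForOperator polls the operator status and tolerates a few
// transient Failed reports before giving up, since OLM may report
// Failed while the operator is actually progressing.
func waitForOperator(poll func() operatorStatus, maxFailures int, interval time.Duration) error {
	failures := 0
	for {
		switch poll() {
		case statusAvailable:
			return nil
		case statusFailed:
			failures++
			if failures > maxFailures {
				return fmt.Errorf("operator reported Failed %d times", failures)
			}
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate OLM reporting Failed twice while actually progressing.
	reports := []operatorStatus{statusFailed, statusProgressing, statusFailed, statusAvailable}
	i := 0
	err := waitForOperator(func() operatorStatus {
		s := reports[i]
		if i < len(reports)-1 {
			i++
		}
		return s
	}, 3, time.Millisecond)
	fmt.Println("err:", err) // err: <nil>
}
```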
…ess (openshift#298)

* MGMT-6663: Update progress of operators only when there's a new progress

* MGMT-6663: Enhance log operator update status
- Minor log changes on `setting EFI boot order unsupported`
 - In the `SetBootOrder` function, pass `liveLogger=nil` to avoid an unneeded log at that stage
   (Failed executing nsenter...)
- Minor change to the log regarding boot order setup on BIOS systems
Older versions of Go are out of support, so for security compliance we were trying to get all components onto the latest version. Go 1.14 is already out of support; see https://endoflife.date/go
Switch to a generic `isOperatorAvailable` function that can be used
with any class that implements `OperatorHandler`.

This function gets the operator status from the service; if the status
is available, it stops running. Otherwise it gets the operator status
locally and checks whether it differs from the status at the service; if
so, it sends an update to the service.

There are 3 implementations of the `OperatorHandler` interface:

- ClusterOperatorHandler (such as console)
- ClusterVersionHandler (only CVO)
- ClusterServiceVersionHandler (such as OCS, LSO)
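The flow described above can be sketched in Go. The interface and status names here are illustrative stand-ins, not the real assisted-installer types; the fake handler exists only to show the call pattern.

```go
package main

import "fmt"

// OperatorStatus mirrors the status values exchanged with the service
// (names are illustrative, not the real API types).
type OperatorStatus string

const Available OperatorStatus = "available"

// OperatorHandler abstracts the three concrete handlers
// (ClusterOperatorHandler, ClusterVersionHandler, ClusterServiceVersionHandler).
type OperatorHandler interface {
	Name() string
	GetStatusFromService() OperatorStatus // status recorded by assisted-service
	GetStatusLocally() OperatorStatus     // status observed on the cluster
	SendUpdate(OperatorStatus) error
}

// isOperatorAvailable reports whether the operator is done, pushing a
// status update to the service only when the local status has changed.
func isOperatorAvailable(h OperatorHandler) (bool, error) {
	serviceStatus := h.GetStatusFromService()
	if serviceStatus == Available {
		return true, nil // already recorded as available: stop running
	}
	local := h.GetStatusLocally()
	if local != serviceStatus {
		if err := h.SendUpdate(local); err != nil {
			return false, err
		}
	}
	return local == Available, nil
}

// fakeHandler is a toy implementation used to demonstrate the flow.
type fakeHandler struct {
	service, local OperatorStatus
	sent           []OperatorStatus
}

func (f *fakeHandler) Name() string                         { return "console" }
func (f *fakeHandler) GetStatusFromService() OperatorStatus { return f.service }
func (f *fakeHandler) GetStatusLocally() OperatorStatus     { return f.local }
func (f *fakeHandler) SendUpdate(s OperatorStatus) error    { f.sent = append(f.sent, s); return nil }

func main() {
	h := &fakeHandler{service: "progressing", local: Available}
	done, _ := isOperatorAvailable(h)
	fmt.Println(done, h.sent) // true [available]
}
```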
…ift#332)

The getProgressingOLMOperators method call could return an error when
the assisted-service API is unavailable. In that case we wouldn't update
the status of the pending operators, and the cluster could hang in the
finalizing state.
…ft#329)

* MGMT-4893: Add Must-Gather reports when olm controllers fail

  - Add support for a JSON-formatted MUST_GATHER_IMAGE variable
  - Backward compatibility with other formats of MUST_GATHER_IMAGE
  - When one of the OLM operators fails or times out, it is marked in the controller status
  - At the end of the installation process (either normal or aborted) we check whether a must-gather
    report should be collected, and with what scope

* NO-ISSUE: correcting typo in log

Co-authored-by: Yuval Goldberg <[email protected]>

Co-authored-by: Yuval Goldberg <[email protected]>
…rap kube-apiserver though the kube-apiserver moved to one of the masters (openshift#327)

On the bootstrap node the assisted-installer uses the loopback kubeconfig
to query the kube-apiserver for the number of ready master nodes.

Usually both master nodes join the cluster and become ready before
bootkube takes down the bootstrap control plane, so the loopback kubeconfig works.
But if cluster bootstrap finishes before the 2 master nodes
are ready, the assisted-installer will wait forever, since it is
using the loopback kubeconfig while the bootstrap control plane is down,
resulting in "connection refused".

The assisted-installer should query the kube-apiserver running on
one of the master nodes; for that to work it should use the real
kubeconfig instead of the loopback kubeconfig.
… on Cluster Version Operator (openshift#334)

Fix for a very specific case where the CVO keeps emitting new messages
but is in reality stuck.
Adds a CVOMaxTimeout of 3 hours.
openshift#335)

This will allow both string and JSON input for the must-gather image
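Accepting either a plain string or JSON for the must-gather image could look like the sketch below. The function name and the `"ocp"` default key are assumptions for illustration; the real parsing in the controller may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseMustGatherImage accepts either a plain image reference or a
// JSON map of operator name to image, keeping backward compatibility
// with the older string-only MUST_GATHER_IMAGE format.
func parseMustGatherImage(raw string) map[string]string {
	images := map[string]string{}
	if err := json.Unmarshal([]byte(raw), &images); err == nil {
		return images // valid JSON map: one image per operator
	}
	// Not JSON: treat the whole value as a single default image
	// ("ocp" as the default key is an assumption for this example).
	images["ocp"] = raw
	return images
}

func main() {
	m1 := parseMustGatherImage(`{"cnv": "registry.example/cnv-must-gather"}`)
	m2 := parseMustGatherImage("registry.example/ocp-must-gather")
	fmt.Println(m1["cnv"], m2["ocp"])
}
```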
Right now, if the kube-apiserver is not reachable, we do not send
any logs. This code changes that: from now on, a kube-api error will
itself be sent as a log
…penshift#339)

We have a workaround that deletes a service that took the address of the
DNS service. Until now it supported only IPv4; this change adds IPv6 support
…-env for the host (openshift#336)

Assisted-installer will get the infra-env-id as part of the install command arguments, and will use it
to update host progress and to download the host ignition.
Assisted-installer-controller will use each host's InfraEnvID field to update its progress in
assisted-service
Adds a function that replaces the token value with <SECRET>
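A minimal sketch of that token scrubbing follows. The function name and the regex pattern are assumptions; the installer's actual pattern for locating the token may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// tokenRe matches a token flag and its value in a command line or log
// entry (the exact pattern used by the installer may differ).
var tokenRe = regexp.MustCompile(`(-{1,2}token[ =])\S+`)

// scrubToken replaces the token value with <SECRET> before logging.
func scrubToken(s string) string {
	return tokenRe.ReplaceAllString(s, "${1}<SECRET>")
}

func main() {
	fmt.Println(scrubToken("installer --token eyJhbGciOi.xxx --url https://example"))
	// installer --token <SECRET> --url https://example
}
```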
…end (openshift#342)

joined status to the cluster.
Now they will send joined and only then done
…pping (openshift#346)

* NO-ISSUE: remove obsolete installation-timeout parameter

* MGMT-7635: Fix logs gathering on SNO when failing to complete bootstrapping
The log_sender command failed to mount /root/.ssh due to a "no such file
or directory" error.
This code ensures the directory gets created once the bootstrap flow begins
This PR fixes the manifest JSON parsing and improves logging.
…stalling with IPv6 (openshift#350)

Updated the regex to allow more characters between the host IP and 'Ignition'.
This is required because in the MCS log the host IP is logged as a scoped literal IPv6 address,
e.g. [fe80::ff:fe9d:12ac%ens3]:42692
This should also allow master nodes to get updated to 'Configuring'
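The relaxed matching can be sketched as below. The pattern and helper name are illustrative only; the real regex in the installer is different, but the point is the same: tolerate arbitrary characters (such as a zone suffix and port) between the host IP and the word 'Ignition'.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesHost reports whether an MCS log line refers to the given host
// IP, allowing extra characters (scope zone, port, brackets) between
// the IP and the word "Ignition".
func matchesHost(logLine, hostIP string) bool {
	re := regexp.MustCompile(regexp.QuoteMeta(hostIP) + `[^ ]*.*Ignition`)
	return re.MatchString(logLine)
}

func main() {
	// Scoped literal IPv6 address as seen in the MCS log.
	line := `[fe80::ff:fe9d:12ac%ens3]:42692 "GET /config/master" Ignition`
	fmt.Println(matchesHost(line, "fe80::ff:fe9d:12ac")) // true
}
```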
SetBootOrder uses efibootmgr to select the correct device.
The specified loader should be set to the appropriate EFI file
for the runtime CPU architecture, i.e.
x86_64 -> shimx64.efi
arm64 -> shimaa64.efi
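The architecture-to-loader mapping can be expressed as a small helper; the function name is hypothetical, and note that Go reports x86_64 as `amd64` in `runtime.GOARCH`.

```go
package main

import (
	"fmt"
	"runtime"
)

// shimForArch picks the EFI loader file for the CPU architecture,
// per the mapping above (Go's GOARCH uses "amd64" for x86_64).
func shimForArch(arch string) (string, error) {
	switch arch {
	case "amd64":
		return "shimx64.efi", nil
	case "arm64":
		return "shimaa64.efi", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", arch)
	}
}

func main() {
	shim, err := shimForArch(runtime.GOARCH)
	fmt.Println(shim, err)
}
```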
mkowalski and others added 13 commits September 22, 2021 07:23
…enshift#359)

This commit adds an ability to match nodes during the installation using
their IP addresses as well as reported hostnames.

Currently only the hostname of the node is taken into account and
compared against the known inventory. With this PR we add a
feature that, in case of a name mismatch, scans the IP
addresses of the reporting node and of the nodes in the inventory and,
if a match is found, accepts the node.

This covers cases where the node name in the inventory is not an
exact match for the name reported by the node itself.

Contributes-to: MGMT-7315
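The hostname-then-IP fallback can be sketched as follows. The types and function are illustrative; the installer works with its own inventory structs.

```go
package main

import "fmt"

// inventoryHost is a toy stand-in for an inventory entry.
type inventoryHost struct {
	Hostname string
	IPs      []string
}

// matchNode finds the inventory host for a reporting node: first by
// hostname, then, on a name mismatch, by any overlapping IP address.
func matchNode(nodeName string, nodeIPs []string, inventory []inventoryHost) (inventoryHost, bool) {
	for _, h := range inventory {
		if h.Hostname == nodeName {
			return h, true
		}
	}
	// Name mismatch: fall back to scanning IP addresses.
	ipSet := map[string]bool{}
	for _, ip := range nodeIPs {
		ipSet[ip] = true
	}
	for _, h := range inventory {
		for _, ip := range h.IPs {
			if ipSet[ip] {
				return h, true
			}
		}
	}
	return inventoryHost{}, false
}

func main() {
	inv := []inventoryHost{{Hostname: "master-0.example", IPs: []string{"10.0.0.5"}}}
	// Node reports a short name, but shares an IP with the inventory entry.
	h, ok := matchNode("master-0", []string{"10.0.0.5"}, inv)
	fmt.Println(h.Hostname, ok) // master-0.example true
}
```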
…unicating with assisted-service (openshift#358)

This PR mainly converts the inventory_client to use V2 APIs instead of V1 for all of its
internal implementation. Since many functions that access host data now need the InfraEnvID,
it is taken from the configuration of the assisted-installer (set during the install command)
The original plan was to move all images to ubi8. This is not possible due to the lack
of some packages that are needed for other projects. We are now going to switch all images
to stream8 in the hope that consistency across repos will prevent (or help with)
debugging current/future issues in CI.

The goal is to keep the components' builds as consistent as possible across the channels we
release them on

Signed-off-by: Flavio Percoco <[email protected]>

Co-authored-by: Flavio Percoco <[email protected]>
…ace (openshift#291)

* Bug 1966621: Do not use run-level label for assisted-installer namespace

Namespaces using the run-level label are considered to be highly privileged,
and Security Context Constraints are not applied to the workloads running
in them.

One of the deployment models for assisted-installer uses a cluster
deployed by the AI service to deploy the next clusters. In this scenario, if
the same `assisted-installer` namespace is used for deploying the Assisted
Service Operator, the pods do not get any securityContext properties
applied. Apart from the potential security violations, this causes
functional errors, e.g. the Postgres container running with the wrong UID.

This PR changes the configuration of the `assisted-installer` namespace so
that it does not have the run-level label applied and is treated like any
other customer namespace.

Contributes-to: OCPBUGSM-29833

* Bug 1966621: Clean up the code after run-level label removal

With the `run-level` label completely dropped, we now remove
the remaining logic handling it in the post-installation steps.

Contributes-to: OCPBUGSM-29833

* Bug 1966621: Allow assisted-installer service account to use SCCs

This commit adds additional permissions to the service account used by
the assisted-installer-controller. As we no longer override Security
Context Constraints for the whole assisted-installer namespace, we
add explicit permissions to the account used to run the AI controller
pod.

Contributes-to: OCPBUGSM-29833
…as moved in GitHub (openshift#365)

The https://github.com/irifrance/gin repo we indirectly depended on via our
`github.com/operator-framework/operator-lifecycle-manager v0.18.0` dependency
has moved to https://github.com/go-air/gin. This invalid dependency was fixed in newer
versions of operator-lifecycle-manager, but I prefer just fixing this issue
with a `replace` directive rather than dealing with an OLM upgrade.

Without this `replace` directive, attempting to work on the installer repository
locally causes my IDE / go commands to complain about irifrance/gin being gone;
this `replace` directive fixes those issues.
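A `replace` directive of this shape in `go.mod` would redirect the moved module; the version shown here is purely illustrative, not the one actually pinned in the PR.

```
// go.mod: redirect the moved repository to its new home.
// The version tag below is an illustrative placeholder.
replace github.com/irifrance/gin => github.com/go-air/gin v0.4.0
```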
…on failure (openshift#368)

The LogURL field is filled if any instance of the logs was generated.
If a cluster install fails due to the bootstrap node being stuck,
the logs are generated by the 2 masters at the end of the installation process
(writing to disk). In that case the code brings up the send_logs command directly via podman,
and the infra-env parameter should be passed there as well