Add basic OpenShift support and setup [Part 1] #1955

Open
wants to merge 5 commits into base: master

Conversation

tnozicka
Member

@tnozicka tnozicka commented Jun 5, 2024

Description of your changes:
This PR brings the initial support for running on OpenShift.

It also fixes the operator when run on platforms where OwnerReferencesPermissionEnforcement is enabled.

It also moves all persistent storage to /var/lib which is available on all OSes, including atomic ones / CoreOS.

Note: This is part 1, which wires everything together and fixes most of the issues. Some tests that fail only on OpenShift still fail and need individual follow-ups. We also need to wire in proper CPU limits; without them, a lot of tests flake while waiting for certs to be generated.

Which issue is resolved by this Pull Request:
Resolves #713 #1935

@tnozicka tnozicka added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jun 5, 2024
Contributor

@tnozicka: GitHub didn't allow me to request PR reviews from the following users: tnozicka.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Description of your changes:
TODO

Which issue is resolved by this Pull Request:
Resolves #713 #1935

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@scylla-operator-bot scylla-operator-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 5, 2024
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tnozicka

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@scylla-operator-bot scylla-operator-bot bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2024
@tnozicka tnozicka force-pushed the fix-openshift branch 2 times, most recently from 0afcbff to 3b42f12 Compare June 14, 2024 10:09
@tnozicka tnozicka marked this pull request as draft June 14, 2024 11:14
@scylla-operator-bot scylla-operator-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 24, 2024
@tnozicka tnozicka force-pushed the fix-openshift branch 3 times, most recently from a9522f7 to 3fd195c Compare June 28, 2024 12:45
@scylla-operator-bot scylla-operator-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 9, 2024
@scylla-operator-bot scylla-operator-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 19, 2024
@tnozicka tnozicka force-pushed the fix-openshift branch 3 times, most recently from 75a8f53 to 8513810 Compare July 26, 2024 14:57
@tnozicka tnozicka marked this pull request as ready for review August 7, 2024 11:43
@tnozicka tnozicka force-pushed the fix-openshift branch 2 times, most recently from 0a08384 to 1475312 Compare August 9, 2024 09:09
hack/ci-deploy.sh (outdated, resolved)
@scylladb scylladb deleted a comment from scylla-operator-bot bot Aug 9, 2024
@scylladb scylladb deleted a comment from scylla-operator-bot bot Aug 9, 2024
@tnozicka tnozicka force-pushed the fix-openshift branch 4 times, most recently from 1dccbce to 7a2ea0e Compare December 27, 2024 13:54
@tnozicka tnozicka force-pushed the fix-openshift branch 2 times, most recently from 5800971 to 169c28a Compare December 27, 2024 15:12
@tnozicka tnozicka changed the title [WIP] Fix OpenShift [Part 1] Fix OpenShift [Part 1] Dec 27, 2024
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 27, 2024
@tnozicka
Member Author

/test e2e-openshift-aws-parallel
/test e2e-openshift-aws-serial
(out of interest, but expected to fail on some)

hack/.ci/run-e2e-openshift-aws.sh (outdated, resolved)
export SCYLLA_OPERATOR_FEATURE_GATES

for i in "${!KUBECONFIGS[@]}"; do
  KUBECONFIG="${KUBECONFIGS[$i]}" DEPLOY_DIR="${ARTIFACTS}/deploy/${i}" timeout --foreground -v 10m "${parent_dir}/../ci-deploy.sh" "${SO_IMAGE}" &
Member

we should either add a dedicated script running ci-deploy-release.sh and set it up in CI or at the very least have a tracking issue

Member Author

for openshift? the manifests are shared

Member

@rzetelskik rzetelskik Dec 27, 2024

Does the split for GKE only come from the fact that ci-deploy.sh was there before the periodics, or is there a different reason that doesn't apply to OpenShift? In other words, why would GKE periodics run a different script than the OpenShift ones? (GKE periodics run ci-deploy-release.sh through run-e2e-gke-release.sh, while this runs ci-deploy.sh for both presubmits and periodics.)

Member Author

@tnozicka tnozicka Dec 28, 2024

ci-deploy-release.sh uses the manifests from a real commit (which is not there for PRs because it's a local merge), so PRs use ci-deploy.sh, by design.

run-e2e-openshift-aws.sh is the entrypoint for presubmits (actually wired for periodics as well at this point)
run-e2e-openshift-aws-release.sh needs to be added later on and wired for postsubmits and periodics
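
For context on the hunk discussed in this thread: the loop starts one `ci-deploy.sh` per kubeconfig in the background. The excerpt does not show how those jobs are collected afterwards, so the following is only a minimal sketch of the general fan-out-and-wait pattern in bash, not the code from this PR. The `deploy` function is a hypothetical stand-in for the real deploy script, and `run_deploys` is an illustrative name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for "${parent_dir}/../ci-deploy.sh".
# It fails for one specific input so the failure path can be exercised.
deploy() {
  local kubeconfig="$1"
  [[ "${kubeconfig}" != "bad.kubeconfig" ]]
}

# Start one deploy per kubeconfig in the background, then wait for every PID
# and print how many of them exited with a non-zero status.
run_deploys() {
  local -a pids=()
  local kubeconfig pid
  local failures=0

  for kubeconfig in "$@"; do
    deploy "${kubeconfig}" &
    pids+=("$!")
  done

  for pid in "${pids[@]}"; do
    # `wait PID` returns that job's exit status; the `||` branch keeps
    # `set -e` from aborting on the first failed deploy.
    wait "${pid}" || failures=$((failures + 1))
  done

  echo "${failures}"
}
```

Under these assumptions a caller could run `failures="$(run_deploys "${KUBECONFIGS[@]}")"` and fail the job when the count is non-zero. Waiting on each PID individually, rather than using a bare `wait`, is what makes the per-cluster exit status observable.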

test/e2e/set/nodeconfig/nodeconfig_disksetup.go (outdated, resolved)
@rzetelskik
Member

Also please make the PR title a bit more descriptive

@tnozicka tnozicka changed the title Fix OpenShift [Part 1] Add basic OpenShift support [Part 1] Dec 27, 2024
@tnozicka tnozicka changed the title Add basic OpenShift support [Part 1] Add basic OpenShift support and setup[Part 1] Dec 27, 2024
@tnozicka tnozicka changed the title Add basic OpenShift support and setup[Part 1] Add basic OpenShift support and setup [Part 1] Dec 27, 2024
@tnozicka
Member Author

/test e2e-openshift-aws-serial
(cluster provisioning failed)

@tnozicka
Member Author

/retest
e2e-gke-release-script-latest should be unborked now

@tnozicka
Member Author

tnozicka commented Dec 27, 2024

/test e2e-openshift-aws-parallel
/test e2e-openshift-aws-serial
(for reference - some of the failures are races / easy wins for the next PR)

Contributor

scylla-operator-bot bot commented Dec 27, 2024

@tnozicka: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-openshift-aws-serial | f0db302 | link | false | /test e2e-openshift-aws-serial |
| ci/prow/e2e-openshift-aws-parallel | f0db302 | link | false | /test e2e-openshift-aws-parallel |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@tnozicka
Member Author

Contributor

@tnozicka: Overrode contexts on behalf of tnozicka: ci/prow/e2e-gke-release-script-latest

In response to this:

ci/prow/e2e-gke-release-script-latest

this has to fail because it deploys local-csi-driver from a released image and we have changed the mount paths

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1955/pull-scylla-operator-master-e2e-gke-release-script-latest/1872752453217161216#1:test-build-log.txt%3A287

note: 'MountVolume.SetUp failed for volume "volumes-dir" : hostPath type check failed: /mnt/persistent-volumes is not a directory'

https://gcsweb.scylla-operator.scylladb.com/gcs/scylla-operator-prow/pr-logs/pull/scylladb_scylla-operator/1955/pull-scylla-operator-master-e2e-gke-release-script-latest/1872752453217161216/artifacts/must-gather/0/namespaces/local-csi-driver/events.events.k8s.io/local-csi-driver-jtsx9.1815258d6c62fdda.yaml
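
To illustrate the failure quoted above: the message says the path given for the hostPath volume is not a directory on the node, which fits the mount paths having moved while the released image still expects the old one. A rough shell approximation of that check follows, assuming the volume is declared with hostPath `type: Directory`. `check_hostpath_dir` is a hypothetical helper written for this illustration, not part of the operator or of Kubernetes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: approximates a hostPath `type: Directory` check by
# testing whether the given path exists and is a directory.
# The failure message only mirrors the wording quoted in the event above.
check_hostpath_dir() {
  local path="$1"

  if [[ -d "${path}" ]]; then
    echo "ok: ${path} is a directory"
    return 0
  fi

  echo "hostPath type check failed: ${path} is not a directory"
  return 1
}
```

Running `check_hostpath_dir /mnt/persistent-volumes` on a node prepared with the new layout would be expected to take the failure branch, since this PR moves persistent storage under /var/lib.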

/override ci/prow/e2e-gke-release-script-latest


@tnozicka tnozicka requested a review from rzetelskik December 27, 2024 21:52
@rzetelskik
Member

(one outstanding comment)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

ScyllaDB installation on openshift
3 participants