Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bz:2274165] Test to verify mgr does not crash after enabling rook module #10525

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

parikshithb
Copy link
Contributor

Covers bz: 2274165

@parikshithb parikshithb added Test Case A test case PR Needs Testing Run tests and provide logs link team/e2e E2E team related issues/PRs labels Sep 18, 2024
@parikshithb parikshithb requested a review from a team as a code owner September 18, 2024 13:29
@pull-request-size pull-request-size bot added the size/L PR that changes 100-499 lines label Sep 18, 2024
Copy link

openshift-ci bot commented Sep 18, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: parikshithb

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Parikshith <[email protected]>
Signed-off-by: Parikshith <[email protected]>
Signed-off-by: Parikshith <[email protected]>
Copy link
Contributor

@nagendra202 nagendra202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine for me. Can you please run a validation job?

sleep=30,
func=run_cmd_verify_cli_output,
cmd="ceph crash ls",
expected_output_lst={"mgr"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think archived crashes will still be listed in the "ceph crash ls" output. The test can fail falsely if there are any existing archived mgr crashes. Instead of checking for a specific mgr, should we check for
"expected_output_lst={"HEALTH_WARN", "daemons have recently crashed"}," and fail the test if we see this message?

Copy link
Contributor Author

@parikshithb parikshithb Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, make sense. I changed to use crash ls-new instead of crash ls which suits here. If there was a mgr crash before it would be archived and will not be displayed in crash ls-new o/p
As per ceph docs: Archive a crash report so that it is no longer considered for the RECENT_CRASH health check and does not appear in the crash ls-new output (it will still appear in the crash ls output).
https://docs.ceph.com/en/quincy/mgr/crash/#commands


"""
toolbox = pod.get_ceph_tools_pod()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete the empty line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@tier2
@brown_squad
@skipif_external_mode
@skipif_ocs_version("<4.16")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fix was backported to previous release as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: pbyregow-cn2
Cluster Configuration: conf/ocsci/live_deployment.yaml
PR Test Suite: tier2
PR Test Path: tests/functional/pod_and_daemons/test_mgr_enable_rook_backend_module.py
Additional Test Params:
OCP VERSION: 4.16-ga
OCS VERSION: 4.16
tested against branch: master

Job PASSED.

Signed-off-by: Parikshith <[email protected]>
Copy link

@ocs-ci ocs-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR validation on existing cluster

Cluster Name: pbyregow-cn2
Cluster Configuration: conf/ocsci/live_deployment.yaml
PR Test Suite: tier2
PR Test Path: tests/functional/pod_and_daemons/test_mgr_enable_rook_backend_module.py
Additional Test Params:
OCP VERSION: 4.16-ga
OCS VERSION: 4.16
tested against branch: master

Job PASSED.

@parikshithb parikshithb added Verified Mark when PR was verified and log provided and removed Needs Testing Run tests and provide logs link labels Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L PR that changes 100-499 lines team/e2e E2E team related issues/PRs Test Case A test case PR Verified Mark when PR was verified and log provided
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants