WIP: [kserve] Add granite-7b-lab for vllm single model #581

Open
sjmonson wants to merge 4 commits into main from kserve/granite-7b-lab

@sjmonson (Contributor) commented Nov 5, 2024

No description provided.

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Nov 5, 2024

openshift-ci bot commented Nov 5, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kpouget for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

topsail-bot bot commented Nov 5, 2024

Jenkins Job #1603

🟢 Test of 'rhoai test test_ci' succeeded after 00 hours 36 minutes 56 seconds. 🟢

• Link to the test results.

• Link to the reports index.

Test configuration:

# RHOAI: run kserve test test_ci
PR_POSITIONAL_ARGS: vllm_instructlab_single_model_gating
PR_POSITIONAL_ARG_0: kserve-perf-ci
PR_POSITIONAL_ARG_1: vllm_instructlab_single_model_gating
matbench.lts.opensearch.export.enabled: false

• Link to the Rebuild page.

[Test ran on the internal Perflab CI]

sjmonson force-pushed the kserve/granite-7b-lab branch 2 times, most recently from 9e6af0f to 74b16c8, on November 5, 2024 23:22

topsail-bot bot commented Nov 6, 2024

Jenkins Job #1604

🟢 Test of 'rhoai test test_ci' succeeded after 01 hours 03 minutes 55 seconds. 🟢

• Link to the test results.

• Link to the reports index.

Test configuration:

# RHOAI: run kserve test test_ci
PR_POSITIONAL_ARGS: vllm_instructlab_single_model_gating
PR_POSITIONAL_ARG_0: kserve-perf-ci
PR_POSITIONAL_ARG_1: vllm_instructlab_single_model_gating

• Link to the Rebuild page.

[Test ran on the internal Perflab CI]

topsail-bot bot commented Nov 13, 2024

Jenkins Job #1647

🟢 Test of 'rhoai test test_ci' succeeded after 01 hours 07 minutes 28 seconds. 🟢

• Link to the test results.

• Link to the reports index.

Test configuration:

# RHOAI: run kserve test test_ci
PR_POSITIONAL_ARGS: vllm_instructlab_single_model_gating
PR_POSITIONAL_ARG_0: kserve-perf-ci
PR_POSITIONAL_ARG_1: vllm_instructlab_single_model_gating

• Link to the Rebuild page.

[Test ran on the internal Perflab CI]

topsail-bot bot commented Nov 13, 2024

Jenkins Job #1648

🔴 Test of 'rhoai test test_ci' failed after 01 hours 07 minutes 15 seconds. 🔴

• Link to the test results.

• Link to the reports index.

Test configuration:

# RHOAI: run kserve test test_ci
PR_POSITIONAL_ARGS: vllm_instructlab_single_model_gating
PR_POSITIONAL_ARG_0: kserve-perf-ci
PR_POSITIONAL_ARG_1: vllm_instructlab_single_model_gating

• Link to the Rebuild page.

Failure indicator:

/logs/artifacts/002_test_ci/000__local_ci__run_multi_e2e_perf_test/FAILURE | [000__local_ci__run_multi_e2e_perf_test] ./run_toolbox.py from_config local_ci run_multi --suffix=deploy_and_test_sequentially --extra={} --> 2
/logs/artifacts/002_test_ci/000__local_ci__run_multi_e2e_perf_test/artifacts/ci-pod-0/002__granite-3-0-8b-instruct/002__kserve__deploy_model/FAILURE | [002__kserve__deploy_model] ./run_toolbox.py kserve deploy_model --namespace=kserve-e2e-perf --runtime=vllm --model_name=granite-3-0-8b-instruct --sr_name=vllm --sr_kserve_image=quay.io/modh/vllm@sha256:a8ba53e1b12309913cd958331dd8dda7f2b1fad39f5350d3c722608835e14512 --inference_service_name=granite-3-0-8b-instruct --delete_others=True --raw_deployment=True --> 2
/logs/artifacts/002_test_ci/000__local_ci__run_multi_e2e_perf_test/artifacts/ci-pod-0/002__granite-3-0-8b-instruct/FAILURE | granite-3-0-8b-instruct failed: CalledProcessError: Command 'set -o errexit;set -o pipefail;set -o nounset;set -o errtrace;ARTIFACT_DIR="/logs/artifacts/002__granite-3-0-8b-instruct" ./run_toolbox.py kserve deploy_model --namespace='kserve-e2e-perf' --runtime='vllm' --model_name='granite-3-0-8b-instruct' --sr_name='vllm' --sr_kserve_image='quay.io/modh/vllm@sha256:a8ba53e1b12309913cd958331dd8dda7f2b1fad39f5350d3c722608835e14512' --inference_service_name='granite-3-0-8b-instruct' --delete_others='True' --raw_deployment='True'' returned non-zero exit status 2.
/logs/artifacts/002_test_ci/003__plots/001__projects.kserve.visualizations.kserve-prom__all/FAILURE | An error happened during the results parsing, aborting the visualization (0_matbench_parse.log).
/logs/artifacts/002_test_ci/003__plots/002__projects.kserve.visualizations.kserve-prom__002__granite-3-0-8b-instruct/FAILURE | An error happened during the results parsing, aborting the visualization (0_matbench_parse.log).
/logs/artifacts/002_test_ci/003__plots/FAILURE | RuntimeError: An error happened during the results parsing, aborting the visualization (0_matbench_parse.log).
Traceback (most recent call last):
  File "/opt/topsail/src/projects/kserve/testing/test.py", line 98, in test_ci
    test_e2e.test_ci()
  File "/opt/topsail/src/projects/kserve/testing/test_e2e.py", line 110, in test_ci

[...]

[Test ran on the internal Perflab CI]

topsail-bot bot commented Nov 14, 2024

Jenkins Job #1655

🟢 Test of 'rhoai test test_ci' succeeded after 02 hours 02 minutes 55 seconds. 🟢

• Link to the test results.

• Link to the reports index.

Test configuration:

# RHOAI: run kserve test test_ci
PR_POSITIONAL_ARGS: vllm_instructlab_single_model_gating
PR_POSITIONAL_ARG_0: kserve-perf-ci
PR_POSITIONAL_ARG_1: vllm_instructlab_single_model_gating
matbench.lts.opensearch.export.enabled: false

• Link to the Rebuild page.

[Test ran on the internal Perflab CI]

Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.