SGLang Integration + Accuracy Tests, Restructure app_tests/integration_tests #570

Merged
merged 4 commits into from
Nov 19, 2024


stbaione
Contributor

Description

This PR implements integration tests for the Shortfin LLM Server with the SGLang integration. The tests run llama3-8b-instruct on GPU; the model is downloaded using sharktank's hf_datasets script.

The tests serve two purposes:

  1. Test that the SGLang integration works properly at a functional level.
  2. Test that the accuracy of the responses from the shortfin LLM server is consistent.
    • We have a batch of candidate questions, with expected answers
    • We have temperature set to 1.0, so the responses should be deterministic.
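
As a rough illustration of the accuracy check described in item 2, here is a minimal sketch. It is not the code from this PR: the question batch, the `fake_generate` stand-in for the server call, the `difflib`-based similarity score, and the 0.8 threshold are all assumptions chosen to keep the example self-contained.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Hypothetical prompt/expected-answer pairs; the PR's real question batch is not shown here.
CASES: List[Tuple[str, str]] = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def check_accuracy(
    generate: Callable[[str], str],
    cases: List[Tuple[str, str]],
    threshold: float = 0.8,  # assumed cutoff, not taken from the PR
) -> List[Tuple[str, str, str, float]]:
    """Return (prompt, expected, response, score) for every case scoring below threshold."""
    failures = []
    for prompt, expected in cases:
        response = generate(prompt)
        score = similarity(response, expected)
        if score < threshold:
            failures.append((prompt, expected, response, score))
    return failures


def fake_generate(prompt: str) -> str:
    """Stand-in for a request to the LLM server (assumption: not the real client)."""
    canned = {
        "What is the capital of France?": "paris",
        "What is 2 + 2?": "4",
    }
    return canned[prompt]


failures = check_accuracy(fake_generate, CASES)
```

An empty `failures` list means every response stayed close enough to its expected answer. In a real test, `generate` would send the prompt to the running server, and a semantic similarity measure may be a better fit than character matching for free-form answers.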

This test is intended to run every 4 hours, which allows us to detect degradations in shortfin LLM output accuracy. If a run fails because of an accuracy degradation, only the small set of shark-ai/iree commits that landed since the previous run could be responsible.

@stbaione stbaione requested a review from renxida November 19, 2024 17:02
@stbaione stbaione self-assigned this Nov 19, 2024
Restructure app_tests/integration_tests,
Add copyright headers to files in integration_tests that were missing it
Add more logging and a little cleanup in sglang_frontend_test
@renxida renxida force-pushed the slg-integration-tests branch from 6f56b59 to 627af2d Compare November 19, 2024 21:41
@stbaione stbaione merged commit ac17f86 into nod-ai:main Nov 19, 2024
4 of 5 checks passed
2 participants