
Updates to handle SILNLP consistency run #557

Open
johnml1135 opened this issue Dec 4, 2024 · 0 comments
johnml1135 commented Dec 4, 2024

This idea is detailed here.

After the jobs are confirmed to work on SILNLP, they are moved to SF/Serval. To ensure that quality has not decreased in the transition (or with successive builds), some updates to Serval/Machine.py are needed.

The overall proposal is that when the final configuration is determined by EITL, SILNLP would perform a run and output a JSON file containing:

  • A set of verses for validation
  • A description of the training setup
  • The BLEU score (or a set of BLEU scores per book/verse, etc., or chrF++ scores, etc.)
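As a rough sketch, the JSON file could look like the following. All field names here are illustrative assumptions, not an agreed schema.

```python
import json

# Hypothetical layout for the file SILNLP might emit after a run;
# project names, verse references, and scores are placeholders.
consistency_run = {
    "validation_verses": ["GEN 1:1", "GEN 1:2", "MAT 5:3"],  # held-out verse references
    "training_setup": {
        "source_project": "example-src",  # placeholder project names
        "target_project": "example-trg",
        "training_steps": 5000,
    },
    "scores": {
        "BLEU": 32.4,
        "chrF++": 55.1,
        "per_book": {"GEN": {"BLEU": 30.9}, "MAT": {"BLEU": 33.8}},
    },
}

with open("consistency_run.json", "w", encoding="utf-8") as f:
    json.dump(consistency_run, f, indent=2)
```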

This JSON file would then be uploaded to SF. SF would take the validation-split verses and create an engine in Serval to be used for this type of evaluation run. When a new build is made, SF would specify the validation verses explicitly (and, if desired, early stopping or a number of training steps). Serval would perform the build and return the BLEU and chrF++ scores for the validation verses.

SF would compare these scores against the known-good set and, if quality has regressed, send an email (or another type of alert) so the EITL team can address the issue.
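The comparison step could be as simple as the following sketch; this is not SF's actual implementation, and the tolerance threshold is an assumption.

```python
# Illustrative regression check: flag the build if any metric drops more
# than `tolerance` points below the known-good baseline.
def needs_alert(baseline: dict, current: dict, tolerance: float = 1.0) -> bool:
    """Return True if any metric in `current` fell more than `tolerance`
    points below its value in `baseline`."""
    return any(
        current.get(metric, 0.0) < score - tolerance
        for metric, score in baseline.items()
    )

baseline = {"BLEU": 32.4, "chrF++": 55.1}  # from the uploaded SILNLP run
current = {"BLEU": 29.8, "chrF++": 54.9}   # from the new Serval build

if needs_alert(baseline, current):
    # In SF this would trigger an email or other alert to the EITL team.
    print("Quality regression detected; alerting EITL team.")
```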

@johnml1135 johnml1135 added this to Serval Dec 4, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in Serval Dec 4, 2024
@johnml1135 johnml1135 changed the title Validation split - and resulting statistics Updates to handle SILNLP consistency run Dec 20, 2024