Adding Example Annotation with Showcase Notebook #550
Conversation
…amples dir Signed-off-by: eshwarprasadS <[email protected]>
This looks like a useful example to have in the repository. I rarely use Jupyter notebooks and am not really set up to validate or run this to ensure it works as expected, but perhaps that's something @aakankshaduggal or @khaledsulayman can help with to give this an approval. Also, we should think about how we keep this updated so it doesn't go stale. The other examples all run as part of our unit test suite in test_examples.py. That may not be possible with a notebook in the same way, but it is something to think about so we know how we'll keep it working and up-to-date if we can't test it automatically.
…teps Signed-off-by: eshwarprasadS <[email protected]>
@bbrowning Thanks for the comment. You are right that this would be a new type of artifact that would need its own testing setup. There are a few different ways to test notebooks in AI/ML libraries such as ours; one option I came across that might fit our pattern is nbmake. It plugs into pytest, so it could run as part of our CI, and it can be activated like so:
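(A minimal sketch, assuming nbmake is added to our dev dependencies; the docs/examples/ path is a placeholder for wherever the notebook ends up living.)

```python
# Minimal sketch: run notebooks through pytest via the nbmake plugin.
# Requires `pip install nbmake`; the path below is a placeholder.
import pytest

# Equivalent to running `pytest --nbmake docs/examples/` on the CLI:
# nbmake collects each .ipynb file and fails the run if any cell errors.
exit_code = pytest.main(["--nbmake", "docs/examples/"])
```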
I think it would make sense to open a new issue for this if we have a strong desire to keep our example Jupyter notebooks tested and updated regularly.
@eshwarprasadS Yes, it's fine to make a new issue to track figuring out how or if we can keep the notebooks updated.
Created this for bookkeeping: #562
We can add the prompt creation/prompt string to the YAML file at the start of the notebook, but it's not necessary—just a suggestion. Other than that everything looks good.
A quick suggestion to make the notebook more complete: we could add the creation of the annotation YAMLs/prompts at the start. Right now we assume the user has already done that before they start using the notebook.
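For illustration, a first cell along these lines could create the prompt YAML up front; the field names here are illustrative placeholders rather than the exact SDG config schema.

```python
# Hedged sketch of a first notebook cell that writes the annotation prompt
# YAML before the pipeline is built. Field names are illustrative only.
from pathlib import Path
import yaml

prompt_config = {
    "system": "You are a careful annotator.",
    "introduction": "Classify the following text into one of the allowed labels.",
    "principles": "Pick exactly one label; do not explain your choice.",
    "examples": "",
    "generation": "Text: {document}\nLabel:",
}

Path("annotation_prompt.yaml").write_text(yaml.safe_dump(prompt_config))
print(Path("annotation_prompt.yaml").read_text())
```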
Hey Abhishek, thanks for the comment. The intent of the notebook was also to demonstrate, step by step, how to arrive at a good prompt through prompt engineering (which in SDG means changing the YAMLs). That was the primary reason I did not take a 'once at the top' approach.
Having said this, do you think there's any merit to the approach I have taken here?
A few suggestions, but other than that LGTM.
…s, change backend to xgrammar Signed-off-by: eshwarprasadS <[email protected]>
Fixes #527 #525
This PR:
- Adds an example notebook showing how to leverage SDG for annotation use cases. The example explicitly sets the `guided_decoding_backend` of vLLM (xgrammar) to make sure that the annotation options (`guided_choices`) are respected during generation.
- Adds an example notebook showcasing an end-to-end custom SDG use case: annotating a classification dataset using the composable components available in the SDG library (namely `Pipeline`).
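
For context, a minimal sketch of the kind of guided-decoding request the notebook relies on, against a local vLLM OpenAI-compatible server; the base URL, model name, and label set are placeholders, and `guided_choice` / `guided_decoding_backend` are vLLM request extensions passed through `extra_body`.

```python
# Hedged sketch of a constrained annotation call against a local vLLM server.
# Placeholders: base_url, model name, and the label set.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="annotation-model",  # placeholder model name
    messages=[{"role": "user", "content": "Classify: 'The battery dies in an hour.'"}],
    extra_body={
        # vLLM extensions: constrain the output to one of the labels and
        # select the xgrammar guided-decoding backend.
        "guided_choice": ["positive", "negative", "neutral"],
        "guided_decoding_backend": "xgrammar",
    },
)
print(response.choices[0].message.content)  # constrained to one of the labels
```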