Add option to do stratified sampling for few-shot examples #111

Merged
mrorii merged 4 commits into jp-stable from stratified-sampling-few-shot
Oct 25, 2023


@mrorii mrorii commented Oct 20, 2023

Overview

This PR does the following:

  • adds an option to use stratified sampling when selecting few-shot examples, since the datasets for some tasks have severe class imbalance (e.g. JNLI).
  • enables stratified sampling for JNLI and JCoLA (note that the task versions have been bumped, since the metric definition has changed).
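
For readers unfamiliar with the idea, here is a minimal sketch of stratified few-shot sampling. It is not the code in this PR: the function name `stratified_fewshot_sample`, its parameters, and the largest-remainder rounding are all assumptions made for illustration. It allocates the `k` few-shot slots to labels in proportion to their frequency, then draws that many examples per label.

```python
import random
from collections import defaultdict


def stratified_fewshot_sample(docs, label_key, k, rng):
    """Pick k docs whose label mix roughly matches that of docs.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if k > len(docs):
        raise ValueError("k exceeds the number of available docs")

    # Group the candidate docs by their label.
    by_label = defaultdict(list)
    for doc in docs:
        by_label[doc[label_key]].append(doc)

    total = len(docs)
    # Ideal (fractional) share of the k slots for each label.
    quotas = {label: k * len(group) / total for label, group in by_label.items()}
    # Start from the floor of each quota ...
    counts = {label: int(q) for label, q in quotas.items()}
    # ... then hand leftover slots to the largest fractional remainders.
    leftover = k - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda lb: quotas[lb] - counts[lb], reverse=True)
    for label in by_remainder[:leftover]:
        counts[label] += 1

    # Draw the allotted number of docs per label, then mix the order.
    sample = []
    for label, n in counts.items():
        sample.extend(rng.sample(by_label[label], n))
    rng.shuffle(sample)
    return sample
```

With 90 docs of one label and 10 of another, `stratified_fewshot_sample(docs, "label", 10, random.Random(0))` returns 9 examples of the majority label and 1 of the minority label, whereas plain uniform sampling can easily return none of the minority class.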

Details

Note that the base branch for this PR is set to use-japanese-prompt-for-jnli, which corresponds to #110 (EDIT: the base branch is now jp-stable, as #110 has been merged), because both #110 and this PR update the task version for JNLI.

@mrorii mrorii self-assigned this Oct 20, 2023
@mrorii mrorii requested a review from jon-tow as a code owner October 20, 2023 07:59
@mrorii mrorii changed the title Stratified sampling few shot Add option to do stratified sampling for few-shot examples Oct 20, 2023
@mrorii mrorii requested review from mkshing and polm-stability and removed request for jon-tow October 25, 2023 04:44
Base automatically changed from use-japanese-prompt-for-jnli to jp-stable October 25, 2023 04:54

@mkshing mkshing left a comment


LGTM!

Collaborator

@polm-stability polm-stability left a comment


Looks good!

The sampling logic here is tricky, especially the way rounding is handled, but after working through some examples it seems OK to me. (This would be a great place for unit testing, but I wouldn't want to introduce it right now given the state of the code base...)

@mrorii mrorii merged commit 9b42d41 into jp-stable Oct 25, 2023
1 check passed
@mrorii mrorii deleted the stratified-sampling-few-shot branch October 25, 2023 05:11
3 participants